When running a load test with two agents running in a loop, the pool stopped ordering transactions after some view change. Ledgers were NOT corrupted and all nodes eventually reached the same state, but they refused to order any new txns. Restarting all nodes in the pool fixed the problem temporarily.
Investigation of the log files showed that at the beginning of some view the primary generated its first PREPREPARE with more than 9000 txns and a size of about 640k, while the default MSG_LEN_LIMIT is just 128k. As a result, this PREPREPARE was not delivered to the other nodes and no new transactions were ordered.
It's recommended to either increase MSG_LEN_LIMIT to 768k or decrease Max3PCBatchSize accordingly.
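As a rough sanity check on these numbers, the sketch below estimates how many txns fit into one PREPREPARE under a given message size limit. It assumes the PREPREPARE size grows linearly with the number of txns and treats "more than 9000 txns" and "about 640k" as exact figures, so the results are only approximate. The helper name `max_batch_size_for_limit` and the headroom parameter are hypothetical and are not part of the node code.

```python
KIB = 1024

def max_batch_size_for_limit(observed_txns, observed_bytes, msg_len_limit,
                             headroom_percent=100):
    """Estimate the largest batch whose PREPREPARE fits under msg_len_limit.

    Assumes size scales linearly with txn count (an approximation based on
    the single observation in this ticket). headroom_percent < 100 leaves
    spare room for message fields that do not depend on the batch size.
    Integer arithmetic is used to avoid floating point rounding surprises.
    """
    return (msg_len_limit * observed_txns * headroom_percent) // (observed_bytes * 100)

# Figures taken from the log investigation (approximate).
observed_txns = 9000
observed_bytes = 640 * KIB

# Batch size that would just fit under the default 128k limit.
fits_default = max_batch_size_for_limit(observed_txns, observed_bytes, 128 * KIB)

# Same, keeping 20% of the limit free as a safety margin.
fits_default_safe = max_batch_size_for_limit(observed_txns, observed_bytes, 128 * KIB, 80)

# Batch size that would fit under the proposed 768k limit.
fits_raised = max_batch_size_for_limit(observed_txns, observed_bytes, 768 * KIB)

print(fits_default, fits_default_safe, fits_raised)
```

Under these assumptions, a 128k limit holds roughly 1800 txns per PREPREPARE (about 1440 with a 20% margin), while a 768k limit holds roughly 10800, which is consistent with the observed 9000+ txn batch fitting after the increase.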
Find the root cause and create a ticket for fixing it.