Affects Version/s: None
Fix Version/s: None
Sprint:Sprint 18.03 Stability, DKMS, Sprint 18.04, Sprint 18.05, 18.06, 18.07 Stability & Monitoring
We have seen cases where a primary replica sends a PREPREPARE message containing duplicates of requests already committed to the ledger. INDY-959 and INDY-1045 are examples of this incorrect behavior.
The general scenario in these cases was as follows:
- The primary replica in some instance sends PREPREPARE with already ordered requests to all the other replicas in the instance.
- The nodes hosting these other replicas send MESSAGE_REQUESTs for the PROPAGATEs of these requests to all the other nodes.
- The node hosting that primary replica answers the received MESSAGE_REQUESTs with MESSAGE_RESPONSEs carrying the requested PROPAGATEs.
- The rest of the nodes treat these requests as new, because they do not detect that these old requests were already received and processed, and send PROPAGATEs to all the other nodes.
- All the nodes reach PROPAGATE quorums and forward these requests to their replicas. The 3PC process for these requests proceeds in the specified instance and also starts in all the other instances.
- Eventually these requests are ordered for the second time.
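The scenario above can be reproduced in a toy model. This is only a sketch under stated assumptions, not indy-plenum code: the class `ToyNode`, its methods, and the node names are hypothetical, the PROPAGATE quorum is assumed to be f + 1, per-request state is assumed to be dropped after ordering, and the whole 3PC round is collapsed into a single `order` call.

```python
# Toy model; every name here is hypothetical, not the indy-plenum API.
class ToyNode:
    def __init__(self, name, f):
        self.name = name
        self.f = f              # number of tolerated faulty nodes
        self.ledger = []        # request digests in commit order
        self.propagates = {}    # digest -> names of nodes that sent PROPAGATE
        self.forwarded = set()  # digests already forwarded to replicas

    def on_propagate(self, digest, sender):
        # No check against self.ledger, so an old request looks new.
        self.propagates.setdefault(digest, set()).add(sender)
        quorum_reached = len(self.propagates[digest]) >= self.f + 1
        if quorum_reached and digest not in self.forwarded:
            self.forwarded.add(digest)
            self.order(digest)

    def order(self, digest):
        # Stands in for the whole 3PC round (assumption).
        self.ledger.append(digest)

    def forget(self, digest):
        # Assumed cleanup of per-request state once the request is ordered.
        self.propagates.pop(digest, None)
        self.forwarded.discard(digest)


def broadcast_propagates(nodes, digest):
    # Every node sends PROPAGATE for the digest to every node.
    for sender in nodes:
        for receiver in nodes:
            receiver.on_propagate(digest, sender.name)


def replay_scenario():
    nodes = [ToyNode(n, f=1) for n in ("Alpha", "Beta", "Gamma", "Delta")]
    broadcast_propagates(nodes, "req1")   # normal ordering
    for node in nodes:
        node.forget("req1")
    broadcast_propagates(nodes, "req1")   # replay caused by the stale PREPREPARE
    return [node.ledger for node in nodes]


ledgers = replay_scenario()
```

In this model every node ends up with the same digest in its ledger twice, which mirrors the duplicate commits reported in the referenced tickets.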
In the scope of INDY-959 we made a fix that prevents processing of a PROPAGATE message carrying a duplicate request. Version 1.1.43-stable of indy-node does not contain this fix, which is why we could observe duplicate requests being committed to the ledger in INDY-959 and INDY-1045. However, we still have not found why a primary replica may send a PREPREPARE message with duplicates of requests already committed to the ledger. In the scope of this ticket we must investigate and fix this issue.
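The kind of guard described for PROPAGATE processing might look roughly like the sketch below. It is an illustration only and not the actual INDY-959 patch: `GuardedNode`, the `committed` set, and the f + 1 quorum are assumptions made for the example. It also does not address the open question of why the primary builds such a PREPREPARE in the first place.

```python
# Hypothetical sketch of a duplicate guard; not the actual INDY-959 patch.
class GuardedNode:
    def __init__(self, f):
        self.f = f               # number of tolerated faulty nodes
        self.committed = set()   # digests already written to the ledger
        self.ledger = []         # request digests in commit order
        self.propagates = {}     # digest -> names of nodes that sent PROPAGATE

    def on_propagate(self, digest, sender):
        """Return True if the PROPAGATE was processed, False if dropped."""
        if digest in self.committed:
            # The request is already in the ledger: drop the duplicate.
            return False
        senders = self.propagates.setdefault(digest, set())
        senders.add(sender)
        if len(senders) >= self.f + 1:   # assumed PROPAGATE quorum
            # Stands in for forwarding to replicas and the 3PC round.
            self.ledger.append(digest)
            self.committed.add(digest)
            del self.propagates[digest]
        return True


node = GuardedNode(f=1)
results = [node.on_propagate("req1", s)
           for s in ("Alpha", "Beta", "Gamma", "Alpha")]
```

Here the first two PROPAGATEs reach the quorum and the request is ordered once; the later ones are rejected because the digest is already recorded as committed.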