If the primary node of a backup instance is disconnected, all replicas in that backup instance keep storing every new request, and the other replicas can't remove already ordered messages.
To solve this problem, we should detect that a backup primary node has been disconnected for longer than a fixed period of time and switch off the replica with this primary.
Tasks:
- Implement an abstract strategy to detect malicious backup primaries
- Implement a strategy which detects malicious backup primaries by disconnection
- Add a tolerance time to wait before reporting a disconnection (for example, being disconnected for 10 seconds in a row)
- Switch off a replica (using the code from INDY-1680) once the strategy detects a malicious backup primary
- Make sure that all replicas are switched on again after a View Change
- Add tests
- Test performance after a backup primary node is disconnected (it should not get worse)
- Test memory consumption (it should improve)
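
The first three tasks could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: the class names (BackupPrimaryFaultDetector, DisconnectionDetector), the method names, and the injectable clock are all hypothetical, and the 10-second default simply mirrors the example above.

```python
import time
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class BackupPrimaryFaultDetector(ABC):
    """Hypothetical abstract strategy for detecting malicious backup primaries."""

    @abstractmethod
    def on_primary_disconnected(self, inst_id: int) -> None:
        """Called when the primary of backup instance inst_id is seen as disconnected."""

    @abstractmethod
    def on_primary_connected(self, inst_id: int) -> None:
        """Called when the primary of backup instance inst_id is connected again."""

    @abstractmethod
    def instances_to_remove(self) -> List[int]:
        """Return the backup instances whose replicas should be switched off."""

    @abstractmethod
    def on_view_change(self) -> None:
        """Reset state so that all replicas can be switched on again."""


class DisconnectionDetector(BackupPrimaryFaultDetector):
    """Hypothetical strategy: a primary is faulty after a continuous disconnection."""

    def __init__(self, tolerance_sec: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._tolerance = tolerance_sec
        self._clock = clock  # injectable so tests do not have to sleep
        self._disconnected_since: Dict[int, float] = {}

    def on_primary_disconnected(self, inst_id: int) -> None:
        # Keep the earliest timestamp: repeated reports must not restart the timer.
        self._disconnected_since.setdefault(inst_id, self._clock())

    def on_primary_connected(self, inst_id: int) -> None:
        # A reconnect ends the "in a row" period.
        self._disconnected_since.pop(inst_id, None)

    def instances_to_remove(self) -> List[int]:
        now = self._clock()
        return sorted(
            inst_id
            for inst_id, since in self._disconnected_since.items()
            if now - since >= self._tolerance
        )

    def on_view_change(self) -> None:
        # After a View Change every instance starts with a clean record.
        self._disconnected_since.clear()
```

In this sketch the node would poll instances_to_remove() periodically and switch off the returned replicas with the INDY-1680 code. Tracking only the start of each disconnection keeps the state small and makes the tolerance check a single subtraction.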