Affects Version/s: None
Fix Version/s: None
STN running 1.3.57
Sprint: EV 18.11 Stability/ViewChange
The STN currently has 11 nodes, 7 of which are owned by Sovrin. When any one of our seven nodes is brought down, the network fails to post transactions, even though the remaining nodes should leave us well above the consensus threshold. A further confusing detail: when we attempt to connect to the pool using the legacy CLI, it shows connections to nodes that are no longer part of this pool and are now part of the live pool. These nodes have all been demoted on this ledger.
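To make the "well above consensus" expectation concrete, here is a minimal sketch of the quorum arithmetic. It assumes the standard BFT bound of n >= 3f + 1, with a consensus quorum of n - f nodes. The function names are hypothetical and are not taken from the node code, and the exact quorum rules used by the running software should be confirmed during the investigation.

```python
# Sketch of BFT quorum arithmetic for the STN pool.
# Assumption: the pool tolerates f = (n - 1) // 3 faulty nodes and
# needs n - f participating nodes to reach consensus.
# All function names here are illustrative, not from the node code.

def max_faulty(total_nodes: int) -> int:
    """Largest number of faulty/offline nodes tolerated for a pool of this size."""
    return (total_nodes - 1) // 3


def consensus_quorum(total_nodes: int) -> int:
    """Number of nodes that must participate for consensus (n - f)."""
    return total_nodes - max_faulty(total_nodes)


def can_reach_consensus(total_nodes: int, nodes_down: int) -> bool:
    """True if the nodes still up meet the consensus quorum."""
    return total_nodes - nodes_down >= consensus_quorum(total_nodes)


if __name__ == "__main__":
    total = 11  # pool size reported in this ticket
    print("tolerated failures:", max_faulty(total))
    print("quorum:", consensus_quorum(total))
    print("consensus with 1 node down:", can_reach_consensus(total, 1))
```

Under these assumptions, an 11-node pool tolerates 3 failures and needs 8 participating nodes, so losing a single node should not block ordering. If the nodes disagree about the pool size (for example, because demoted nodes are still being counted somewhere), the effective quorum would differ, which may be worth checking.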
Validator-info shows the correct pool nodes:
If you look in the attached CLI log file, you will see erroneous connections to nodes such as TNO. The odd behavior of the CLI is not the thrust of this ticket; it is only a symptom. The investigation should focus on why a single node going down can prevent consensus.
This problem is repeatable on the STN. If you bring down any node, the pool does not achieve consensus. Korea was down when these logs were obtained. When all seven of the Sovrin-owned nodes are up, the pool is in consensus, and the CLI connects and behaves normally.
Logs for the Sovrin-owned validators are also included. Logs will be requested from our external stewards and attached as they are received.
- Diagnose the issue and create a Plan of Attack, including associated stories and epics that can be scheduled.
- If the problem proves to be a configuration issue, we can solve it immediately.