Indy Node / INDY-1897

PoA: View Change needs to be triggered in BFT way



    • Type: Task
    • Status: Complete
    • Priority: High
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 1.6.83
    • Component/s: None
    • Labels:


      This issue is the cause of the problems seen in INDY-1893.

      • As of now, the INSTANCE_CHANGE message doesn't carry an instance change ID, as required by the RBFT paper (see IV.D).
      • So a node can process ALL INSTANCE_CHANGE messages, including very old ones (from nodes that are no longer available).
      • So a node may start a view change after receiving INSTANCE_CHANGE from fewer than n-f nodes, if it has stashed some old INSTANCE_CHANGEs from nodes that didn't send INSTANCE_CHANGE now.
      • This breaks the BFT principle that either all nodes start the view change or none of them do.
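
      The failure mode can be illustrated with a minimal Python sketch. NaiveInstanceChangeTracker and the node names are hypothetical and are not the actual indy-plenum code; the sketch only assumes that votes are keyed by the proposed viewNo and are never expired.

```python
from collections import defaultdict


class NaiveInstanceChangeTracker:
    """Toy model (hypothetical): counts INSTANCE_CHANGE votes per proposed
    view, with no instance change ID and no expiry of old votes."""

    def __init__(self, n):
        self.f = (n - 1) // 3
        self.quorum = n - self.f
        self.votes = defaultdict(set)  # proposed view_no -> sender names

    def add_vote(self, sender, view_no):
        """Record a vote; return True once n-f senders are recorded."""
        self.votes[view_no].add(sender)
        return len(self.votes[view_no]) >= self.quorum


tracker = NaiveInstanceChangeTracker(n=4)      # f = 1, quorum = 3
tracker.add_vote("Alpha", 1)                   # old vote, stays stashed
tracker.add_vote("Beta", 1)                    # old vote, stays stashed
# Alpha and Beta go offline here; their votes are never dropped.
start = tracker.add_vote("Gamma", 1)           # the only fresh vote
print(start)                                   # True: quorum reached with stale votes
```

      Only one currently available node voted, yet the stale votes from Alpha and Beta complete the n-f quorum.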


      Acceptance criteria

      • Create a test reproducing the problem
        • Send INSTANCE_CHANGE from f+1 nodes only, so that the other nodes don't accept it and don't start the view change.
          One of these nodes must be the master primary.
        • Stop these f+1 nodes
        • Wait until the other nodes send INSTANCE_CHANGE because they lost the primary
        • Make sure they don't start a view change (this will fail with the current code)
        • Restart the f+1 nodes
        • Make sure the pool is functional.
      • Explore options of how we can fix it
      • Create necessary tickets for implementation
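
      The first four steps of the test scenario can be sketched against a toy pool. This is only an illustration under assumptions: ToyNode, broadcast and run_scenario are hypothetical names, the model keeps old votes forever (as the current code is described to do), and the restart steps are left out. A real test would use the project's own test helpers instead.

```python
class ToyNode:
    """Hypothetical stand-in for a pool node; not the indy-plenum Node class."""

    def __init__(self, name, n):
        self.name = name
        self.quorum = n - (n - 1) // 3   # n - f
        self.running = True
        self.view_no = 0
        self.votes = {}                  # proposed view_no -> set of sender names

    def receive_instance_change(self, sender, view_no):
        if view_no <= self.view_no:
            return                       # vote for a view we already reached
        self.votes.setdefault(view_no, set()).add(sender)
        if len(self.votes[view_no]) >= self.quorum:
            self.view_no = view_no       # models "start the view change"


def broadcast(pool, senders, view_no):
    """Deliver one INSTANCE_CHANGE from each sender to every running node."""
    for sender in senders:
        for node in pool:
            if node.running:
                node.receive_instance_change(sender.name, view_no)


def run_scenario(n=4):
    f = (n - 1) // 3
    pool = [ToyNode("Node%d" % (i + 1), n) for i in range(n)]
    leaving, staying = pool[:f + 1], pool[f + 1:]  # Node1 plays the master primary

    broadcast(pool, leaving, view_no=1)      # step 1: f+1 votes, below quorum
    assert all(node.view_no == 0 for node in pool)

    for node in leaving:                     # step 2: stop the f+1 nodes
        node.running = False

    broadcast(pool, staying, view_no=1)      # step 3: the rest lose the primary

    # step 4: names of the nodes that started a view change anyway
    return [node.name for node in staying if node.view_no == 1]


print(run_scenario(4))   # ['Node3', 'Node4']
```

      In this toy model the returned list is non-empty, which is the behaviour the real test is expected to catch: only n-f-1 available nodes voted, so the list should be empty once the problem is fixed.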

      One of the possible options

      • Add instance change ID into INSTANCE_CHANGE message
        • See https://pakupaku.me/plaublin/rbft/5000a297.pdf, IV.D. - looks like there is a bug there
        • Every new INSTANCE_CHANGE needs to be sent with a new (incremented) ID - this is not explicitly stated in the RBFT paper (is it a bug there?)
        • Think about whether we need a new protocol version for nodes (most probably no, since we can add it as an optional field at the end).
      • Change the INSTANCE_CHANGE processing logic to start a view change only if there is a quorum of INSTANCE_CHANGEs with the same ID and viewNo.
        • Keep a current INSTANCE_CHANGE ID as a parameter of the node.
        • Discard (stash?) INSTANCE_CHANGE messages with an ID less than the current node's one
        • Check and send INSTANCE_CHANGE to others if the INSTANCE_CHANGE ID is greater than the current one
      • Think about restoring INSTANCE_CHANGE ID
        • Either clear it after the view change, or make it persistent and increment it during the whole life of the pool
        • If it is cleared after a view change / restart, then a node needs to start a view change if it gets a quorum of INSTANCE_CHANGEs with IDs less than this node's one.
          This is needed to handle the situation when all nodes except the current one have restarted.
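
      One way this option might look is sketched below. It is an illustration under assumptions, not a proposed implementation: IdAwareInstanceChangeTracker, next_round and the field names are hypothetical, adopting a higher ID on receipt is only one reading of the "check and send" step, and the restart case described above (a quorum with lower IDs) is not handled.

```python
from collections import defaultdict


class IdAwareInstanceChangeTracker:
    """Toy model (hypothetical): a quorum needs the same (ID, viewNo) pair."""

    def __init__(self, n):
        self.quorum = n - (n - 1) // 3   # n - f
        self.current_id = 0              # this node's instance change ID
        self.votes = defaultdict(set)    # (ic_id, view_no) -> sender names

    def next_round(self):
        """Bump the ID before this node sends a fresh INSTANCE_CHANGE."""
        self.current_id += 1
        return self.current_id

    def add_vote(self, sender, ic_id, view_no):
        """Return True only when n-f senders agree on (ic_id, view_no)."""
        if ic_id < self.current_id:
            return False                 # stale message: discarded
        if ic_id > self.current_id:
            # Assumption: adopt the higher ID; a real node would also decide
            # here whether to send its own INSTANCE_CHANGE with that ID.
            self.current_id = ic_id
        self.votes[(ic_id, view_no)].add(sender)
        return len(self.votes[(ic_id, view_no)]) >= self.quorum


tracker = IdAwareInstanceChangeTracker(n=4)    # quorum = 3
tracker.add_vote("Alpha", 1, 1)                # old round, ID 1
tracker.add_vote("Beta", 1, 1)                 # old round, ID 1
new_id = tracker.next_round()                  # primary lost later: ID 2
print(tracker.add_vote("Gamma", new_id, 1))    # False: 1 vote with ID 2
print(tracker.add_vote("Delta", new_id, 1))    # False: 2 votes with ID 2
print(tracker.add_vote("Alpha", 1, 1))         # False: ID 1 is now stale
```

      The old votes with ID 1 no longer count towards the quorum for ID 2, so two fresh votes are not enough to start the view change in a pool of four.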


              sergey.khoroshavin Sergey Khoroshavin
              ashcherbakov Alexander Shcherbakov
              Alexander Shcherbakov, Sergey Khoroshavin