Indy Node / INDY-1025

Pool stopped working and lost consensus while new node was performing a catch-up

Details

    • Type: Bug
    • Status: Complete
    • Priority: High
    • Resolution: Done
    • Sprint: INDY 18.01: Stability+, Sprint 18.02 Stability

    Description

      The pool lost its ability to reach consensus while a new node was performing a catch-up.
      I don't have detailed logs; logging was only at the info level.

      Setup
      I have a pool of 13 nodes with 5,022 transactions. I was adding 3 more nodes to the pool (Node14, Node15, Node16).

      Steps

      1. I added Node14 and let it complete its catch-up before adding the next node.
      2. I added Node15 to the pool after Node14 was at 5,022 transactions.
      3. I then added Node16 after Node15 was at 5,022 transactions.

      Other Info

      • The catch-up is pretty fast, so I ran "read_ledger --type domain --count" roughly every 20 to 30 seconds to see when it had completed.
      • The ledger tool displayed an incorrect ledger count during the catch-up (this is a different issue). The count jumped from 12 to 6,000 transactions (more than the pool has) and then to 9,337.
      • The ledger on Node16 settled at 4,688 transactions and did not change.
      • I sent a new transaction from the CLI on a different machine and the pool stopped taking transactions.
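
The manual polling described above could be scripted roughly as follows. This is a minimal sketch, not part of the Indy tooling: the helper names (ledger_count_via_cli, wait_for_catchup) and the stall_polls parameter are hypothetical, and it assumes that "read_ledger --type domain --count" prints the count as a plain integer on stdout. It reports a stall when the count stops changing short of the expected size, which is what happened on Node16.

```python
import re
import subprocess
import time


def ledger_count_via_cli():
    """Run the command quoted in this report and return the first integer it prints.

    Assumption: the count appears as a plain integer somewhere in stdout.
    """
    result = subprocess.run(
        ["read_ledger", "--type", "domain", "--count"],
        capture_output=True, text=True, check=True,
    )
    match = re.search(r"\d+", result.stdout.replace(",", ""))
    if match is None:
        raise ValueError("no count found in read_ledger output")
    return int(match.group())


def wait_for_catchup(get_count, expected, interval=20.0, stall_polls=5,
                     sleep=time.sleep):
    """Poll until the count equals `expected` or stops changing.

    Returns ("complete", count) when the expected size is reached, or
    ("stalled", count) after `stall_polls` consecutive unchanged readings.
    """
    last = None
    unchanged = 0
    while True:
        count = get_count()
        if count == expected:
            return "complete", count
        # Count consecutive polls that return the same value as the previous one.
        unchanged = unchanged + 1 if count == last else 0
        last = count
        if unchanged >= stall_polls:
            return "stalled", count
        sleep(interval)
```

On the new node this would be called as wait_for_catchup(ledger_count_via_cli, expected=5022). Because the reported count overshot the real ledger size during catch-up, the sketch only treats an exact match as complete.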

      Error
      With only info-level logging, this is all I captured:

      (  29) | discard | Node1 discarding message INSTANCE_CHANGE{'reason': 26, 'viewNo': 1} because Received instance change request with view no 1 which is not more than its view no 1
      

      It appears that a view change might have been attempted while Node16 was performing a catch-up.
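
To make the captured line easier to check across nodes, here is a small sketch that extracts the view numbers from it. The regular expression is written against the single log line above, and the names (parse_discard, is_stale_instance_change) are hypothetical. The comparison simply restates the wording of the message ("not more than its view no"); it is not taken from the node's source code.

```python
import re

# Pattern built from the one log line captured in this report.
DISCARD_RE = re.compile(
    r"(?P<node>\w+) discarding message INSTANCE_CHANGE\{[^}]*\}"
    r".*view no (?P<proposed>\d+) which is not more than its view no (?P<current>\d+)"
)


def parse_discard(line):
    """Return the node name and both view numbers, or None if the line does not match."""
    match = DISCARD_RE.search(line)
    if match is None:
        return None
    return {
        "node": match.group("node"),
        "proposed_view": int(match.group("proposed")),
        "current_view": int(match.group("current")),
    }


def is_stale_instance_change(proposed_view, current_view):
    """Restate the rule in the log message: a request is discarded when its
    view number is not more than the receiving node's current view number."""
    return proposed_view <= current_view


line = ("(  29) | discard | Node1 discarding message "
        "INSTANCE_CHANGE{'reason': 26, 'viewNo': 1} because Received instance "
        "change request with view no 1 which is not more than its view no 1")
info = parse_discard(line)
print(info, is_stale_instance_change(info["proposed_view"], info["current_view"]))
```

For the captured line, both view numbers are 1. This suggests Node1 had already moved to view 1 when the request arrived, which is consistent with a view change having started around the time of the catch-up.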

            People

              Assignee: Unassigned
              Reporter: Kelly Wilson (krw910)
              Alexander Shcherbakov, Kelly Wilson, Olga Zheregelya