Indy Node / INDY-1410

One of the nodes stopped writing after 44287 txns with errors in status

Details

    • Type: Bug
    • Status: Complete
    • Priority: High
    • Resolution: Done
    • Environment:
      indy-node 1.3.450
      libindy 1.4.0~565
      AWS pool of 25 nodes (QA Performance)

    • EV 18.12 Release RocksDB

    Description

      Steps to Reproduce:
      1. Setup the pool of 25 nodes and 1 client for load test.
      2. Run the load test from client machine:
      python3.5 perf_processes.py -n 1 -t 1 -c 10 -r 100 -g perf25_transactions_genesis 2>error.txt

      Actual Result:
      The load test stopped working at 52215 txns. One of the nodes (Node19) stopped writing at 44287 txns. The pool was then left without load overnight, and the lagging node did not catch up on the txns it had missed. After the load test was restarted, all nodes except the lagging one (Node19) continued writing.
      The following error appears in the `systemctl status indy-node` output of the lagging node:

      Jun 11 15:14:24 canadaQALive.qatest.evernym.com env[24509]: self.doOrder(commit)
      Jun 11 15:14:24 canadaQALive.qatest.evernym.com env[24509]: File "/usr/local/lib/python3.5/dist-packages/plenum/server/replica.py", line 1640, in doOrder
      Jun 11 15:14:24 canadaQALive.qatest.evernym.com env[24509]: return self.order_3pc_key(key)
      Jun 11 15:14:24 canadaQALive.qatest.evernym.com env[24509]: File "/usr/local/lib/python3.5/dist-packages/plenum/server/replica.py", line 1671, in order_3pc_key
      Jun 11 15:14:24 canadaQALive.qatest.evernym.com env[24509]: self.addToCheckpoint(pp.ppSeqNo, pp.digest)
      Jun 11 15:14:24 canadaQALive.qatest.evernym.com env[24509]: File "/usr/local/lib/python3.5/dist-packages/plenum/server/replica.py", line 1791, in addToCheckpoint
      Jun 11 15:14:24 canadaQALive.qatest.evernym.com env[24509]: self.processStashedCheckpoints((s, e))
      Jun 11 15:14:24 canadaQALive.qatest.evernym.com env[24509]: File "/usr/local/lib/python3.5/dist-packages/plenum/server/replica.py", line 1879, in processStashedCheckpoints
      Jun 11 15:14:24 canadaQALive.qatest.evernym.com env[24509]: del self.stashedRecvdCheckpoints[self.viewNo][key]
      Jun 11 15:14:24 canadaQALive.qatest.evernym.com env[24509]: KeyError: 0
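
One possible reading of the traceback above: the argument passed to `processStashedCheckpoints` is the tuple `(s, e)`, so `KeyError: 0` most likely comes from the outer lookup by `self.viewNo` (view 0 having no entry in the stash) rather than from the inner lookup by `key`. This is an inference from the log, not a confirmed root cause. The sketch below reproduces that class of failure on a plain nested dict and shows a guarded removal. The names `remove_unsafe`, `remove_safe`, and `stash` are hypothetical and are not taken from plenum.

```python
def remove_unsafe(stash, view_no, key):
    # Same shape as the failing statement in the traceback:
    # raises KeyError if view_no (or key) is missing from the stash.
    del stash[view_no][key]


def remove_safe(stash, view_no, key):
    """Remove key from the per-view stash if present.

    Returns True if an entry was removed, False otherwise.
    """
    per_view = stash.get(view_no)
    if per_view is None or key not in per_view:
        return False
    del per_view[key]
    # Drop the per-view dict once it is empty so the stash does not grow.
    if not per_view:
        del stash[view_no]
    return True


if __name__ == "__main__":
    # Hypothetical stash layout: {view_no: {(start, end): set_of_senders}}
    stash = {1: {(1, 100): {"Node1", "Node2"}}}
    try:
        remove_unsafe(stash, 0, (1, 100))
    except KeyError as exc:
        print("unsafe removal failed with KeyError:", exc)
    print("safe removal, missing view:", remove_safe(stash, 0, (1, 100)))
    print("safe removal, existing entry:", remove_safe(stash, 1, (1, 100)))
```

Whether silently skipping a missing entry is the right behaviour for the replica is a separate design question. The guard only shows how the exception itself could be avoided.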
      

      Expected Results:
      All nodes should write txns all the time.

      Additional Information:
      Logs were shared with sergey.khoroshavin.

          People

            ozheregelya Olga Zheregelya
            ozheregelya Olga Zheregelya
            Nikita Spivachuk, Olga Zheregelya