  1. Indy SDK
  2. IS-626

Pool refresh hanging

Details

    • Type: Bug
    • Status: Complete
    • Priority: High
    • Resolution: Done
    • None
    • 1.4
    • None
    • None

    Description

      Sometimes the pool refresh call hangs during the tests.
      The test is simple:
      1 - create pool
      2 - send node upgrade txn with node port changed
      3 - call pool refresh

      According to the Rust logs:

      ```
      {"op":"CATCHUP_REQ","ledgerId":0,"seqNoStart":8,"seqNoEnd":5,"catchupTill":5}
      message_processor.py 29 WARNING Beta discarding message CATCHUP_REQ {'seqNoStart': 8, 'catchupTill': 5, 'ledgerId': 0, 'seqNoEnd': 5} because Invalid range
      message_processor.py 29 WARNING Gamma discarding message CATCHUP_REQ {'seqNoStart': 6, 'catchupTill': 5, 'ledgerId': 0, 'seqNoEnd': 5} because Invalid range
      message_processor.py 29 WARNING Delta discarding message CATCHUP_REQ {'seqNoStart': 7, 'catchupTill': 5, 'ledgerId': 0, 'seqNoEnd': 5} because Invalid range
      WARN|indy::services::pool::catchup | src/services/pool/catchup.rs:307 | Fail to continue catch-up response(s) not received from nodes with idx {1, 0, 3, 2}. Node will be blacklisted and catchup will be restarted
      thread '<unnamed>' panicked at 'attempt to divide by zero', src/services/pool/catchup.rs:180:23
      note: Run with `RUST_BACKTRACE=1` for a backtrace.
      ```

      After some debugging with Artem, we found that the SDK had blacklisted all the nodes, which caused the division by zero. Artem tried to fix it and sent PR https://github.com/hyperledger/indy-sdk/pull/628
      However, the actual cause appears to be that the nodes do not send CATCHUP_REP to the SDK.
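
The division by zero can be illustrated with a small sketch. It assumes, hypothetically, that the catch-up code splits the missing transactions evenly among the nodes that are not blacklisted; `txns_per_node` and its parameters are illustrative names and are not taken from catchup.rs. With every node blacklisted the divisor is zero, so a plain `/` panics, while `checked_div` lets the caller report an error instead.

```rust
/// Hypothetical helper (not the SDK's actual code): how many transactions
/// to request from each node that is still active (not blacklisted).
/// Returns `None` when no active nodes remain, instead of panicking.
fn txns_per_node(txns_to_catch_up: usize, active_nodes: usize) -> Option<usize> {
    // `checked_div` yields None when the divisor is zero.
    let base = txns_to_catch_up.checked_div(active_nodes)?;
    // Past this point `active_nodes` is known to be non-zero.
    let remainder = txns_to_catch_up % active_nodes;
    // Round up so that the chunks cover the whole range.
    Some(if remainder == 0 { base } else { base + 1 })
}
```

A caller could treat `None` as "no nodes left to ask" and fail the refresh with an error rather than crash the worker thread.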

      In the nodes' logs we have:

      ```
      2018-04-05 18:35:18,597 | DEBUG | node.py (1714) | processClientInBox | BetaC processing b'l/Uy>7Vln6nE]w&&^T!}S{WJR9y%!pN3#:uSa)$%' request CATCHUP_REQ {'seqNoStart': 8, 'catchupTill': 5, 'ledgerId': 0, 'seqNoEnd': 5}
      2018-04-05 18:35:18,597 | DEBUG | ledger_manager.py ( 430) | processCatchupReq | Beta received catchup request: CATCHUP_REQ {'seqNoStart': 8, 'catchupTill': 5, 'ledgerId': 0, 'seqNoEnd': 5} from b'l/Uy>7Vln6nE]w&&^T!}S{WJR9y%!pN3#:uSa)$%'
      2018-04-05 18:35:18,597 | WARNING | message_processor.py ( 29) | discard | Beta discarding message CATCHUP_REQ {'seqNoStart': 8, 'catchupTill': 5, 'ledgerId': 0, 'seqNoEnd': 5} because Invalid range
      ```

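      This matches the Rust logs.
      So the SDK generated malformed requests: in each CATCHUP_REQ, seqNoStart (6, 7, 8) is greater than seqNoEnd (5), and the nodes discard them as an invalid range.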
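
A guard on the SDK side could avoid emitting CATCHUP_REQ messages with an invalid range. The sketch below assumes, hypothetically, that the SDK splits the range of missing transactions into one contiguous chunk per node; `CatchupReq` and `build_catchup_reqs` are illustrative names that mirror the fields seen in the logs, not the SDK's actual types. It returns no requests when the local ledger is already at or past the target, so `seq_no_start` never exceeds `seq_no_end`.

```rust
/// Illustrative request shape mirroring the fields seen in the logs.
#[derive(Debug, PartialEq)]
struct CatchupReq {
    ledger_id: u32,
    seq_no_start: u64,
    seq_no_end: u64,
    catchup_till: u64,
}

/// Split the missing range (local_size + 1 ..= target_size) into at most
/// `node_count` contiguous chunks. Returns an empty list when there is
/// nothing to catch up or no node to ask.
fn build_catchup_reqs(local_size: u64, target_size: u64, node_count: u64) -> Vec<CatchupReq> {
    if node_count == 0 || local_size >= target_size {
        return Vec::new();
    }
    let missing = target_size - local_size;
    // Ceiling division: chunk size that covers the range with node_count chunks.
    let chunk = (missing + node_count - 1) / node_count;
    let mut reqs = Vec::new();
    let mut start = local_size + 1;
    while start <= target_size {
        // Clamp the last chunk to the target so start <= end always holds.
        let end = (start + chunk - 1).min(target_size);
        reqs.push(CatchupReq {
            ledger_id: 0, // assumption: ledger 0, as in the logged requests
            seq_no_start: start,
            seq_no_end: end,
            catchup_till: target_size,
        });
        start = end + 1;
    }
    reqs
}
```

With a local ledger of 5 transactions and a target of 5, this produces no requests at all, whereas the logged requests asked for ranges starting at 6, 7 and 8 with an end of 5.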

          People

            anikitinDSR Andrew Nikitin
            dsurnin Dmitry Surnin
            Dmitry Surnin, Sergey Khoroshavin, Vyacheslav Gudkov