Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perform manual testing on ordering nodes terminating badly in a submit scenario #52

Open
davidkel opened this issue Nov 3, 2021 · 2 comments

Comments

@davidkel
Copy link
Owner

davidkel commented Nov 3, 2021

To note, if you kill the leader then that will cause problems to occur
For this test I turned of submit timeouts because of the potential time it takes to move to try another orderer if 1 goes down (upto 20 seconds has been seen).

@davidkel
Copy link
Owner Author

davidkel commented Nov 10, 2021

The following output can be expected if you kill the leader

{"component":"CLIENT","timestamp":"2021-11-10T10:56:18.988Z","txnId":"15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a","stage":"Endorsing","message":"addUpdateAssets({\"arguments\":[\"1\",\"2000\"]})"}
{"component":"CLIENT","timestamp":"2021-11-10T10:56:26.192Z","txnId":"15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a","stage":"Submitting","message":""}
{"component":"CLIENT","timestamp":"2021-11-10T10:56:45.300Z","txnId":"15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a","stage":"Failed","message":"10 ABORTED: no orderers could successfully process transaction, Details: [{\"address\":\"orderer4.example.com:7053\",\"message\":\"failed to create BroadcastClient: rpc error: code = Unavailable desc = connection error: desc = \\\"transport: error while dialing: dial tcp 172.30.0.8:7053: connect: connection refused\\\"\",\"mspId\":\"OrdererMSP\"},{\"address\":\"orderer3.example.com:7052\",\"message\":\"received unsuccessful response from orderer: SERVICE_UNAVAILABLE\",\"mspId\":\"OrdererMSP\"},{\"address\":\"orderer1.example.com:7050\",\"message\":\"received unsuccessful response from orderer: SERVICE_UNAVAILABLE\",\"mspId\":\"OrdererMSP\"},{\"address\":\"orderer2.example.com:7051\",\"message\":\"received unsuccessful response from orderer: SERVICE_UNAVAILABLE\",\"mspId\":\"OrdererMSP\"},{\"address\":\"orderer5.example.com:7054\",\"message\":\"failed to create BroadcastClient: rpc error: code = Unavailable desc = connection error: desc = \\\"transport: error while dialing: dial tcp 172.30.0.10:7054: i/o timeout\\\"\",\"mspId\":\"OrdererMSP\"}]"}

and this is situation can be seen in the gateway logs

2021-11-10 10:56:26.195 UTC 344a INFO [gateway] Submit -> Sending transaction to orderer TxID=15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a endpoint=orderer4.example.com:7053
2021-11-10 10:56:26.196 UTC 344b WARN [gateway] Submit -> Error sending transaction to orderer TxID=15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a endpoint=orderer4.example.com:7053 err="failed to create BroadcastClient: rpc error: code = Unavailable desc = connection error: desc = \"transport: error while dialing: dial tcp 172.30.0.8:7053: connect: connection refused\""
2021-11-10 10:56:26.196 UTC 344c INFO [gateway] Submit -> Sending transaction to orderer TxID=15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a endpoint=orderer3.example.com:7052
2021-11-10 10:56:26.198 UTC 344d WARN [gateway] Submit -> Error sending transaction to orderer TxID=15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a endpoint=orderer3.example.com:7052 err="received unsuccessful response from orderer: SERVICE_UNAVAILABLE"
2021-11-10 10:56:26.198 UTC 344e INFO [gateway] Submit -> Sending transaction to orderer TxID=15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a endpoint=orderer1.example.com:7050
2021-11-10 10:56:26.200 UTC 3450 WARN [gateway] Submit -> Error sending transaction to orderer TxID=15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a endpoint=orderer1.example.com:7050 err="received unsuccessful response from orderer: SERVICE_UNAVAILABLE"
2021-11-10 10:56:26.200 UTC 3451 INFO [gateway] Submit -> Sending transaction to orderer TxID=15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a endpoint=orderer2.example.com:7051
2021-11-10 10:56:26.202 UTC 3452 WARN [gateway] Submit -> Error sending transaction to orderer TxID=15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a endpoint=orderer2.example.com:7051 err="received unsuccessful response from orderer: SERVICE_UNAVAILABLE"
2021-11-10 10:56:26.202 UTC 3453 INFO [gateway] Submit -> Sending transaction to orderer TxID=15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a endpoint=orderer5.example.com:7054
2021-11-10 10:56:45.297 UTC 34d8 WARN [gateway] Submit -> Error sending transaction to orderer TxID=15b22c21689832ec1516b3bd3ace0b3646471534cf3c5a2fa40f6698b0629b5a endpoint=orderer5.example.com:7054 err="failed to create BroadcastClient: rpc error: code = Unavailable desc = connection error: desc = \"transport: error while dialing: dial tcp 172.30.0.10:7054: i/o timeout\""

@davidkel
Copy link
Owner Author

A different issue can also be seen if you kill the leader

{"component":"CLIENT","timestamp":"2021-11-10T10:56:13.545Z","txnId":"ab3f69fe9d0cf0123a180260c1b794a9a11975d5cc70daa71b8a22a504f37920","stage":"Endorsing","message":"addUpdateAssets({\"arguments\":[\"1\",\"2000\"]})"}
{"component":"CLIENT","timestamp":"2021-11-10T10:56:24.156Z","txnId":"ab3f69fe9d0cf0123a180260c1b794a9a11975d5cc70daa71b8a22a504f37920","stage":"Submitting","message":""}
{"component":"CLIENT","timestamp":"2021-11-10T10:56:24.168Z","txnId":"ab3f69fe9d0cf0123a180260c1b794a9a11975d5cc70daa71b8a22a504f37920","stage":"Submitted","message":""}
{"component":"CLIENT","timestamp":"2021-11-10T10:56:54.170Z","txnId":"ab3f69fe9d0cf0123a180260c1b794a9a11975d5cc70daa71b8a22a504f37920","stage":"Failed","message":"4 DEADLINE_EXCEEDED: Deadline exceeded"}

And the gateway log isn't so helpful

2021-11-10 10:56:24.165 UTC 3404 INFO [gateway] Submit -> Sending transaction to orderer TxID=ab3f69fe9d0cf0123a180260c1b794a9a11975d5cc70daa71b8a22a504f37920 endpoint=orderer4.example.com:7053

but what that actually implies is that it got submitted successfully to the orderer (because we got no error, the client stated it had been submitted)

We never got the commit status, which means that the leader was killed before the block was cut and concensus from the other orderers was achieved.

This can also cause the following to happen

{"component":"CLIENT","timestamp":"2021-11-10T10:56:33.880Z","stage":"STATS","message":"WARNING: Client may have stalled, no new transactions are being evaluated or endorsed"}
{"component":"CLIENT","timestamp":"2021-11-10T10:56:38.880Z","stage":"STATS","message":"WARNING: Client may have stalled, no new transactions are being evaluated or endorsed"}

because we have hit our limit of outstanding transactions which are either all waiting to fail or hit long delays determining a problem before moving on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant