Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix flaky tests and fix #69 #67

Merged
merged 20 commits into from
Oct 20, 2022
Merged

Fix flaky tests and fix #69 #67

merged 20 commits into from
Oct 20, 2022

Conversation

joao-r-reis
Copy link
Collaborator

@joao-r-reis joao-r-reis commented Oct 19, 2022

Fix #69
Fix more flaky tests

┆Issue is synchronized with this Jira Task by Unito
┆friendlyId: ZDM-412

@joao-r-reis joao-r-reis reopened this Oct 19, 2022
@joao-r-reis joao-r-reis changed the title Add debug info to tls test Fix flaky tests Oct 19, 2022
@joao-r-reis joao-r-reis changed the title Fix flaky tests Fix flaky tests and fix #69 Oct 19, 2022
@@ -1056,16 +1057,36 @@ func testProxyClientTls(t *testing.T, ccmSetup *setup.CcmTestSetup,

logMessages := buffer.String()

warnCheckFail := false
warnCheckDone := false
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Done suffix here confused me for a second, I thought it would be set when the result of some async processing was done, but in reality this value is simply set when a warning is expected. My suggestion would be to name this warningExpected instead. The same goes for errCheckDone below, errExpected.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}
if errCheckDone && warnCheckDone {
require.True(t, !errCheckFail || !warnCheckFail) // only 1 check needs to pass in this scenario
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think this would be clearer given the other requires are False too:

require.False(t, errCheckFail && warnCheckFail)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually set this to require.True because to me it felt easier to understand but since I added a comment anyway I can use require.False to be consistent with the others

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1340,6 +1361,14 @@ func getClientSideVerifyConnectionCallback(rootCAs *x509.CertPool) func(cs tls.C
}
}
_, err := cs.PeerCertificates[0].Verify(opts)
if err != nil {
// use zerolog to avoid interacting with proxy's log messages that are used for assertions in the test
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my understanding, are we suppressing the log messages from a lib to avoid throwing our log assertions off?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to avoid throwing our log assertions off but we aren't suppressing any log messages. We are capturing the log output from the logrus library which is what the proxy uses and the test client uses zerolog. To avoid messing with the log assertions I decided to use zerolog here on this test code in particular instead of logrus

if !protocolErrOccurred {
protocolErrOccurred = true
cc.sendResponseToClient(protocolErrResponseFrame)
cc.clientHandlerShutdownRequestCancelFn()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like clientHandlerShutdownRequestCancelFn is not called anywhere anymore, we should probably remove it from the struct.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to call it once #68 is addressed, maybe I can add a comment about this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Collaborator

@alicel alicel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me. I left a question and a tentative suggestion but nothing that would prevent it from being approved.

setDrainModeNowFunc()
continue
}
alreadySentProtocolErr = protocolErrResponseFrame
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this assignment happen after cc.sendResponseToClient(protocolErrResponseFrame)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't really matter because there is no concurrency here, it's a single goroutine with a for loop

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep it just looked a bit more intuitive but it doesn't really matter.

protocolErrOccurred = true
cc.sendResponseToClient(protocolErrResponseFrame)
continue
} else if alreadySentProtocolErr != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to check that I understand correctly. The intended behaviour with this change is: if a protocol error is sent on the connection, then we want to ensure that all other potentially pending responses that may still be returned are replaced by that protocol error (unless they were themselves protocol errors in the first place). So we're keeping the already returned protocol error to replace the frame of any subsequent non-protocol error response with it.

The idea is to ensure that we do not send seemingly valid responses on a connection on which a protocol error was returned while this connection is being drained.

Is this correct?

Copy link
Collaborator Author

@joao-r-reis joao-r-reis Oct 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but also the main change is that we no longer trigger the connection shutdown mechanism so the connection will be left in an unusable state (constant protocol errors) and it will be up to the driver to close it (at least until we address #68 )

Copy link
Collaborator

@grighetto grighetto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@joao-r-reis joao-r-reis merged commit ee0181b into main Oct 20, 2022
@joao-r-reis joao-r-reis deleted the add-tls-test-debug branch October 20, 2022 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Client connection can be closed before proxy returns protocol error
3 participants