-
Notifications
You must be signed in to change notification settings - Fork 441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Execution of SessionFsm transition actions can create a deadlock #1300
Comments
This is still a problem that I need to figure out how to solve nicely. I don't remember if I ever looked into moving the In our client implementation using Milo we don't use any callbacks to determine when to start creating subscriptions. You "know" it's safe to create a Subscription if your call to |
But getSession() may behave as it never returns (that is the deadlock). We are using Siemens S7-1500 PLCs, that do not implement the subscription transfer service, so onSubscriptionTransferFailed is called for every subscription to those PLCs. However, sometimes (as it really depends on the scheduling of the transition actions, and not deterministic) getSession() remains stuck, and its future will never be resolved. It can happen, that the connection and reconnection/resubscriptions works for months, and after that, it deadlocks every day. It is really not deterministic. The real problem is, that a possible workaround might be to create a new client to the PLCs with a fresh connection. This is a really annoying issue, which makes Milo quite unusable in environments, where PLCs without subscription transfer restart often. |
This avoids deadlock when the user makes a blocking call that requires the Session from the SessionFsm. fixes #1300
Thank you, Kevin, this is wonderful. Can Milo users expect a new 0.6.15 release or shall we wait until 1.0? |
@persicsb I will cut a 0.6.15 release within a week, there are a few other small fixes that have gone into 0.6.x in addition to this one. |
@persicsb 0.6.15 should be available in Maven Central since yesterday. |
) This prevents deadlock when the user makes a blocking call that requires the Session from the SessionFsm. fixes eclipse-milo#1300
Describe the bug
SessionFsm transition actions are not properly executed in order, and it can cause a deadlock.
The main use case is resubscribing subscriptions, that were failed to transfer to a new session.
There are 2 main points on the client side, when we are notified to do it:
However, this is not the case, we can be in a deadlock easily.
The main problem is, that onSessionActive and onSubscriptionTransferFailed callbacks can run before the session's SessionFuture (with the key KEY_SESSION_FUTURE) is properly resolved.
The reason is, that the SessionFsm is set up in SessionFsmFactory so that the session's future is only resolved when the InitializeSuccess event is called between the Intializing and Active state transition, see https://github.com/eclipse/milo/blob/master/opc-ua-sdk/sdk-client/src/main/java/org/eclipse/milo/opcua/sdk/client/session/SessionFsmFactory.java#L513
It resolves the session's future via the executorservice, and not directly, see line 565.
However, the onSessionActive callbacks are called directly, when the state changes to Active from any other state, in https://github.com/eclipse/milo/blob/master/opc-ua-sdk/sdk-client/src/main/java/org/eclipse/milo/opcua/sdk/client/session/SessionFsmFactory.java#L587
Since these callse are not made through the client's executorservice, they can happen-before the sessionFuture is resolved.
So there is no guarantee, that even after onSessionActive is called, anything can be done, high-level APIs like ManagedSubscription.create() will hang indefinitely, as they are in a deadlock : they are waiting for the future's resolution, but that is scheduled later.
I think, the issue is the same as #614
In that issue, you mentioned that:
What is the preferred way to solve this issue? When and how can ManagedSubscription.create() (and any reentrant calls to getSession()) safely called?
Also, it needs to be documented properly in the Javadocs, and perhaps in the milo-examples code.
Expected behavior
SessionFsm transition actions have a well-defined order.
Additional context
Milo version 0.6.14, but affects all others.
The text was updated successfully, but these errors were encountered: