Client is unable to detect when host initialization failed #31

Open
wmartins opened this issue Oct 29, 2024 · 0 comments

Description

I have been investigating an issue in our self-hosted Interval setup that leaves the client in limbo, where it is not serving any actions.

For context, we host both the client and the server on AWS ECS. When the issue happened, AWS had restarted our server instance due to network maintenance. After the server restarted, the client stopped serving actions and we saw the Nothing here yet message, which indicates that the client was not properly connected.

After some investigation, we saw that the server failed to initialize the host. Here's the error log that we could see on the server:

error: Failed handling INITIALIZE_HOST:
{
    "error": {
        "message": "\nInvalid `prisma.hostInstance.upsert()` invocation:\n\n\nQuery interpretation error. Error for binding '2': AssertionError(\"Expected a valid parent ID to be present for create follow-up for upsert query.\")",
        "name": "PrismaClientKnownRequestError",
        "stack": "PrismaClientKnownRequestError: \nInvalid `prisma.hostInstance.upsert()` invocation:\n\n\nQuery interpretation error. Error for binding '2': AssertionError(\"Expected a valid parent ID to be present for create follow-up for upsert query.\")\n    at Ln.handleRequestError (/usr/local/lib/node_modules/@interval/server/node_modules/@prisma/client/runtime/library.js:121:7753)\n    at Ln.handleAndLogRequestError (/usr/local/lib/node_modules/@interval/server/node_modules/@prisma/client/runtime/library.js:121:7061)\n    at Ln.request (/usr/local/lib/node_modules/@interval/server/node_modules/@prisma/client/runtime/library.js:121:6745)\n    at async l (/usr/local/lib/node_modules/@interval/server/node_modules/@prisma/client/runtime/library.js:130:9633)\n    at async INITIALIZE_HOST (/usr/local/lib/node_modules/@interval/server/dist/src/wss/wss.js:436:50)\n    at async DuplexRPCClient.handleReceivedCall (/usr/local/lib/node_modules/@interval/server/node_modules/@interval/sdk/dist/classes/DuplexRPCClient.js:90:29)\n    at async DuplexRPCClient.onmessage (/usr/local/lib/node_modules/@interval/server/node_modules/@interval/sdk/dist/classes/DuplexRPCClient.js:111:21)\n    at async Promise.all (index 0)"
    },
    "instanceId": "<instance id>",
    "organizationId": "<organization id>"
}

And this on the client:

{
    "id": "289",
    "kind": "RESPONSE",
    "methodName": "INITIALIZE_HOST",
    "data": {
        "type": "error",
        "message": "Internal Server Error"
    }
}
[Interval] Failed reestablishing connection IntervalError: Internal Server Error

Reproducing

To reproduce, you can simulate the failure in Interval Server by adding throw new Error('Reproducing') to the INITIALIZE_HOST handler:

https://github.com/interval/server/blob/77ae7f0f080f8530bec9f8d64b40f1f63524984a/src/wss/wss.ts#L535-L540

This will force the host initialization to fail.

The problem

The main problem is that Interval's public API gives us no way to detect this situation. For example, IntervalClient#ping still succeeds, because it only checks that the websocket connection is open and responsive. There's no way to check that a given client is both connected and initialized/online.

This is a problem in our setup because we use IntervalClient#ping in our health check. It reports a false positive, since the client is not actually able to serve its actions.
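
To make the gap concrete, here is a minimal sketch of the two states involved. It does not use the Interval SDK: ClientState, pingOnlyHealthy, and readyToServe are hypothetical names used purely for illustration, and the assumption is that "able to serve actions" requires both an open socket and a successful INITIALIZE_HOST.

```typescript
// Hypothetical model of the client's state; these names are illustrative
// and are not part of the Interval SDK.
interface ClientState {
  socketOpen: boolean; // websocket is connected and answers pings
  hostInitialized: boolean; // INITIALIZE_HOST succeeded on the server
}

// What a ping-only health check effectively measures.
function pingOnlyHealthy(state: ClientState): boolean {
  return state.socketOpen;
}

// What the health check would need to measure to mean "can serve actions".
function readyToServe(state: ClientState): boolean {
  return state.socketOpen && state.hostInitialized;
}

// State after the server restart described above: the socket reconnected,
// but INITIALIZE_HOST came back with "Internal Server Error".
const afterFailedInit: ClientState = {
  socketOpen: true,
  hostInitialized: false,
};

console.log(pingOnlyHealthy(afterFailedInit)); // true (the false positive)
console.log(readyToServe(afterFailedInit)); // false
```

The second check is the one our health check needs, but nothing in the public API currently exposes the equivalent of hostInitialized.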

Ideas

One way to solve this would be to accept a callback that is called when initialization fails, which could be added to #initializeHost. If that is too specific, there could instead be a new method for inspecting or reacting to RPC responses.
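
As a rough sketch of what the callback idea could look like, the snippet below tracks initialization state from RPC responses shaped like the one in the client log above. HostStatusTracker, observe, isInitialized, and the onInitializationError callback are all hypothetical and do not exist in the SDK today. It also assumes that any INITIALIZE_HOST response whose data.type is not "error" counts as success, which I have not verified against the real protocol.

```typescript
// Shape of the RPC response seen in the client log above.
interface RpcResponse {
  id: string;
  kind: "RESPONSE";
  methodName: string;
  data: { type: string; message?: string };
}

// Hypothetical helper, not part of the Interval SDK: remembers whether the
// last INITIALIZE_HOST response succeeded and notifies a callback on failure.
class HostStatusTracker {
  private initialized = false;
  private onInitializationError?: (error: Error) => void;

  constructor(onInitializationError?: (error: Error) => void) {
    this.onInitializationError = onInitializationError;
  }

  observe(response: RpcResponse): void {
    // Responses to other RPC methods do not affect initialization state.
    if (response.methodName !== "INITIALIZE_HOST") return;

    if (response.data.type === "error") {
      this.initialized = false;
      const message = response.data.message ?? "Host initialization failed";
      this.onInitializationError?.(new Error(message));
    } else {
      // Assumption: any non-error response means the host was initialized.
      this.initialized = true;
    }
  }

  get isInitialized(): boolean {
    return this.initialized;
  }
}

// Usage with the response from the client log above.
const tracker = new HostStatusTracker((error) => {
  console.error("Host initialization failed:", error.message);
});

tracker.observe({
  id: "289",
  kind: "RESPONSE",
  methodName: "INITIALIZE_HOST",
  data: { type: "error", message: "Internal Server Error" },
});

console.log(tracker.isInitialized); // false
```

With something along these lines, a health check could combine IntervalClient#ping with the initialization state, or the callback could trigger a process restart.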
