Issue with NATS Cluster Health Check and Jetstream on OpenShift: Recommendations Needed [v2.10.19] #6217

Open
mohamedsaleem18 opened this issue Dec 4, 2024 · 2 comments
Labels
defect Suspected defect such as a bug or regression

Comments

@mohamedsaleem18

Observed behavior

A 3-node NATS cluster with JetStream enabled is installed on Red Hat OpenShift. Because of an issue with JetStream, the cluster shut down and stopped accepting JetStream connections. When we restarted the NATS pods one by one, the cluster re-formed with only two pods. The third pod failed to start with a "health check failed" error, so the cluster is now running with only 2 NATS servers. The NATS pod logs are attached below. Could you please recommend how to resolve this issue?

Expected behavior

The NATS cluster should be up and running with all 3 nodes.

Server and client version

nats-server --version: 2.10.19

Host environment

Red Hat OpenShift
nats.txt

Steps to reproduce

No response

@mohamedsaleem18 mohamedsaleem18 added the defect Suspected defect such as a bug or regression label Dec 4, 2024
@derekcollison
Member

We would need quite a bit more information.

Might be good to consider a support agreement with Synadia.

@wallyqs wallyqs changed the title from "Issue with NATS Cluster Health Check and Jetstream on OpenShift: Recommendations Needed" to "Issue with NATS Cluster Health Check and Jetstream on OpenShift: Recommendations Needed [v2.10.19]" on Dec 11, 2024
@wallyqs
Member

wallyqs commented Dec 11, 2024

Hi @mohamedsaleem18, one thing you can try is changing the readiness probe to use /varz, or to use /healthz?js-server-enabled=true instead. That way the service will not detach whenever you stop one of the pods. We are moving away from this default setting in the next version.
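
To see how the candidate readiness endpoints behave on the pod that fails its health check, something like the following rough sketch could be run from inside the pod or through a port-forward. It only compares HTTP status codes and is not an official tool. Assumptions not stated in the thread: the monitoring port is the conventional 8222, and the `MONITOR_BASE` constant and the helper names are made up for illustration. The `js-server-enabled` spelling is taken from the comment above as written; the `js-server-only` and `js-enabled-only` variants are the parameter names listed in the nats-server monitoring documentation as far as I know, so it is worth checking which ones your server version accepts.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode

# Assumption: monitoring is exposed on the conventional port 8222
# (for example via `oc port-forward <pod> 8222:8222`).
MONITOR_BASE = "http://localhost:8222"

# Candidate readiness endpoints to compare, as (path, query parameters).
CANDIDATES = [
    ("/healthz", {}),                              # full health check
    ("/healthz", {"js-server-enabled": "true"}),   # spelling used in the comment above
    ("/healthz", {"js-server-only": "true"}),      # name from the monitoring docs (assumption)
    ("/healthz", {"js-enabled-only": "true"}),     # name from the monitoring docs (assumption)
    ("/varz", {}),                                 # general server info endpoint
]


def probe_url(base, path, params=None):
    """Build a monitoring URL, appending a query string only if params are given."""
    query = urlencode(params or {})
    url = base.rstrip("/") + path
    return url + "?" + query if query else url


def probe(url, timeout=2.0):
    """Return the HTTP status code, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses (such as a failing health check) land here.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


if __name__ == "__main__":
    for path, params in CANDIDATES:
        url = probe_url(MONITOR_BASE, path, params)
        print(url, "->", probe(url))
```

An endpoint that returns 200 while plain /healthz does not is a candidate for the readiness probe. Keep in mind that a looser probe only changes when the pod is marked ready; it does not repair whatever is making the full health check fail.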


3 participants