Bug Report: Failure in first call to PRS can lead to the cluster having no primary #17710

Open
GuptaManan100 opened this issue Feb 6, 2025 · 0 comments

Comments

@GuptaManan100
Member

Overview of the Issue

If PlannedReparentShard (PRS) fails during the initialisation of a shard, and the failure happens while the new primary is being promoted but before it has had a chance to write to the topo server, the tablet never becomes a primary tablet. VTOrc sees this failure and tries to fix it by calling UndoDemotePrimary, but that call does not change the tablet's type to PRIMARY; it only fixes the MySQL-level settings. As a result, the cluster is left with no primary at all.
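The sequence above can be illustrated with a small toy model. This is only a sketch and does not use Vitess code: the `Tablet` struct, `promote`, `undoDemotePrimary`, `countPrimaries`, and `simulateFailedFirstPRS` are hypothetical names invented for illustration, and the assumption that a new shard starts with three read-only replicas is mine, not taken from the report.

```go
package main

import (
	"errors"
	"fmt"
)

// TabletType is a simplified stand-in for the tablet type recorded in the topo server.
type TabletType string

const (
	Replica TabletType = "REPLICA"
	Primary TabletType = "PRIMARY"
)

// Tablet is a hypothetical toy model: Type mirrors the topo record,
// ReadOnly mirrors the MySQL-level setting.
type Tablet struct {
	Alias    string
	Type     TabletType
	ReadOnly bool
}

// promote models promotion; failBeforeTopoWrite injects the failure
// described in the issue, so the type change is never recorded.
func promote(t *Tablet, failBeforeTopoWrite bool) error {
	if failBeforeTopoWrite {
		return errors.New("simulated failure before topo write")
	}
	t.ReadOnly = false
	t.Type = Primary
	return nil
}

// undoDemotePrimary models the repair as described in the issue:
// it fixes the MySQL-level setting but leaves the tablet type untouched.
func undoDemotePrimary(t *Tablet) {
	t.ReadOnly = false
}

// countPrimaries returns how many tablets are recorded as PRIMARY.
func countPrimaries(tablets []*Tablet) int {
	n := 0
	for _, t := range tablets {
		if t.Type == Primary {
			n++
		}
	}
	return n
}

// simulateFailedFirstPRS runs the failing first PRS on a fresh shard
// (assumed here to hold three read-only replicas), then the repair.
func simulateFailedFirstPRS() []*Tablet {
	tablets := []*Tablet{
		{Alias: "zone1-100", Type: Replica, ReadOnly: true},
		{Alias: "zone1-101", Type: Replica, ReadOnly: true},
		{Alias: "zone1-102", Type: Replica, ReadOnly: true},
	}
	if err := promote(tablets[0], true); err != nil {
		// The repair path runs, but only MySQL-level state changes.
		undoDemotePrimary(tablets[0])
	}
	return tablets
}

func main() {
	tablets := simulateFailedFirstPRS()
	fmt.Println("primaries after repair:", countPrimaries(tablets))
}
```

In this model the candidate tablet ends up writable at the MySQL level while still recorded as REPLICA, so the primary count stays at zero, which mirrors the state the report describes.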

Reproduction Steps

  1. Run PRS, and simulate a failure that happens before the new primary tablet has promoted itself.

Binary Version

main

Operating System and Environment details

-

Log Fragments
