VReplication Workflows: Add functionality to provide a way to diagnose and repair broken workflows #17736

Open
rohit-nayak-ps opened this issue Feb 11, 2025 · 0 comments

Feature Description

There are times when SwitchTraffic or ReverseTraffic fails or times out. There is a built-in rollback that reverts the state changes, but it can also fail, leaving the workflow in an inconsistent state. The root cause could be a Vitess bug or a network or infrastructure failure. Today operators fix these inconsistencies manually, which is error-prone and can cause serious issues such as writes being routed to the wrong keyspace.

It would be nice to have a simple summary for a workflow that looks at its overall state, provides just enough information to determine the current state, and flags any inconsistencies. In addition, it would be useful to have a repair command that takes the desired state and "fixes" the inconsistencies accordingly.
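
To make the idea concrete, here is a rough sketch of what "diagnose against a desired state" could look like. It is only an illustration under stated assumptions: `WorkflowSnapshot`, `diagnose`, `repair`, and the three fields are hypothetical names invented for this example, the model is far simpler than the state Vitess actually tracks, and nothing here reflects an existing Vitess API or command.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Side(Enum):
    SOURCE = "source"
    TARGET = "target"


@dataclass(frozen=True)
class WorkflowSnapshot:
    """Hypothetical, simplified view of a workflow (not Vitess's real model)."""
    routing_rules: Side      # side that routing rules send writes to
    writes_allowed_on: Side  # side whose primaries currently accept writes
    replication_into: Side   # side that the running streams copy data into


def expected_snapshot(desired_writes: Side) -> WorkflowSnapshot:
    """Build the consistent snapshot for the side that should take writes."""
    other = Side.SOURCE if desired_writes is Side.TARGET else Side.TARGET
    return WorkflowSnapshot(
        routing_rules=desired_writes,
        writes_allowed_on=desired_writes,
        replication_into=other,  # streams flow toward the non-writing side
    )


def diagnose(actual: WorkflowSnapshot, desired_writes: Side) -> list[str]:
    """Return one message per field that disagrees with the desired state."""
    expected = expected_snapshot(desired_writes)
    problems = []
    for name in ("routing_rules", "writes_allowed_on", "replication_into"):
        got, want = getattr(actual, name), getattr(expected, name)
        if got is not want:
            problems.append(f"{name}: is {got.value}, expected {want.value}")
    return problems


def repair(actual: WorkflowSnapshot, desired_writes: Side) -> WorkflowSnapshot:
    """Return the snapshot a repair should converge on.

    A real command would apply each change to the cluster; this sketch
    only computes the end state.
    """
    return expected_snapshot(desired_writes)


# A failed rollback left routing rules on the target while writes
# are still only allowed on the source.
broken = WorkflowSnapshot(
    routing_rules=Side.TARGET,
    writes_allowed_on=Side.SOURCE,
    replication_into=Side.TARGET,
)
for problem in diagnose(broken, Side.SOURCE):
    print(problem)
```

Having the operator pass the desired state explicitly, rather than having the tool guess it, keeps the summary and the repair driven by the same comparison.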
