-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Controling resource deletions #922
Comments
A flag to skip deletion of topology objects defeats the point of having an Operator in the first place. The Operator is meant to abstract the management and lifecycle of topology objects in RabbitMQ. Skipping deletion removes a fundamental part of the Operator functionality, and would result in a mismatch between the Kubernetes state and RabbitMQ state; in other words, it would result in objects not present in Kubernetes, and present in RabbitMQ. |
Introducing a flag to control resource deletion behavior enhances the safety and flexibility of the Messaging Topology Operator without undermining its purpose. Many successful operators, like Crossplane, ArgoCD, Cert-Manager and others provide similar mechanisms to address real-world production needs, ensuring critical resources are not inadvertently deleted. While mismatches between Kubernetes and RabbitMQ states are a concern, they can be mitigated through logging and reconciliation mechanisms. Additionally, we already tested the reconciliation of resources created by the Operator after simulating the orphaned deletion and the existing implementation successfully detected and managed the orphaned resource. Being used wisely this feature should allow easier and broader adoption of the operator for the both teams being early in their k8s journey and teams managing very sensitive RMQ workloads. |
@vavdoshka Why can't you simply disallow deletion of these K8s objects via RBAC (Role Based Access Control)? |
Also, please reference where these other projects support this feature. A quick Google search suggests that this is an alpha feature in Crossplane: It's an open issue in ArgoCD: |
I disagree. The mission of the Operator is to reconcile Kubernetes state into RabbitMQ. This flag is completely opposed to that mission.
How can you inadvertently delete a resource when a deletion requires a explicit
Given that this is the first time this feature has been requested in the ~4 years of existence of this Operator, I'm highly skeptical that it will help to spread adoption. |
Thanks for the feedback. As you rightfully mentioned, there is a logical mistake in mocking the delete; however, it is not exactly what I propose. I propose to let users decide whether to optionally retain the resources that might contain a state influenced by factors outside the operator's responsibility or concern, i.e., queues or streams with data. Optional retain behavior, similar to what K8s implements for PVC Reclaim Policy. Because there might be use cases where deletion is appropriate and, in some cases, not so much. In addition one use-case which we did not face yet but fesable is migration (split/merge) of resources from one namespace to another. In terms of what currently supports these use cases, there are two schools indeed, for example, the Kafka Strimzi operator does not support this, while at the same time:
We are trying to understand if maintainers would be opened to such contribution at current moment in order to plan accordingly the actions on our side. Thanks for this great Operator and your feedback. |
This sounds perfectly reasonable. I'm compelled by the reclaim policy analogy. I agree it makes sense to tweak the "reclaim" behaviour for resources that contain data, in other words: queues, streams and vhosts. A vhost does not contain data per-se, but its sub-resources (queues) do. Other resources that do not contain data e.g. policies, I'm not sure about implementing a reclaim policy.
I'm happy to go forth with a reclaim policy feature for Any thoughts @mkuratczyk @ansd ? |
Policies, runtime parameters, virtual host limits do not need a reclaim policy. It is extremely rare to see those things consuming a non-trivial amount of either memory or disk space. Shovels and Federation upstreams, both based on runtime parameters, might need it because a shovel or a federation link can be configured to have a very high prefetch. Users and permissions need to be treated differently because their rotation is typically not driven by how much memory or compute resources they take but rather by the credential rotation policy, which can vary greatly from deployment to deployment. So I'd leave those two resources out, at first or until/if something more suitable for them is designed and proposed. |
I assume "queues' implicitly includes streams because they are considered a queue type, not a separate entity entirely, even though in most ways they are. |
Is your feature request related to a problem? Please describe.
We are at early stage of adopting the messaging-topology-operator and graduating it's usage to production. Currently removal of a k8s resource triggers the removal of that object in RMQ, we don't feel it safe for our production workloads and would like to control the removal behavior.
Describe the solution you'd like
As a flag provided in runtime controlling resource deletion behavior (and/or) on the resource level.
Describe alternatives you've considered
Using admission controller to prevent resource removal
Additional context
topology-operator can reconcile resources which were requested previously, so it seems the missing piece is logic to ignore delete.
relevant request on rmq-server side rabbitmq/rabbitmq-server#12772
The text was updated successfully, but these errors were encountered: