-
Notifications
You must be signed in to change notification settings - Fork 81
Prune Istio Namespace created by Operator when deleting Operator cr #484
base: master
Are you sure you want to change the base?
Conversation
if listenerErr != nil { | ||
log.Errorf("unexpected error occurred invoking ResourceError on listener: %s", listenerErr) | ||
} else if update { | ||
if patch, err = h.CreatePatch(receiver, mutatedObj); err == nil && patch != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe it's better to merge the two if condition checks.
@@ -40,6 +40,17 @@ func (h *HelmReconciler) Prune(all bool) error { | |||
if err != nil { | |||
allErrors = append(allErrors, err) | |||
} | |||
//prune namespace created by istio operator at last |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite a dangerous operation since it will delete anything that the user created in the namespace too. At the very least, we should create a flag to guard this - but IMO it may be better to leave this step up to the user since it's not very onerous. Thoughts @howardjohn @sdake ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not as familiar with common operator patterns, but it seems dangerous to me. I think we should error on the side of NOT deleting stuff - will a user be more upset if they have an empty namespace around, or if we delete their critical applications?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally speaking, we shouldn't be creating or deleting namespaces. If we are creating namespaces, access should be restricted as, by definition, the operator is managing the lifecycle of that namespace.
Typically, an operator will create resources in the same namespace as the CR defining the install/behavior or in a completely segregated namespace, where all interaction would be through the CR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i agree, at the minimum this should be guarded with a flag... like you see a confirmation - do you really want to delete this github repo?
Also, how do you know if the namespace isn't there prior to install of the operator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the namespace is created by operator, we will mark it via adding related operator lables and annotations, eg: install.operator.istio.io/owner-generation
.
If the namespace is created prior to install of the operator, this PR will make sure operator will not touch the namespace(not adding related labels and annotations)
So in this way, we can know whether the namespace is created by operator or not. and we delete them only if it is created by operator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally speaking, we shouldn't be creating or deleting namespaces. If we are creating namespaces, access should be restricted as, by definition, the operator is managing the lifecycle of that namespace.
Typically, an operator will create resources in the same namespace as the CR defining the install/behavior or in a completely segregated namespace, where all interaction would be through the CR.
Assuming the Istio CR is registered in the istio-operator
namespace Is this not handled by https://github.com/istio/operator/blob/master/deploy/operator.yaml#L33-L34? If the answer is yes, perhaps there is some way to enforce the exact behavior we want as you have specified.
@ostromart @howardjohn @rcernich But if the namespace is created by our operator. It means operator need manage the lifecycle of the namespace as said by @rcernich So this is what the PR want to do: Delete the ns if it is create by operator. |
mutatedObj, err = patch.Apply() | ||
if err == nil { | ||
if err = h.customizer.Listener().ResourceUpdated(mutatedObj, receiver); err != nil { | ||
log.Errorf("unexpected error occurred during postprocessing of updated resource: %s", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In very overloaded clusters where the resource request is not processed, simply logging the Error and moving on is not sufficient.
} else { | ||
listenerErr := h.customizer.Listener().ResourceError(obj, err) | ||
if listenerErr != nil { | ||
log.Errorf("unexpected error occurred invoking ResourceError on listener: %s", listenerErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same problem with error processing. There are other instances of this poor error handling in this function that you didn't add. Perhaps you can get it right in your changes and return with a followup PR to address the problems in the remainder of this code?
Cheers
-steve
/test gencheck_operator |
Agree. Objects created by automation should be deleted by said automation. The objection is the side effect of deleting a namespace could have unintended consequences of deleting objects erroneously added by a user in the deployed namespace. This should be simple enough to control given the manifest we ship and the WATCH_NAMESPACE environment variable. If a CR is registered in this WATCH_NAMESPACE and the CR's https://github.com/istio/operator/blob/master/pkg/apis/istio/v1alpha2/istiocontrolplane_types.proto#L186 is set and a namespace is created setting appropriate annotations and labels indicating the namespace was created by the operator, the user no longer has an expectation of data stored within this namespace to be retained. Cheers |
Regarding the namespaces, we shouldn't be creating them at all. We should verify its existence in the validator and reject the CR if the target namespace does not exist or if the user creating the CR does not have permission to access the specified namespace. I believe this could be a potential security issue, e.g. if a user specifies a rogue image along and requests installation into a system namespace. We should really reconsider the current approach. |
+1 to @rcernich shouldn't Istio be installed in the same namespace as the ICP, which implies the namespace already exists? |
We need to create the namespace to accommodate a single line install. There's just no way around it. |
@ostromart the issue is that the user shouldn't be specifying the namespace at all when using the operator. The namespace within which the CR is created is where the operator should create the resources. The operator use case is different from the command-line use case, where the cli is creating resources which need to be created/applied separately. |
@howardjohn there are good reasons for keeping the operator and ICP in a separate ns from Istio, namely creating an administrative boundary and making it easy to uninstall operator without uninstalling Istio. RH take is a step further in their model by having the ICP in a completely different ns. We support that through a flag but currently the flag is set to the operator ns. |
@ostromart, the operator can watch any namespace and should be watching all namespaces. When an ICP is creating in a namespace, that's the namespace that should contain the control plane. The operator should be completely isolated and regular users should be able to create control planes. Typical configuration scenarios for an operator are:
The operator install should not create the namespace. The flow should be:
Removal should be: kubectl delete -n operator-namespace -f operator.yaml The goal: user must be cluster-admin to install operator, but any user should be able to install a control plane. |
@ostromart @sdake @howardjohn @rcernich @linsun SO we have two namespace here: the operator namespace and the istio installation namespace(eg: istio-system). I agree we should keep the operator namespace untouched. Now we will add operator labels and annotations into the istio installation namespace(eg: istio-system) even if it exists before hand. So we should at least make the logic in pkg/helmreconciler/rendering.go within this PR go into master code. -- This is logic to make sure operator will not touch the pre-existed Istio installation namespace. For whether we need prune the istio installation namespace(eg:istio-system) and operator namespace, Let's open an issue and move our discussion there to get a solid agreements. Does this sounds reasonable? |
The user installing the control plane should create the istio-system namespace. They should create the ICP resource in istio-system. (Replace istio-system with any namespace the user wants to use for installation.) |
Rob, what's the motivation for putting the ICP CR in istio-system? It's possible to create rbac that restricts a user from touching anything other than ICP CR in either ns. |
Once again, ICP should sit beside the installed control plane. It should not be collocated with the operator itself. The ICP CR specifies the configuration for a control plane. The operator watches namespaces for ICP resources and act on it accordingly. Regular users should have no access to the operator namespace, especially because the operator service account has elevated privileges, which are required to instantiate resources across the cluster. A regular user should be able to create an ICP in a namespace they have access to, so they can install a control plane, without needing to be cluster-admin. This is the basic problem solved by the operator: normal users can manage a control plane installation. Yes, RBAC can be used to restrict who can create/update/delete ICP CRs, but, as mentioned above, one of the benefits of using an operator is that a user does not need cluster-admin to administer a control plane installation. |
@rcernich sounds consistent to me. Some use cases may wish to restrict who can install a control plane. Istio's control plane uses cluster scoped resources, so we must have the ability to restrict where ICPs can be registered in the model you propose. This more consistent approach doesn't really feel appropriate for a backport. How can we solve this specific problem without major rework for 1.4? Cheers |
Hey @sdake, I'm not sure about 1.4, but based on my limited understand, we should reconfigure the operator deployment so that the operator watches all namespaces. As for users creating ICP resources, that will already be constrained based on RBAC, so admins will need to grant create/update/delete roles to individual users/groups. For folks used to doing everything as cluster-admin, the only difference they'd see is that they create the ICP resource in whatever namespace (e.g. istio-system). I don't know if the namespace fields in ICP are required, but if they're not, they should just be left blank. If not, the operator should balk if the namespace doesn't match the containing namespace and/or a validator checks the setting and rejects it. I'm guessing, of all this, all that needs to be done is modify the operator deployment yaml and override the target namespace with the namespace containing the ICP before rendering the charts. |
Apologies in advance if I'm missing something obvious as security is not my forte. Just trying to understand your proposal better. |
The operator namespace should be managed by cluster-admin, so moving ICP out of operator namespace allows control planes to be created with lower level permissions, although they would still need permissions to create/update/delete ICP resources. When using Operator Lifecycle Manager to install the operator, roles for managing the operator's associated CRDs are created along with the operator and bound into the standard admin/edit/view roles. This effectively means each namespace's administrators may instantiate an ICP resource, and thus their own control plane. This will become much more important when Istio supports multitenancy (similar to Maistra). It also becomes more important as the roles required by the components themselves are reduced. Also, this is the general convention used when working with operators, although most of the k8s oriented examples seem to just dump everything into default namespace (which might be great for kicking tires, but falls short when working in a managed cluster). HTH |
In general I agree with the principles @rcernich mentioned here. Operator is cluster/mesh admin, which has the privilege to read all ICPs from different namespaces. Each namespace admin is responsible for managing the resource in his own namespace. From the examples we discussed, I only see "operator" and "istio-system" namespaces are mentioned. Both should be managed by cluster/mesh admins. Do we have use cases that the ICPs are namespace specific (hence should be set by namespace admins)? |
From the examples we discussed, I only see "operator" and "istio-system"
namespaces are mentioned. Both should be managed by cluster/mesh admins. Do
we have use cases that the ICPs are namespace specific (hence should be set
by namespace admins)?
My understanding: Right now, running Istio basically requires admin
privileges today, so this is less important. In OpenShifts case, they have
some modifications to Istio that actually do allow namespace admins to run
Istio with minimal privileges, multiple Istio installs, etc. We will
hopefully start moving in this direction, so we should make sure the
operator will support this.
I recommend we follow Rob's suggestion of creating ICP in the namespace
that istio will be installed in, to follow the pattern of other operators.
…On Fri, Nov 1, 2019 at 1:45 PM Limin Wang ***@***.***> wrote:
In general I agree with the principles @rcernich
<https://github.com/rcernich> mentioned here. Operator is cluster/mesh
admin, which has the privilege to read all ICPs from different namespaces.
Each namespace admin is responsible for managing the resource in his own
namespace.
From the examples we discussed, I only see "operator" and "istio-system"
namespaces are mentioned. Both should be managed by cluster/mesh admins. Do
we have use cases that the ICPs are namespace specific (hence should be set
by namespace admins)?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#484?email_source=notifications&email_token=AAEYGXORBKQ5LLB6HV3ZUZLQRSINHA5CNFSM4JGVKCO2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEC4DS4A#issuecomment-548944240>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAEYGXKHZNCTDIXN2VO7GFLQRSINHANCNFSM4JGVKCOQ>
.
|
@liminw, two things:
The second point is what is really driving the rationale behind this. The goal of the operator should be to allow self-provisioning within a managed k8s cluster. At present, this would mean that a service mesh admin would have greater privileges than a regular user, but would not have cluster-admin privileges. HTH |
The operator already supports the CR being in any namespace the admin doing the install chooses. I'm still wondering why that should be istio-system today, given that we're trying to restrict user privileges, rather than another ns altogether? Per Limin's point, istio-system should also be for admins. User should only be able to interact with Istio installation through the CR they control. |
The user that creates the control plane namespace (e.g. istio-system) IS the "admin" (for that control plane/mesh). The idea is you have one operator listening/watching all namespaces. Bob creates/installs Istio via a CR he creates in his bob-istio-system and Mary creates/installs Istio via her own CR in mary-istio-system. Each is the admin for their own mesh - each has permissions to their own control plane namespace. They should never have permission to even see istio-operator namespace let alone create objects in that namespace. |
From @rcernich and @jmazzitelli 's comments, there are multiple control planes shared by different tenants, and all of them share the same operator. This requirement seems reasonable for such use case. I think we should follow @rcernich 's suggestion to allow the tenants to own their individual configuration. |
I completely agree that for multi-tenant and/or managed case it makes sense not have the CR in istio-operator. I do still wonder why the cluster admin would install a managed Istio for a user and then afterwards allow them to interact directly with the control plane resources, rather than having a separate config namespace and having them do it solely through the CR. It seems difficult to support as a managed solution, but there's nothing preventing this in the operator - just listen to all namespaces. We can make both changes in a new PR. |
@ostromart, cluster admin just installs the operater. A mesh admin (designated by cluster-admin) manages the control plane entirely: creates namespace, creates CR, etc. This mesh admin should be managing the control plane installation through the CR, not manually. The mesh admin should not be modifying the resources managed by the operator. |
Just to be clear, the tenant owner should NOT be allowed to interact directly with control plane resources that are created by the operator. I think the way you are thinking it should work, is how it works (minus the part about needing a separate config namespace). The tenant owner should configure things solely through the CR - this is true. The tenant owner does not interact directly with control plane resources. They will not (should not) have rights to do that - I believe this will be the job of the cluster-admin to provide the correct permissions to the tenant owners (right @rcernich?). But the tenant owner will have the rights to create, modify, and delete CRs. It should be fine to be able to put the CR directly in the control plane namespace. So, in short (and @rcernich can explain the RBAC issues more in depth and correct me where I am wrong, but this is how I understand it) the tenant owner will not have the necessary permissions to directly manipulate the control plane resources - the tenant owner will only have permissions to manipulate the CR itself. It is the job of the cluster-admin to provide the correct permissions to the proper tenant owners. |
Hey @jmazzitelli, nothing prevents namespace admins from interacting with those objects and currently we don't watch for changes of those resources to overwrite any changes made by somebody other than the operator. Additionally, because we use three way patching during updates, it's likely that any changes made by a user will be persisted through updates, unless they changed something that is explicitly set through the operator. This was done on purpose to give users the ability to workaround any issues or temporarily change things based on operational needs (i.e. escape hatch). That said, they really shouldn't be messing with things, as they can put things into a bad state (e.g. deleting a deployment). There are also practical concerns given the amount of settings that might have to be exposed. In reality, this is no different than working with the charts/templates directly. Having said all that, the user should be doing everything through the CR, even though they may have the ability to interact with the resources created by the operator. (Sort of like buying a car and deciding to work on it yourself.) |
OK, got it. I tell people "don't mess with the operator-created resources - just edit the CR to modify your operand installation." But good analogy with the car - if you know how to change your own oil - feel free. Just don't complain to me when you seize the engine. :) |
After reading through this (long) review, I am unclear on the next actions for @irisdingbj to make to get this PR merged. Options could be simple like "change X" or "change Y" in this specific code. If the answer is more complex, as this review would seem to indicate, then please suggest filing a bug around the points of contention so they may be addressed in followup PRs. It is unreasonable to expect @irisdingbj to resolve all of this thread in one PR - the PR would be massive. I strongly prefer forward iteraataive progress over a stalled PR. Cheers |
@sdake, agreed. For now, we should not prune namespaces. Other details can be worked through separate PR/issues. @irisdingbj, sorry for hijacking this PR, but I think it stimulated an important discussion. |
Yes, agreed. Let's not prune the namespace, and move the other changes out to new PRs. |
I am pretty sure that isn't what I just stated in #484 (comment). To clarify, the right technical choice is to garbage collect any objects created by the operator. This includes namespaces, if they are created by the operator (which they are), even if that has negative side effects because human operators decide to execute operations outside of the best practices operator workflow (which I correctly assume does not include creating Istio resources not via the operator's ICP in the operator's specific namespace). I'm not sure where the confusion came from, but it doesn't matter. Now that is settled... Given this corrected understanding, what should be done with this specific PR which garbage collects resources created by the operator after the ICP resource is removed. The only viable answer I see is moving forward with this PR after the few nit-picks are resolved, and for @rcernich and @ostromart to open issues tracking improving this behavior for 1.5. Cheers |
Wow, a long thread... just to make sure we are on the same page:
I think this PR is for cleaning up with option 1, while a few folks believe we should do option 2 instead thus no namespace cleanup would be required. I do feel option 2 is particular useful for multitenancy install scenarios. |
Yes, exactly why option 2 is good - multi-tenancy use-cases. |
@irisdingbj: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
istio/istio#18384
Do not update Istio namespace if the namespace is not created by Istio Operator
Prune Istio Namespace if it is created by Istio Operator
Leave Istio namespace if it is not created by Istio Operator