Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Introduce OpenStackClusterIdentity Resource for Centralized Credential Management #2386

Open
bnallapeta opened this issue Jan 22, 2025 · 21 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@bnallapeta
Copy link

bnallapeta commented Jan 22, 2025

/kind feature

Describe the solution you'd like

The current implementation of identityRef in OpenStackCluster requires secrets containing credentials (e.g., clouds.yaml) to reside in the same namespace as the OpenStackCluster resource. This limitation poses challenges in environments with namespace-based access control, where sensitive credentials need to remain centralized and secure while still being usable by resources in other namespaces.

To address this, I propose introducing an OpenStackClusterIdentity custom resource, similar to AWSClusterStaticIdentity or AzureClusterIdentity. This resource would allow for centralized credential management while providing fine-grained access control across namespaces.

Proposed Implementation:
1. New CRD: Create OpenStackClusterIdentity to reference credentials stored in a secret, allowing usage across namespaces with access controls.
2. Identity Reference: Update OpenStackCluster.spec.identityRef to reference the new resource instead of secrets directly.
3. Access Control: Implement validation in the CAPO controller to restrict usage based on namespaces or roles.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 22, 2025
@mdbooth
Copy link
Contributor

mdbooth commented Jan 22, 2025

Even implementing the ability to reference cross-namespace credentials raises architectural issues in CAPO akin to requiring a setuid bit in a posix application. Despite what anybody else may have done, my instinct is to actively avoid doing this. It requires us to robustly implement and enforce our own RBAC scheme. I'm not sure we're up for that.

IOW, far from protecting secrets, I believe this feature would open a potential hole where a bug might expose any secret in the cluster.

However, I'm very interested in the use case. Can you explain why the credentials cannot live in the target namespace alongside the CRs which reference them?

@s3rj1k
Copy link

s3rj1k commented Jan 22, 2025

@mdbooth Can you please elaborate on "CAPO akin to requiring a setuid bit in a posix application"?

Assuming that I understand use case correctly, to have what other major CAPI providers have, to have similar API (both AWS and Azure have this).

@wkonitzer
Copy link

Hi @mdbooth Your points about the potential complexity of enforcing a robust RBAC scheme and the risk of inadvertently exposing secrets are well taken.

The use case for cross-namespace credentials stems from the need to centralize credential management, especially in environments with a high number of clusters and frequent credential rotations. Here are a few key reasons why having credentials live in a single namespace is advantageous:

  • Centralized Management: Storing credentials in one namespace allows for better oversight and governance. For organizations managing dozens or hundreds of clusters, centralization ensures consistency, simplifies rotation, and reduces the likelihood of credential sprawl.
  • Improved Security: While I understand your concern about enforcing RBAC, centralizing credentials reduces the overall surface area of sensitive data. Rather than duplicating secrets across namespaces, centralization allows tighter access control and audit capabilities.
  • Operational Efficiency: Rotating credentials across multiple namespaces adds administrative overhead and increases the likelihood of human error, which could lead to downtime or configuration drift. A single source of truth for credentials ensures that changes propagate seamlessly without manual intervention.

I hope this helps clarify the rationale behind the feature request.

@lentzi90
Copy link
Contributor

Thank you for the feature request!
I can see the usefulness of this and also how it is more secure than direct cross namespace references to secrets.
I do share @mdbooth 's concern about bugs in the code though. This would add to CAPO's responsibilities so we do need to be careful.

@mdbooth
Copy link
Contributor

mdbooth commented Jan 23, 2025

For posterity I'll summarise the behaviour of AWSClusterStaticIdentity.

AWSClusterStaticIdentity is a cluster-scoped, non-namespaced custom resource. I assume that users are therefore not expected to create them. It looks like users are not even strictly required to be able to read them, merely know the name of one which will permit their access. They are presumably expected to be created and managed by a service provider.

AWSClusterStaticIdentity includes allowedNamespaces in its spec.

Credentials are stored in a secret in the same namespace the controller is running in.

CAPA has a method isClusterPermittedToUsePrincipal which is responsible for enforcing that only clusters in an allowed namespace may use the credentials: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/4afd74ff9f41ef714af378adabe0f5e4fa66723e/pkg/cloud/scope/session.go#L418-L459

If this method returns true, buildAWSClusterStaticIdentity will return a AWSStaticPrincipalTypeProvider containing the credentials from the secret.

I can see how this addresses the issue of centrally-managed namespaces.

At a minimum, in this context the methods buildAWSClusterStaticIdentity and isClusterPermittedToUsePrincipal are both 'security critical', meaning that an error here could allow a tenant to use another tenant's credentials.

@mdbooth
Copy link
Contributor

mdbooth commented Jan 23, 2025

Have you considered using a tool which automatically copies secrets from a central namespace to target namespace, and updates them automatically in line with the source credentials? That solves the maintenance problem while greatly reducing the potential for security bugs in code.

But I guess you don't want to expose the OpenStack credentials to your users at all? What's the thinking here?

@s3rj1k
Copy link

s3rj1k commented Jan 24, 2025

Have you considered using a tool which automatically copies secrets from a central namespace to target namespace, and updates them automatically in line with the source credentials

This is what I was mentioning in Slack, this is exactly what I would personally like to avoid, given that ClusterIdentity is common practice in CAPI

@bnallapeta
Copy link
Author

@mdbooth Yes, we want to avoid this as this exposes the cluster secret to all the other namespaces where we would want to provision a cluster.

@bnallapeta
Copy link
Author

bnallapeta commented Jan 27, 2025

@mdbooth so, based on the conversation here and on Slack, are you now convinced that ClusterIdentity should be the way to address this? Please let me know. I am happy to take it up and work on it.

We could discuss further if we reach a conclusion here.

@richardcase
Copy link
Member

Yes, we want to avoid this as this exposes the cluster secret to all the other namespaces where we would want to provision a cluster.

We have this functionality in CAPA (i am a maintainer) and this is a valid concern. In CAPA on our equivalent to this (AWSClusterIdentitySpec) we have the ability to lock down using the identity from specific namespaces via a AllowedNamespaces field. This could be a good way around this concern.

@mdbooth
Copy link
Contributor

mdbooth commented Jan 27, 2025

Yes, we want to avoid this as this exposes the cluster secret to all the other namespaces where we would want to provision a cluster.

@bnallapeta Can you explain this further? As I mentioned before we have a custom solution to this in OpenShift, but looking around it seems to have been solved a few times. Here's an example that looks very much like the sort of thing I'd expect to use for this: https://medium.com/lonto-digital-services-integrator/why-we-developed-own-kubernetes-controller-to-copy-secrets-e46368ae6db9

In this model:

  • Credentials are centrally managed
  • Credentials are copied only to the namespaces which need them
  • Updating a centrally managed credential updates it everywhere
  • Deleting a centrally managed credential deletes it everywhere

It also requires no code in CAPO which is directly responsible for mediating access to Secrets.

I also have to consider that we would need to add support for cross-namespace Secret references to ORC. I was hoping to be able to switch ORC, and maybe even CAPO, to a per-namespace controller model, which would have the benefit of load sharding in addition to greatly reduced RBAC requirements for each controller. This is a secondary consideration, but I bring it up because it's on my mind.

I want to be 100% clear that I'm not rejecting this proposal1, btw. AWSClusterStaticIdentity is a much safer model than 'bare' cross-namespace secret references. I especially like the control that users (presumably) can't create them. Unlike cross-namespace references, I can see a path to personally approving of this, especially as it seems to be an established pattern. However, it would represent a significant change in secure coding requirements in CAPO, and I hope we would not choose to do that without a complete understanding of why we rejected the alternatives. In particular, I would very much like to understand the concerns around using a secret controller. I don't yet understand this.

Footnotes

  1. For what that would be worth. There are 2 other maintainers who are welcome to disagree with me!

@lentzi90
Copy link
Contributor

I also have to consider that we would need to add support for cross-namespace Secret references to ORC. I was hoping to be able to switch ORC, and maybe even CAPO, to a per-namespace controller model, which would have the benefit of load sharding in addition to greatly reduced RBAC requirements for each controller. This is a secondary consideration, but I bring it up because it's on my mind.

With the risk of going off topic, I don't think this should be an either/or question. Admins should be able to run CAPO namespaced or cluster scoped, and this is already possible.
I think it is important that we keep this flexibility, but other than that I do not have an issue with this feature request.

How does the cluster identity function if the controller is limited to a single namespace? I assume it is possible to run CAPA namespaced still with this. @richardcase @bnallapeta

@s3rj1k
Copy link

s3rj1k commented Jan 27, 2025

In this model:
* Credentials are centrally managed
* Credentials are copied only to the namespaces which need them
* Updating a centrally managed credential updates it everywhere
* Deleting a centrally managed credential deletes it everywhere

Lets assume running multiple CAPx providers in cluster, all of them use ClusterIdentity model and only CAPO needs that special treatment.

@richardcase
Copy link
Member

How does the cluster identity function if the controller is limited to a single namespace? I assume it is possible to run CAPA namespaced still with this. @richardcase @bnallapeta

Although it may be technically possible to run multiple instances of CAPA and each one look at a specific namespace, we don't say this is supported, and we don't have any tests for this. There's also the question do we honour "watch namespace" in all controllers and in CAPI itself.

But ignoring that, as there is no controller for AWSClusterIdentity multiple instances of CAPA could still read the identify for the clusters it manages.

@lentzi90
Copy link
Contributor

Thanks, got it. Honestly, the situation sounds quite similar to CAPO. We do not test the watch namespace flag anywhere that I know if.

I think a plan for how to support single namespace CAPO with cluster identity would still be good though. It would be unfortunate to break the functionality by ignorance.

@mdbooth
Copy link
Contributor

mdbooth commented Jan 27, 2025

A thought experiment. Lets say we added a new OpenStackClusterStaticIdentity API which is identical to AWSClusterStaticIdentity in all relevant ways. However, instead of implementing it as library code in CAPO, we instead implement it as a Secret manager in a separate controller. The behaviour of this controller is:

  • Copy the static credentials from the central namespace to all AllowedNamespaces
  • Update all copies when they are updated centrally
  • Delete credential copies either when they no longer match AllowedNamespaces, or when the owning OpenStackClusterStaticIdentity is deleted

From an API POV this would be identical to the UX of AWSClusterStaticIdentity (unless I missed something). The only difference I can see is that the credentials themselves are now directly visible to users, whereas previously they are only indirectly available.

Continuing the thought experiment to phase 2:

Instead of copying the credentials, the controller provisions new application credentials for each target namespace using the central credentials. This has the added advantage of being able to correlate the OpenStack audit trail to a specific namespace.

@wkonitzer
Copy link

Hi @mdbooth,

I looked at the CAPI spec, and potentially, the current behavior of CAPO is not in the spirit of the spec. The current behavior of CAPO concerning identityRef requiring secrets to reside in the same namespace as the OpenStackCluster resource does not explicitly violate the CAPI spec, but it does diverge from the recommended approach of using ClusterIdentity for credential abstraction. Here's my understanding:

CAPI Spec on Credential Management

  1. ClusterIdentity:
  • The CAPI spec introduces the ClusterIdentity resource to abstract and manage credentials securely. This resource allows you to reference credentials (e.g., secrets) that can be used across namespaces, enabling centralized credential management.
  1. Namespace-Scoped Secrets:
  • While the CAPI spec does not explicitly mandate that secrets must be cross-namespace accessible, it provides a mechanism (ClusterIdentity) to securely abstract and reference credentials across namespaces. This mechanism is considered best practice.
  1. Provider-Specific Behavior:
  • The spec allows providers to implement their specific behavior. However, providers are encouraged to align with the CAPI spec to maintain consistency and interoperability across different providers.

CAPI Security Guidelines

  • Credentials Management: Ensure credentials used by Cluster API are least privileged. Apply access control to Cluster API controller namespaces, restricting unauthorized access to cloud administrators only.

How CAPO Diverges

  • CAPO does not implement the ClusterIdentity resource, which other providers use to handle cross-namespace credential references.
  • Instead, CAPO enforces that the referenced secret (e.g., openstack-cloud-config) must reside in the same namespace as the OpenStackCluster. This creates additional complexity for users who want to centralize credentials.

Does this behavior violate the CAPI Spec? Not explicitly, as the CAPI spec does not strictly require providers to use ClusterIdentity. However, the absence of ClusterIdentity or an equivalent abstraction in CAPO can be seen as a misalignment with the spirit of the spec.

Unfortunately, only the AWS, Azure, OCI, and vSphere providers implement ClusterIdentity, enabling a more seamless and flexible credential management experience. Therefore, one could argue CAPO is aligned with the majority and thus is correct.
We could propose a CAPI spec update to enforce the ClusterIdentity resource, but it doesn't make sense for providers like metal3 or Kubevirt, for example.

The outcome we are trying to achieve is a consistent user experience across providers:

  • Credential management should "just work" regardless of the provider.
  • While the mechanism (e.g ClusterIdentity, or some other mechanism) may differ, the experience for end users (e.g., managing centralized credentials, cross-namespace access) should be seamless.

If needed, we could certainly build something like a secrets management controller to copy secrets, so no changes would be required in CAPO. However, this approach introduces several challenges:

  • Inconsistent Behavior: CAPO would behave differently from other similar providers that natively support centralized credential management (e.g., through ClusterIdentity). This inconsistency would create a steeper learning curve for users and potentially lead to confusion when switching between providers.
  • Increased Complexity: Adding a separate secrets controller introduces additional components to manage, deploy, and troubleshoot. This increases operational complexity, which goes against the goal of making credential management seamless and straightforward.
  • Code Maintenance: Having a separate secrets controller to address CAPO-specific behavior would make our overall codebase harder to maintain. It would require us to maintain provider-specific logic outside of CAPO, rather than addressing the root cause within the provider itself.
  • Potential Security Risks: While a secrets controller could automate the propagation of secrets, it also expands the attack surface. Maintaining a robust, secure solution for managing secrets across namespaces is better handled as part of CAPO, where proper guardrails and mechanisms can be directly integrated.

Ultimately, aligning CAPO with the behavior of other providers by supporting a centralized abstraction for credentials (or implementing an equivalent to ClusterIdentity) would result in a more consistent user experience, simplify operations, and reduce technical debt in the long run.

Hopefully, that provides some extra info outside of looking purely at the code.

@lentzi90
Copy link
Contributor

@wkonitzer could you add some links to this CAPI spec on ClusterIdentity? I cannot find any in their documentation, only an example for how it is handled during clusterctl move.

@mdbooth
Copy link
Contributor

mdbooth commented Jan 28, 2025

@wkonitzer Firstly let me reiterate that I accept the use case! It wasn't one that I was previously aware of, but new use cases are welcome.

Secondly, I appreciate the value of consistency. If this is a thing we're going to do, copying an existing pattern makes a great deal of sense. If we decide to do this, adding an OpenStackClusterStaticIdentity seems reasonable.

Lastly, I'm not 100% opposed to implementing this API similarly to how CAPA has done it. However, I am concerned that it opens new security concerns that CAPO has not previously had to consider. Specifically, in this model CAPO can now allow an object in one namespace to reference a secret in another namespace, whereas previously it could not. This opens the possibility of a new class of bug whose potential impact is a privilege escalation allowing a user able to create CAPO objects to access to any secret in the cluster. I would like to explore options for meeting the use case, possibly with the proposed API, with an architecture that does not enable this class of bug.

For the sake of argument, lets assume that we will implement OpenStackClusterStaticIdentity. Lets reduce the implementation of this API to 2 options:

  • AWS-style direct cross-namespace referencing
  • An independent 'secret manager' which copies credentials to where they are used

We assume that the API, and therefore the UX, in both cases is literally identical. Adding a new API requires new code in CAPO in either case.

So I think the only points left to consider are the final 4:

  • Inconsistent Behavior: I don't believe the behaviour of an implementation using a secrets manager would be inconsistent.
  • Increased Complexity: I'd argue it's secure architecture. It would move code requiring different security considerations into a separate component, reducing the opportunities for security bugs in the rest of the code base.
  • Code maintenance: If we were going to add this as a new CAPO API it would need an implementation in CAPO.
  • Potential Security Risks: I don't yet follow the argument about an increased attack surface. I'm arguing the opposite. What additional risks do you see?

There are trade-offs here, and it's quite possible I still don't fully understand your constraints. Can you explain why an implementation of OpenStackClusterStaticIdentity which copied secrets would not meet your use case?

@mdbooth
Copy link
Contributor

mdbooth commented Feb 5, 2025

We discussed this in the CAPO office hours call today. With the requirement to not expose credentials to users in a particular namespace, I don't see any realistic option which doesn't involve accessing secrets in another namespace.

So I'm personally convinced the best way forward is the one suggested here, namely to copy the pattern set by AWSClusterStaticIdentity.

My concerns around arbitrary secret access remain, though. I notice that an AWS credential specify's the secret's name but not its namespace. The namespace is hardcoded to be the same namespace that the controllers are running in: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/d020303ce12b4488aadcf7009f2cbe3307e61079/pkg/cloud/scope/session.go#L347

I think this is a good and important control for a couple of reasons. Firstly it means that it is not possible for the user to express a reference to a secret we do not want them to access, e.g. something in kube-system. Therefore, it is less likely we could write a bug which accidentally enabled this access.

Secondly, it means that in a namespace-restricted deployment, while we can no longer restrict permissions to a single namespace, we can at least restrict it to a known set of namespaces.

So:

  • All cluster static credential secrets must be in the same namespace
  • Users with restricted privileges cannot create static cluster identity objects
  • Static cluster identities do not specify a secret namespace, because they are all the same

We could follow the AWS pattern and have this be 'manager namespace', or we could even allow it to be specified on the command line. As long as they are static for a particular deployment, it remains possible to write RBAC which allows access to secrets only in the intended namespaces.

@s3rj1k
Copy link

s3rj1k commented Feb 5, 2025

allow it to be specified on the command line

+1 for this, let consumer decide in some opt-in way with default in-place

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature.
Projects
Status: Inbox
Development

No branches or pull requests

7 participants