Skip to content

feat(cert): add certificate manager with hot reloading #8973

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Neon-White
Copy link
Contributor

@Neon-White Neon-White commented Apr 17, 2025

Describe the Problem

NooBaa needs to handle certificate changes in both core and endpoint pods, particularly for service and OCP user certificates. Currently, there's no mechanism to detect certificate changes and automatically update the Node.js certificate bundle, which can lead to connection issues when certificates are added or rotated.

Explain the Changes

TL;DR - the core counterpart for #977.

  1. Added a new certificate manager script (cert_manager.sh) that:
  • Bundles certificates from /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt and /etc/ocp-injected-ca-bundle/ca-bundle.crt
  • Uses polling to detect certificate changes
  • Automatically restarts Node.js processes when certificates change
  • Supports both core (supervisord) and endpoint pods
  • Includes proper error handling and cleanup
  1. Modified noobaa_init.sh to:
  • Start the certificate manager in endpoint pods
  • Handle cleanup on pod termination
  • Maintain compatibility with existing process management (or rather, lack thereof)
  1. Added cert_manager program to noobaa_supervisor.conf for core pods

Issues: Fixed #xxx

  1. https://issues.redhat.com/browse/DFBUGS-2038

Testing Instructions:

  1. Create a backingstore on top of an S3-compatible host which requires self-signed certs for TLS
  2. See that the backingstore fails to reach a healthy state
  3. Add the required certs to the OCP cluster on which NooBaa runs
  4. Wait for 60~ seconds
  5. Verify that the backingstore reaches a healthy state, and can be used
  • Doc added/updated
  • Tests added

@Neon-White Neon-White marked this pull request as ready for review April 22, 2025 12:15
- Add certificate bundling for service CA and OCP CA
- Implement polling-based certificate change detection
- Add automatic process restart on certificate changes
- Support both core (supervisord) and endpoint pods
- Add proper error handling and cleanup

Signed-off-by: Ben <[email protected]>
- Add timestamped logging
- Add '[Cert Manager]' prefix to all log messages for easier log filtering
- Add polling status message showing next check interval
- Maintain error/warning message handling to stderr

Signed-off-by: Ben <[email protected]>
@jackyalbo
Copy link
Contributor

@Neon-White General question here: How does the Node.js process restart in case of updating the NODE_EXTRA_CA_CERTS? FYIK, it only loads the certs in initialization and won't reload unless we restart the process. See also here https://nodejs.org/api/cli.html#node_extra_ca_certsfile
Was this tested?

@Neon-White
Copy link
Contributor Author

Neon-White commented Apr 22, 2025

@jackyalbo - The general logic was tested, but a live reload with actual certificates has not yet been tested.
The script restarts the core by running supervisorctl restart bg_workers hosted_agents webserver (see here) - is any additional restart required?

@jackyalbo
Copy link
Contributor

@jackyalbo - The general logic was tested, but a live reload with actual certificates has not yet been tested. The script restarts the core by running supervisorctl restart bg_workers hosted_agents webserver (see here - is any additional restart required?

Sorry you are correct, I missed that part. I feel this part of restarting our services is a bit risky. I think maybe we should consider one of two, or we should let the user know that once he wants to add new certs to the bundle he will need to restart the pods, or we will restart the pods ourselves - which feels less desired. Maybe as a midway solution, we can issue a warning in noobaa CR that we saw new certs and he should consider restarting in case he wants the new certs to take effect. @liranmauda @dannyzaken what do you guys think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants