Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

neonvm: continuous synchronisation of files in pod/vm #1222

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

conradludgate
Copy link

@conradludgate conradludgate commented Jan 27, 2025

For #1213 I need the renewed certificate secret to be updated within the vm.

It's ok if there's some lag on this synchronisation as certificates are renewed with generous amount of time left until expiry.

Design:
neonvm-runner will subscribe to inotify events for a specific list of secrets. If an event is detected, the secret contents are sent in a http request to neonvm-daemon.

The files will be owned by the daemon user, and read-only by postgres.

If there's an error when uploading, then there's a secondary loop every 5 minutes to catch up, using a checksum to compare.

We mirror kubernetes and use atomicwriter to atomically update multiple files in the secret.

Useful links:
https://ahmet.im/blog/kubernetes-secret-volumes-delay/
https://ahmet.im/blog/kubernetes-inotify/

Some discussion here:
https://neondb.slack.com/archives/C03TN5G758R/p1737723555448339
https://neondb.slack.com/archives/C03TN5G758R/p1738669103059979

Copy link

github-actions bot commented Jan 27, 2025

No changes to the coverage.

HTML Report

Click to open

@conradludgate conradludgate marked this pull request as ready for review February 3, 2025 18:38
@conradludgate
Copy link
Author

conradludgate commented Feb 3, 2025

open questions:

  • how to we know which secrets should be synchronised (currently /var/sync)
  • should we support configmap too
  • is relying on ..data from atomicwriter relying on an unstable implementation detail?

@mikhail-sakhnov
Copy link
Contributor

mikhail-sakhnov commented Feb 4, 2025

@conradludgate @sharnoff Have you considered instead of relying on inotify and push model to modify neonvm-daemon to subscribe for secret changes via kube-API to reduce the number of moving parts?
It feels more natural for the k8s ecosystem.

  • add serviceaccount to the pod (or modify default service account to allow fetching secrets, which is bad idea from the security perspective)
  • call kubernetes.default.svc from the VM.
  • [probably, not sure] modify iptables rules to allow calling k8s api from VM

@conradludgate
Copy link
Author

I think this inevitably has the same issue. The service account token updates in the pod, and the VM needs to notice that change

@conradludgate
Copy link
Author

I've looked into service accounts. They also use the same form of atomic directory updates.

Currently there's no projected volume disk source for VM objects, but this is the kinda thing that could be hard-coded in the neomvm-runner code. Add /var/run/secrets/kubernetes.io/serviceaccount/..data to the secrets map and it should just work:tm:

@edude03
Copy link
Contributor

edude03 commented Feb 6, 2025

High level, feels like we need a shared file system primitive for the VMs, I keep seeing items that this would potentially solve.

@conradludgate
Copy link
Author

conradludgate commented Feb 6, 2025

High level, feels like we need a shared file system primitive for the VMs

Not sure exactly what this means

Copy link
Member

@sharnoff sharnoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broadly looking good -- left a thorough review; I expect 1, maybe 2, much smaller rounds remaining

Also, could you update the PR title to have the structure component: Title ? (ordinarily the PR title format checker would catch that, but feat: is a sufficient prefix to trick the regex 😄)
In this case, s/feat/neonvm/ would suffice, but it's up to you

neonvm-daemon/cmd/main.go Show resolved Hide resolved
neonvm-runner/cmd/disks.go Outdated Show resolved Hide resolved
@@ -187,6 +187,10 @@ func createISO9660runtime(
if disk.MountPath != "" {
mounts = append(mounts, fmt.Sprintf(`/neonvm/bin/mkdir -p %s`, disk.MountPath))
}
if diskNeedsSynchronisation(disk) {
// do nothing as we will mount it into the VM via neonvm-daemon later
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Does this mean that programs starting inside the VM may initially not have access to those secrets? (if so: how difficult would it be to guarantee the initial versions of the secret once everything's mounted?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct. Currently the solution is the "fast" 1-second loop to ping neonvm-daemon.

I suppose there is a solution we can do where we could mount the initial values as a static disk with a symlink, but I'm not sure how complicated that would end up being

neonvm-runner/cmd/disks.go Outdated Show resolved Hide resolved
neonvm-runner/cmd/main.go Outdated Show resolved Hide resolved
return fmt.Errorf("could not open file: %w", err)
}

dto := make(map[string]File)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: What does dto mean here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data transfer object (it's very java-y). I couldn't think of a better name at the time

var length []byte

// hash as "{name}\0{len(data)}{data}"
// this prevents any possible hash confusion problems
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this prevents any possible hash confusion problems

Why? (can you elaborate briefly in the comment? (≤ 1 sentence))

pkg/util/checksum.go Outdated Show resolved Hide resolved
tests/e2e/vm-secret-sync/01-assert.yaml Show resolved Hide resolved
@sharnoff sharnoff assigned conradludgate and unassigned sharnoff Feb 10, 2025
@conradludgate conradludgate changed the title feat: continuous synchronisation of files in pod/vm neonvm: continuous synchronisation of files in pod/vm Feb 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants