Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add kubernetes election module #3721

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

osmman
Copy link

@osmman osmman commented Jan 27, 2025

This PR adds a new leader election implementation using the Kubernetes Lease API.

By leveraging Kubernetes Lease resources, this implementation provides an alternative backend for leader election. It simplifies deployments on Kubernetes by removing the need for a secondary etcd cluster when running Kubernetes workloads.

Fixes #3431

Checklist

Copy link

google-cla bot commented Jan 27, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@osmman osmman force-pushed the k8s-election branch 3 times, most recently from 922ebcf to 68897bf Compare January 27, 2025 16:29
Copy link
Contributor

@mhutchinson mhutchinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR. Are you happy to be an owner of the new k8s election code module? The core team doesn't have the time or expertise to take on ownership of this. On the whole this role is largely being responsible for reviewing code changes from other users and any dependency updates that require code changes.

I've left some comments on the old code that was refactored to make space for this but I haven't reviewed the new code yet. I'm at a summit this week but this is on my radar for when I have time.

cmd/trillian_log_signer/main.go Outdated Show resolved Hide resolved
util/election2/etcd/provider.go Outdated Show resolved Hide resolved
util/election2/etcd/provider.go Outdated Show resolved Hide resolved
hostname, _ := os.Hostname()
instanceID := fmt.Sprintf("%s.%d", hostname, os.Getpid())

servers := flag.Lookup("etcd_servers").Value.String()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flag.Lookup is a bit gnarly but I see that there aren't many other options without rearchitecting a lot of stuff.

Comment on lines 52 to 53
// NewFactory builds an election factory that uses the given parameters. The
// passed in etcd client should remain valid for the lifetime of the object.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is a bit stale.

@@ -0,0 +1,71 @@
// Copyright 2017 Google LLC. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The year is out of date in a number of new files.

Introduce a new election module using the Kubernetes Lease API. Modify
the existing mechanism to allow selection of election mechanisms by
introducing the `--election_system` parameter, following the same
pattern used for storage and quota system selection.
@osmman
Copy link
Author

osmman commented Jan 31, 2025

Thanks for this PR. Are you happy to be an owner of the new k8s election code module? The core team doesn't have the time or expertise to take on ownership of this. On the whole this role is largely being responsible for reviewing code changes from other users and any dependency updates that require code changes.

Yes, I can take ownership

Copy link
Contributor

@mhutchinson mhutchinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Tomas, thanks for confirming you can take ownership of this submodule. That's great.

I've put a couple of comments on the existing files. My primary concern is that we don't break behaviour for existing users, and that we avoid too much entangling with introducing this new module. Once I'm happy with the changes around the old codebase I'll take a deeper dive review into the new module implementation.

BTW, one thing you can do that will make reviews far easier for me is to keep each round of commits in their own commit in the chain. We'll squash them together when merging at the end. This makes it much easier for me to step through the new code changes and be sure that no changes have been made to code I've already reviewed. Thanks!

Comment on lines 9 to 10
_ "github.com/google/trillian/util/election2/etcd"
_ "github.com/google/trillian/util/election2/k8s"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the same technique to initialize the defaults here as we do with storage and quota? It looks you've added an election2.Providers so we should be able to define a DefaultElectionSystem and initialize it in init much like the other subsystems.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, are you referring only to creating the DefaultElectionSystem with its initialization (similar to storage and quota), or are you also expecting the implementation of build tags (as in the storage module) to optimize the final binary?

If you want to add build tags for the election systems, I'll need some clarification on how to handle etcd. Currently, etcd is always included in the binary, so build tags don't have any effect on it.

go run -tags mysql ./cmd/trillian_log_signer -h 2>&1 | grep system
 -quota_system string
        Quota system to use. One of: [noop mysql etcd] (default "mysql")

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added commit which uses build tags. Right now it is behave like this:

Build tag none:

go run ./cmd/trillian_log_signer -h 2>&1 | grep system
-election_system string
        Election system to use. One of: [noop k8s etcd] (default "etcd")
-quota_system string
        Quota system to use. One of: [noop crdb mysql postgresql etcd] (default "mysql")        

Build tag k8s:

go run -tags k8s ./cmd/trillian_log_signer -h 2>&1 | grep system
-election_system string
        Election system to use. One of: [noop k8s] (default "k8s")
-quota_system string
        Quota system to use. One of: [noop crdb mysql postgresql etcd] (default "mysql")

Build tag etcd:

go run -tags etcd ./cmd/trillian_log_signer -h 2>&1 | grep system
-election_system string
        Election system to use. One of: [noop etcd] (default "etcd")
-quota_system string
        Quota system to use. One of: [mysql postgresql etcd noop crdb] (default "mysql")        

util/election2/etcd/provider.go Outdated Show resolved Hide resolved
@mhutchinson mhutchinson marked this pull request as ready for review February 5, 2025 12:03
@mhutchinson mhutchinson requested a review from a team as a code owner February 5, 2025 12:03
@mhutchinson mhutchinson requested a review from roger2hk February 5, 2025 12:03
@osmman osmman requested a review from mhutchinson February 7, 2025 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Leader election using Kubernetes leases
2 participants