|
| 1 | +# Etcd Cluster Components |
| 2 | + |
| 3 | +For every `Etcd` cluster that it provisions, `etcd-druid` deploys a set of resources. The following sections describe each resource and link to its code. |
| 4 | + |
| 5 | +## StatefulSet |
| 6 | + |
| 7 | +[StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/) is the primary Kubernetes resource that gets provisioned for an etcd cluster. |
| 8 | + |
| 9 | +* Replicas for the StatefulSet are derived from `Etcd.Spec.Replicas` in the custom resource. |
| 10 | + |
| 11 | +* Each pod comprises two containers: |
| 12 | + * `etcd-wrapper` : This is the main container which runs an etcd process. |
| 13 | + |
| 14 | + * `etcd-backup-restore` : This is a side-container which does the following: |
| 15 | + |
| 16 | + * Orchestrates the initialization of etcd. This includes validating any existing etcd data directory and, for a single-member etcd cluster, restoring it if the data directory files are corrupt. |
| 17 | + * Periodically renews the member lease. |
| 18 | + * Optionally takes schedule-based and threshold-based delta and full snapshots and pushes them to a configured object store. |
| 19 | + * Orchestrates scheduled etcd-db defragmentation. |
| 20 | + |
| 21 | + > NOTE: This is not a complete list of the functionality offered by `etcd-backup-restore`. |
| 22 | + |
| 23 | +**Code reference:** [StatefulSet-Component](https://github.com/gardener/etcd-druid/tree/480213808813c5282b19aff5f3fd6868529e779c/internal/component/statefulset) |
| 24 | + |
| 25 | +> For detailed information on each container, see the [etcd-wrapper](https://github.com/gardener/etcd-wrapper) and [etcd-backup-restore](https://github.com/gardener/etcd-backup-restore) repositories. |
| 26 | + |
| 27 | +## ConfigMap |
| 28 | + |
| 29 | +Every `etcd` member requires [configuration](https://etcd.io/docs/v3.4/op-guide/configuration/) with which it must be started. `etcd-druid` creates a [ConfigMap](https://kubernetes.io/docs/concepts/configuration/configmap/) which gets mounted onto the `etcd-backup-restore` container. The `etcd-backup-restore` container modifies the etcd configuration and serves it to the `etcd-wrapper` container upon request. |
| 30 | + |
| 31 | +**Code reference:** [ConfigMap-Component](https://github.com/gardener/etcd-druid/tree/480213808813c5282b19aff5f3fd6868529e779c/internal/component/configmap) |
| 32 | + |
| 33 | +## PodDisruptionBudget |
| 34 | + |
| 35 | +An etcd cluster requires quorum for all write operations. Clients can additionally configure quorum-based reads to ensure [linearizable](https://jepsen.io/consistency/models/linearizable) reads (kube-apiserver's etcd client is configured for linearizable reads and writes). [Failure tolerance](https://etcd.io/docs/v3.3/faq/#what-is-failure-tolerance) for an etcd cluster with `n` replicas is computed as `(n-1)/2`, rounded down. For example, a cluster of size 3 tolerates only 1 member failure. |
| 36 | + |
| 37 | +To ensure that no more etcd pods are evicted than the cluster's failure tolerance allows, `etcd-druid` creates a [PodDisruptionBudget](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#pod-disruption-budgets). |
| 38 | + |
| 39 | +> **NOTE:** For a single-node etcd cluster a `PodDisruptionBudget` is still created, but `pdb.spec.minAvailable` is set to 0, effectively disabling it. |
| 40 | + |
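The arithmetic above can be sketched in Go. This is a minimal illustration, not the `etcd-druid` implementation: the function names are hypothetical, and `PDBMinAvailable` assumes that for multi-member clusters the budget is set to the quorum size, which the text above does not state explicitly.

```go
package main

// FailureTolerance returns how many members of an etcd cluster with the
// given number of replicas may fail while quorum is kept: (n-1)/2,
// using integer division.
func FailureTolerance(replicas int) int {
	if replicas < 1 {
		return 0
	}
	return (replicas - 1) / 2
}

// Quorum returns the number of members that must agree on a write: n/2 + 1.
func Quorum(replicas int) int {
	if replicas < 1 {
		return 0
	}
	return replicas/2 + 1
}

// PDBMinAvailable is a hypothetical helper. It returns 0 for a single-member
// cluster (which effectively disables the budget, as noted above) and
// ASSUMES the quorum size for larger clusters.
func PDBMinAvailable(replicas int) int {
	if replicas <= 1 {
		return 0
	}
	return Quorum(replicas)
}
```

Under these assumptions a 3-member cluster has a quorum of 2 and tolerates 1 failure, and a 5-member cluster has a quorum of 3 and tolerates 2.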
| 41 | +**Code reference:** [PodDisruptionBudget-Component](https://github.com/gardener/etcd-druid/tree/480213808813c5282b19aff5f3fd6868529e779c/internal/component/poddistruptionbudget) |
| 42 | + |
| 43 | +## ServiceAccount |
| 44 | + |
| 45 | +The `etcd-backup-restore` container, which runs as a sidecar in every etcd member, requires permissions to access resources such as `Lease` and `StatefulSet`. A dedicated [ServiceAccount](https://kubernetes.io/docs/concepts/security/service-accounts/) is created per `Etcd` cluster for this purpose. |
| 46 | + |
| 47 | +**Code reference:** [ServiceAccount-Component](https://github.com/gardener/etcd-druid/tree/3383e0219a6c21c6ef1d5610db964cc3524807c8/internal/component/serviceaccount) |
| 48 | + |
| 49 | +## Role & RoleBinding |
| 50 | + |
| 51 | +The `etcd-backup-restore` container, which runs as a sidecar in every etcd member, requires permissions to access resources such as `Lease` and `StatefulSet`. A dedicated [Role]() and [RoleBinding]() are created and linked to the [ServiceAccount](https://kubernetes.io/docs/concepts/security/service-accounts/) created per `Etcd` cluster. |
| 52 | + |
| 53 | +**Code reference:** [Role-Component](https://github.com/gardener/etcd-druid/tree/3383e0219a6c21c6ef1d5610db964cc3524807c8/internal/component/role) & [RoleBinding-Component](https://github.com/gardener/etcd-druid/tree/master/internal/component/rolebinding) |
| 54 | + |
| 55 | +## Client & Peer Service |
| 56 | + |
| 57 | +To enable clients to connect to an etcd cluster, a ClusterIP `Client` [Service](https://kubernetes.io/docs/concepts/services-networking/service/) is created. To enable `etcd` members to talk to each other (for discovery, leader election, raft consensus, etc.), `etcd-druid` also creates a [Headless Service](https://kubernetes.io/docs/concepts/services-networking/service/#headless-services). |
| 58 | + |
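To illustrate why the peer service is headless: Kubernetes gives each StatefulSet pod a stable DNS name of the form `<pod>.<service>.<namespace>.svc` under its headless service, so members can address each other individually. The sketch below builds such peer URLs. The function and all names passed to it are hypothetical, and the scheme and port are assumptions that depend on the cluster's TLS and port configuration (2380 is the etcd default peer port).

```go
package main

import "fmt"

// PeerURLs returns one peer URL per StatefulSet pod, using the stable
// per-pod DNS names that a headless Service provides. All arguments are
// illustrative; this helper is not taken from etcd-druid.
func PeerURLs(stsName, peerService, namespace, scheme string, port, replicas int) []string {
	if replicas < 0 {
		replicas = 0
	}
	urls := make([]string, 0, replicas)
	for i := 0; i < replicas; i++ {
		// StatefulSet pods are named <statefulset-name>-<ordinal>.
		host := fmt.Sprintf("%s-%d.%s.%s.svc", stsName, i, peerService, namespace)
		urls = append(urls, fmt.Sprintf("%s://%s:%d", scheme, host, port))
	}
	return urls
}
```

For example, `PeerURLs("etcd-main", "etcd-main-peer", "default", "https", 2380, 3)` yields three URLs, the first being `https://etcd-main-0.etcd-main-peer.default.svc:2380`.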
| 59 | +**Code reference:** [Client-Service-Component](https://github.com/gardener/etcd-druid/tree/480213808813c5282b19aff5f3fd6868529e779c/internal/component/clientservice) & [Peer-Service-Component](https://github.com/gardener/etcd-druid/tree/480213808813c5282b19aff5f3fd6868529e779c/internal/component/peerservice) |
| 60 | + |
| 61 | +## Member Lease |
| 62 | + |
| 63 | +Every member in an `Etcd` cluster has a dedicated [Lease](https://kubernetes.io/docs/concepts/architecture/leases/), which signifies that the member is alive. It is the responsibility of the `etcd-backup-restore` sidecar container to periodically renew the lease. |
| 64 | + |
| 65 | +> Today, the lease object is also used to indicate the member ID and the role of the member in an etcd cluster. Possible roles are `Leader` and `Member` (which denotes that this is a member but not a leader). This will change in the future with the [EtcdMember resource](https://github.com/gardener/etcd-druid/blob/3383e0219a6c21c6ef1d5610db964cc3524807c8/docs/proposals/04-etcd-member-custom-resource.md). |
| 66 | + |
| 67 | +**Code reference:** [Member-Lease-Component](https://github.com/gardener/etcd-druid/tree/3383e0219a6c21c6ef1d5610db964cc3524807c8/internal/component/memberlease) |
| 68 | + |
| 69 | +## Delta & Full Snapshot Leases |
| 70 | + |
| 71 | +One of the responsibilities of the `etcd-backup-restore` container is to take periodic or threshold-based snapshots (delta and full) of the etcd DB. Today, `etcd-backup-restore` communicates the end revision of the latest full and delta snapshots to the `etcd-druid` operator via leases. |
| 72 | + |
| 73 | +`etcd-druid` creates two [Lease](https://kubernetes.io/docs/concepts/architecture/leases/) resources, one for delta snapshots and another for full snapshots. The operator uses the revisions recorded in these leases to trigger [snapshot-compaction](../proposals/02-snapshot-compaction.md) jobs. Snapshot leases are also used to derive the health of backups, which is reflected in the `Status` subresource of every `Etcd` resource. |
| 74 | + |
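One way the operator could turn the two recorded revisions into a compaction decision is to compare their difference against a threshold. The sketch below rests on two assumptions that are not stated above: that each lease carries its revision as a decimal string, and that compaction is triggered once the delta revision runs ahead of the full revision by a configurable number of events. `ShouldTriggerCompaction` and its parameters are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
)

// ShouldTriggerCompaction parses the revisions recorded for the full and
// delta snapshots and reports whether the gap between them has reached
// eventsThreshold. Both the string encoding and the threshold rule are
// assumptions made for illustration.
func ShouldTriggerCompaction(fullRevision, deltaRevision string, eventsThreshold int64) (bool, error) {
	full, err := strconv.ParseInt(fullRevision, 10, 64)
	if err != nil {
		return false, fmt.Errorf("parse full snapshot revision %q: %w", fullRevision, err)
	}
	delta, err := strconv.ParseInt(deltaRevision, 10, 64)
	if err != nil {
		return false, fmt.Errorf("parse delta snapshot revision %q: %w", deltaRevision, err)
	}
	return delta-full >= eventsThreshold, nil
}
```

With a threshold of 400, revisions `"1000"` (full) and `"1500"` (delta) would trigger compaction, while `"1000"` and `"1100"` would not.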
| 75 | +> In the future, these leases will be replaced by the [EtcdMember resource](https://github.com/gardener/etcd-druid/blob/3383e0219a6c21c6ef1d5610db964cc3524807c8/docs/proposals/04-etcd-member-custom-resource.md). |
| 76 | + |
| 77 | +**Code reference:** [Snapshot-Lease-Component](https://github.com/gardener/etcd-druid/tree/3383e0219a6c21c6ef1d5610db964cc3524807c8/internal/component/snapshotlease) |