Flux v2.2.0 release blog
Signed-off-by: Hidde Beydals <[email protected]>
hiddeco committed Dec 12, 2023
1 parent 5ec2986 commit a7fbde1
Showing 1 changed file with 165 additions and 0 deletions.
165 changes: 165 additions & 0 deletions content/en/blog/2023-12-12-announcing-flux-v2.2.0/index.md
@@ -0,0 +1,165 @@
---
author: hiddeco
date: 2023-12-12 16:00:00+00:00
title: Announcing Flux 2.2
description: "Flux v2.2.0: Improving your Helm experience!"
url: /blog/2023/12/flux-v2.2.0/
tags: [announcement]
resources:
- src: "**.{png,jpg}"
  title: "Image #:counter"
---

We are thrilled to announce the release of [Flux v2.2.0](https://github.com/fluxcd/flux2/releases/tag/v2.2.0)! In this post, we highlight some of the new features and improvements included in this release. The primary theme is the (many) changes made to the [helm-controller](https://fluxcd.io/flux/components/helm/).

## Important things first: API changes

### Static Objects

The **Alert v1beta3**, **Provider v1beta3**, and **HelmRepository v1beta2** of type `oci` APIs are now static objects. This means that they are no longer reconciled by a controller and will have no status field. A custom resource of these types is considered ready once it is created.

Existing objects will undergo a one-time auto-migration to remove the status field.

### OCIRepository v1beta2 and HelmChart v1beta2

The OCIRepository and HelmChart APIs were extended with the following fields:

- `.spec.verify.matchOIDCIdentity[].issuer` allows you to specify a regex pattern to match against the issuer of the certificate related to the artifact signature. It is used for OCI keyless verification.
- `.spec.verify.matchOIDCIdentity[].subject` allows you to specify a regex pattern to match against the subject of the certificate related to the artifact signature. It is used for OCI keyless verification.
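
As a rough sketch of how these fields fit together, the manifest below verifies a keyless Cosign signature against an issuer and subject pattern. The object name, namespace, artifact URL, and both regex patterns are illustrative assumptions, not values taken from this release; adjust them to the identity that actually signs your artifacts.

```yaml
# Illustrative only: name, namespace, url, and patterns are assumptions.
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: podinfo
  namespace: flux-system
spec:
  interval: 10m
  url: oci://ghcr.io/stefanprodan/manifests/podinfo
  ref:
    semver: "6.x"
  verify:
    provider: cosign
    # Each entry pairs an issuer pattern with a subject pattern.
    matchOIDCIdentity:
      - issuer: '^https://token\.actions\.githubusercontent\.com$'
        subject: '^https://github\.com/stefanprodan/podinfo.*$'
```

Single-quoted YAML strings keep the backslashes in the patterns literal, which avoids double escaping.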

### Bucket v1beta2

- A new field, `.spec.prefix`, has been added. It enables server-side filtering of files if the object's `.spec.provider` is set to `generic`, `aws`, or `gcp`.
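
A minimal sketch of a Bucket using the new field could look like this. The bucket name, endpoint, prefix, and Secret name are hypothetical placeholders.

```yaml
# Illustrative only: bucketName, endpoint, prefix, and secretRef are assumptions.
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: Bucket
metadata:
  name: app-manifests
  namespace: flux-system
spec:
  interval: 10m
  provider: generic
  bucketName: example-bucket
  endpoint: minio.example.com
  # Only objects whose key starts with this prefix are listed and fetched.
  prefix: apps/production/
  secretRef:
    name: bucket-credentials
```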

### HelmRepository v1beta2 and ImageRepository v1beta2

- A new boolean field, `.spec.insecure`, has been introduced which allows connecting to a non-TLS HTTP container registry. For the HelmRepository API, it is only considered if the object's `.spec.type` is set to `oci`.
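
For example, assuming an in-cluster registry served over plain HTTP (the registry host, object names, and image path below are made up for illustration), the field could be set as follows:

```yaml
# Illustrative only: registry host, names, and image path are assumptions.
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: local-charts
  namespace: flux-system
spec:
  type: oci # .spec.insecure is only considered for OCI Helm repositories
  interval: 10m
  url: oci://registry.internal.example:5000/charts
  insecure: true
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: app
  namespace: flux-system
spec:
  interval: 10m
  image: registry.internal.example:5000/app
  insecure: true
```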

### HelmRelease v2beta2

The API promotion from `v2beta1` to `v2beta2` is backwards compatible, and the controller will continue to reconcile HelmRelease resources of the `v2beta1` API without requiring any changes. However, making use of the new features requires upgrading the API version.

- A new `.spec.driftDetection` field has been introduced to configure drift detection and correction on a per-release basis.
- A new `.spec.test.filters` field has been introduced to selectively run a subset of Helm tests.
- The controller now offers proper integration [with `kstatus`](https://github.com/kubernetes-sigs/cli-utils/blob/master/pkg/kstatus/README.md) and sets `Reconciling` and `Stalled` conditions.
- The `.spec.maxHistory` default value has been lowered from `10` to `5` to increase the controller's performance.
- A history of metadata from Helm releases up to the previous successful release is now available in the `.status.history` field. This includes any Helm test results when enabled.
- The `.patchesStrategicMerge` and `.patchesJson6902` Kustomize post-rendering fields have been deprecated in favor of `.patches`.
- A `.status.lastAttemptedConfigDigest` field has been introduced to track the last attempted configuration digest using a hash of the composed values.
- A `.status.lastAttemptedReleaseAction` field has been introduced to accurately determine the active remediation strategy.
- The `.status.lastHandledForceAt` and `.status.lastHandledResetAt` fields have been introduced to track the last time a force upgrade or reset was handled. This is to accommodate newly introduced annotations to force upgrades and resets.
- The `.status.lastAppliedRevision` and `.status.lastReleaseRevision` fields have been deprecated in favor of `.status.history`.
- The `.status.lastAttemptedValuesChecksum` has been deprecated in favor of `.status.lastAttemptedConfigDigest`.
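
To make a few of the new spec fields concrete, here is a hedged fragment of a HelmRelease `spec`. The test name, patch target, and label are hypothetical; only the field names come from the list above.

```yaml
# Fragment of a HelmRelease spec. Test name, target, and label are assumptions.
spec:
  maxHistory: 5 # the new default, shown explicitly here
  test:
    enable: true
    filters:
      # Run only the named test; set exclude: true to skip it instead.
      - name: podinfo-test-connection
        exclude: false
  postRenderers:
    - kustomize:
        # .patches replaces the deprecated patchesStrategicMerge
        # and patchesJson6902 fields.
        patches:
          - target:
              kind: Deployment
              name: podinfo
            patch: |
              - op: add
                path: /metadata/labels/environment
                value: production
```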

Although the `v2beta1` API is still supported, it is recommended to upgrade to the `v2beta2` API as soon as possible. The `v2beta1` API will be removed after 6 months.

To upgrade to the `v2beta2` API, update the `apiVersion` field of your HelmRelease resources to `helm.toolkit.fluxcd.io/v2beta2` after updating the controller and Custom Resource Definitions.
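
In practice the change is a single line. The sketch below shows a minimal upgraded manifest; the release name, namespace, chart, and source reference are placeholders.

```yaml
# Illustrative only: names, chart, and sourceRef are assumptions.
apiVersion: helm.toolkit.fluxcd.io/v2beta2 # previously helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: podinfo
  namespace: demo
spec:
  interval: 10m
  chart:
    spec:
      chart: podinfo
      sourceRef:
        kind: HelmRepository
        name: podinfo
```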

For more in-depth information about these API changes, simply continue reading the highlights below, or refer to the updated [HelmRelease v2beta2 specification](https://fluxcd.io/flux/components/helm/helmreleases/) and [controller changelog](https://github.com/fluxcd/helm-controller/blob/v0.37.0/CHANGELOG.md).

## Enhanced `HelmRelease` reconciliation model

The reconciliation model of the helm-controller has been rewritten so that it can better determine the state a Helm release is in, and then decide which Helm action should be performed to reach the desired state.

Effectively, this means that the controller is now capable of continuing where it left off, and to run [Helm tests](https://fluxcd.io/flux/components/helm/helmreleases/#test-configuration) as soon as they are enabled without a Helm upgrade having to take place first.

In addition, it now takes note of releases _while they are happening_, instead of making observations _afterwards_. This ensures that when performing a rollback remediation, the version we revert to is always exactly the same as the one previously released by the controller. In cases where it is uncertain about state, it will always decide to (reattempt to) perform a Helm upgrade.

This also allows it to count, with certainty, only release attempts that caused a mutation to the Helm storage as failures towards retry attempts. This improves continuity, because the controller retries instantly instead of remediating first.

## Improved observability of Helm releases

An additional thing the enhanced reconciliation model allowed us to work on is making improvements to how we report state back to you, as a user.

The improvements range from the introduction of `Reconciling` and `Stalled` Condition types to become [`kstatus` compatible](https://github.com/kubernetes-sigs/cli-utils/tree/master/pkg/kstatus), to an enriched overview of Helm releases up to the previous successful release in the Status, and more informative Kubernetes Event and Condition messages.

```console
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal HelmChartCreated 25s helm-controller Created HelmChart/demo/demo-podinfo with SourceRef 'HelmRepository/demo/podinfo'
Normal InstallSucceeded 20s helm-controller Helm install succeeded for release demo/podinfo.v1 with chart [email protected]
Normal TestSucceeded 12s helm-controller Helm test succeeded for release demo/podinfo.v1 with chart [email protected]: 3 test hooks completed successfully
```

For more details around these changes, refer to the [Status section](https://fluxcd.io/flux/components/helm/helmreleases/#helmrelease-status) in the HelmRelease v2beta2 specification.
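
To give a feel for the enriched status, the snippet below sketches what a single `.status.history` entry might look like. The field names follow the v2beta2 specification as we understand it; every value (including the test hook name, timestamps, and the `<...>` placeholders) is made up for illustration, so inspect a real object with `kubectl get helmrelease <release> -o yaml` for the authoritative shape.

```yaml
# Illustrative only: all values are placeholders, not real output.
status:
  history:
    - name: podinfo
      namespace: demo
      version: 1
      status: deployed
      chartName: podinfo
      chartVersion: "<chart-version>"
      configDigest: "sha256:<digest>"
      digest: "sha256:<digest>"
      firstDeployed: "2023-12-12T16:00:00Z"
      lastDeployed: "2023-12-12T16:00:00Z"
      # Present only when Helm tests are enabled.
      testHooks:
        podinfo-test-hook:
          phase: Succeeded
```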

## Recovery from `pending-*` Helm release state

A much-reported issue was the helm-controller being unable to recover from `another operation (install/upgrade/rollback) is in progress` errors, which could occur when the controller Pod was forcefully killed.

From this release on, the controller will recover from such errors by unlocking the Helm release from a `pending-*` state to a `failed` state, and retrying it with a Helm upgrade.

## Helm Release drift detection and correction

Around April we launched cluster state drift detection and correction for Helm releases as an experimental feature. At that time, it could only be enabled using a controller global feature flag, making it impractical to use at scale due to the wide variability in charts and unpredictability of the effects on some Helm charts.

For charts with lifecycle hooks, or cluster resources like Horizontal/Vertical Pod Autoscalers for which controllers may write updates back into their own spec, those updates would always be considered as drift by the helm-controller unless the resource would be ignored in full.

To address the above pain points, Helm drift detection can now be enabled on the `HelmRelease` itself, while also allowing you to ignore specific fields using [JSON Pointers](https://datatracker.ietf.org/doc/html/rfc6901):

```yaml
spec:
  driftDetection:
    mode: enabled
    ignore:
      - paths: ["/spec/replicas"]
        target:
          kind: Deployment
Using these settings, any drift detected will now be corrected by recreating and patching the Kubernetes objects (instead of doing a Helm upgrade) while changes to the `.spec.replicas` fields for Deployments will be ignored.

For more information, refer to the [drift detection section](https://fluxcd.io/flux/components/helm/helmreleases/#drift-detection) in the HelmRelease v2beta2 specification.
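
If you would rather observe drift before letting the controller correct it, the specification also describes a `warn` mode. The fragment below is a sketch under that assumption; the target name is hypothetical, and narrowing the target by `name` limits the ignore rule to a single object.

```yaml
# Fragment of a HelmRelease spec. The target name is an assumption.
spec:
  driftDetection:
    # warn reports detected drift without correcting it.
    mode: warn
    ignore:
      - paths: ["/spec/replicas"]
        target:
          kind: Deployment
          name: podinfo
```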

## Forcing and retrying Helm releases

Another much-reported issue was the impractical steps one had to take to recover from "retries exhausted" errors. To instruct the helm-controller to retry installing or upgrading a Helm release when it is out of retries, you can now either:

- Instruct it to reset the failure counts, allowing it to retry the number of times as configured in the remediation strategy

```shell
flux reconcile helmrelease <release> --reset
```

- Instruct it to force a one-off Helm install or upgrade

```shell
flux reconcile helmrelease <release> --force
```

For in-depth explanations about these new command options, refer to the ["resetting remediation retries"](https://fluxcd.io/flux/components/helm/helmreleases/#resetting-remediation-retries) and ["forcing a release"](https://fluxcd.io/flux/components/helm/helmreleases/#forcing-a-release) sections in the HelmRelease v2beta2 specification.
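
Under the hood, these options rely on the newly introduced annotations. A rough declarative equivalent is sketched below. The annotation keys are taken from the HelmRelease v2beta2 specification rather than from this post, and the token value is an arbitrary placeholder; as we understand it, the force or reset annotation only takes effect when its value equals that of `reconcile.fluxcd.io/requestedAt`.

```yaml
# Fragment of HelmRelease metadata. The token value is an arbitrary placeholder.
metadata:
  annotations:
    reconcile.fluxcd.io/requestedAt: "1702396800"
    # Force a one-off install or upgrade for this request.
    reconcile.fluxcd.io/forceAt: "1702396800"
    # Alternatively, use reconcile.fluxcd.io/resetAt with the same
    # value to reset the failure counts.
```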

## Benchmark results

To measure the real-world impact of this overhaul to the helm-controller, we have set up benchmarks that measure Mean Time To Production (MTTP). The MTTP benchmark measures the time it takes for Flux to deploy application changes into production. Below are the results of the benchmark that ran on a GitHub-hosted runner (Ubuntu, 16 cores):

| Objects | Type | Flux component | Duration | Max Memory |
|---------|---------------|----------------------|----------|------------|
| 100 | OCIRepository | source-controller | 25s | 38Mi |
| 100 | Kustomization | kustomize-controller | 27s | 32Mi |
| 100 | HelmChart | source-controller | 25s | 40Mi |
| 100 | HelmRelease | helm-controller | 31s | 140Mi |
| 500 | OCIRepository | source-controller | 45s | 65Mi |
| 500 | Kustomization | kustomize-controller | 2m2s | 72Mi |
| 500 | HelmChart | source-controller | 45s | 68Mi |
| 500 | HelmRelease | helm-controller | 2m55s | 350Mi |
| 1000 | OCIRepository | source-controller | 1m30s | 67Mi |
| 1000 | Kustomization | kustomize-controller | 4m15s | 112Mi |
| 1000 | HelmChart | source-controller | 1m30s | 110Mi |
| 1000 | HelmRelease | helm-controller | 8m2s | 620Mi |

> The benchmark uses a single application ([podinfo](https://github.com/stefanprodan/podinfo)) for all tests with intervals set to `60m`. The results may change when deploying Flux objects with a different configuration.

For more information about the benchmark setup and how you can run them on your machine, check out the [fluxcd/flux-benchmark](https://github.com/fluxcd/flux-benchmark) repository.

## Breaking changes to Kustomizations

All Flux components have been updated from Kustomize v5.0.3 to [v5.3.0](https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv5.3.0).

This update has a breaking change in Kustomize: components are now applied after generators. If you use Kustomize components or `.spec.components` in Kustomizations along with generators, then please make necessary changes before upgrading to avoid any undesirable behavior. For more information, see the relevant [Kustomize issue](https://github.com/kubernetes-sigs/kustomize/issues/5141).
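
As a hypothetical illustration of a layout that is affected, consider a `kustomization.yaml` that combines a generator with a component. The file names, component path, and ConfigMap contents are invented for this sketch.

```yaml
# Illustrative only: file names, component path, and literals are assumptions.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
configMapGenerator:
  - name: app-config
    literals:
      - LOG_LEVEL=info
components:
  # With Kustomize v5.3.0 this component is applied after the
  # generator above has run, so any patches it carries act on the
  # generated ConfigMap as well.
  - ../components/debug
```

If a component in your setup relied on running before generators, build the overlay locally with the new Kustomize version and compare the output before upgrading.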

## Other notable changes

- `flux install` has been made safer by preventing unwanted upgrades to a bootstrapped cluster.
- `flux bootstrap` now supports Gitea. To bootstrap Flux onto a cluster using Gitea as the Git provider, run `flux bootstrap gitea --repository <repo> --owner <owner>`.
