forked from elastic/elasticsearch
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Update data stream lifecycle telemetry to track global retention (elastic#112451)

Currently, the data stream lifecycle telemetry has the following structure:

```
{
  ....
  "data_lifecycle" : {
    "available": true,
    "enabled": true,
    "count": 0,
    "default_rollover_used": true,
    "retention": {
      "minimum_millis": 0,
      "maximum_millis": 0,
      "average_millis": 0.0
    }
  }
  ....
```

In the snippet above you can see that we track:

- The number of data streams managed by the data stream lifecycle, via `count`
- Whether the default rollover has been overridden, via `default_rollover_used`
- The min, max and average of the `data_retention` configured on a data stream level.

In this PR we propose the following extension:

```
....
"data_lifecycle" : {
  "available": true,
  "enabled": true,
  "count": 0,
  "default_rollover_used": true,
  "effective_retention": { #elastic/dev#2537
    "retained_data_streams": 5,
    "minimum_millis": 0, # Only if retained data streams > 1
    "maximum_millis": 0,
    "average_millis": 0.0
  },
  "data_retention": {
    "configured_data_streams": 5,
    "minimum_millis": 0, # Only if retained data streams > 1
    "maximum_millis": 0,
    "average_millis": 0.0
  },
  "global_retention": {
    "default": {
      "defined": true/false,
      "affected_data_streams": 0,
      "millis": 0
    },
    "max": {
      "defined": true/false,
      "affected_data_streams": 0,
      "millis": 0
    }
  }
}
```

With this extension we are tracking:

- The number of data streams managed by the data stream lifecycle, via `count`
- Whether the default rollover has been overridden, via `default_rollover_used`
- The min, max and average of the `data_retention` configured on a data stream level, and the number of data streams that have it configured. We add the min, max and avg only if there are data streams with a data retention configuration, to avoid skewing the stats in a dashboard.
- The min, max and average of the `effective_retention`, and the number of data streams that are retained. We add the min, max and avg only if there are retained data streams, to avoid skewing the stats in a dashboard.
- Global retention stats: whether each global retention is defined, the number of affected data streams, and the actual value.

The above metrics allow us to answer questions like:

- How many data streams are affected by global retention?
- How big is the difference between the longest data retention and the max global retention?
- How much does the effective retention diverge from the data retention? This shows the impact of the global retention.
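The rule that min, max and average are only reported when at least one data stream contributes a value can be sketched as follows. This is a minimal illustration in Python, not the code from this commit: the function name `retention_stats`, its parameters, and the use of `None` for "no retention configured" are all assumptions made for the example.

```python
from typing import Optional


def retention_stats(retentions_millis: list[Optional[int]], count_key: str) -> dict:
    """Summarise retention values (hypothetical helper, not the Elasticsearch code).

    `None` entries stand for data streams without a retention value.
    """
    values = [r for r in retentions_millis if r is not None]
    stats: dict = {count_key: len(values)}
    if values:
        # Only emit min/max/avg when there is something to aggregate, so that
        # placeholder zeros do not skew dashboards built on the telemetry.
        stats["minimum_millis"] = min(values)
        stats["maximum_millis"] = max(values)
        stats["average_millis"] = sum(values) / len(values)
    return stats


# Two of three data streams have data retention configured.
print(retention_stats([None, 1000, 3000], "configured_data_streams"))
# No data stream has it configured: only the counter is reported.
print(retention_stats([None], "configured_data_streams"))
```

The same helper could be called with `"retained_data_streams"` as the counter name to produce the `effective_retention` section.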
Showing 10 changed files with 560 additions and 187 deletions.
@@ -0,0 +1,29 @@
pr: 112451
summary: Update data stream lifecycle telemetry to track global retention
area: Data streams
type: breaking
issues: []
breaking:
  title: Update data stream lifecycle telemetry to track global retention
  area: REST API
  details: |-
    In this release we introduced global retention settings that apply to data streams fulfilling the following criteria:
    - a data stream managed by the data stream lifecycle,
    - a data stream that is not an internal data stream.
    As a result, we defined different types of retention:
    - **data retention**: the retention configured on data stream level by the data stream user or owner
    - **default global retention:** the retention configured by an admin on a cluster level and applied to any
    data stream that doesn't have data retention and fulfils the criteria.
    - **max global retention:** the retention configured by an admin to guard against having long retention periods.
    Any data stream that fulfils the criteria will adhere to the data retention unless it exceeds the max retention,
    in which case the max global retention applies.
    - **effective retention:** the retention that applies to a data stream that fulfils the criteria at a given moment
    in time. It takes into consideration all the retention types above and resolves them to the retention that will take effect.
    Considering the above changes, having a field named `retention` in the usage API was confusing. For this reason, we
    renamed it to `data_retention` and added telemetry about the other configurations too.
  impact: Users that use the field `data_lifecycle.retention` should use `data_lifecycle.data_retention` instead.
  notable: false
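The retention types described in the changelog entry can be read as a small resolution rule. The sketch below illustrates that rule in Python under stated assumptions: the function `effective_retention`, its parameter names, and the `fulfils_criteria` flag are hypothetical, and treating a missing retention as unbounded (and therefore capped by the max global retention) is an assumption of this example rather than something the entry states.

```python
from typing import Optional


def effective_retention(
    data_retention: Optional[int],
    default_global: Optional[int],
    max_global: Optional[int],
    fulfils_criteria: bool = True,
) -> Optional[int]:
    """Resolve the retention (in millis) that takes effect; None means no limit.

    Hypothetical helper for illustration only.
    """
    if not fulfils_criteria:
        # Global retention does not apply, e.g. to internal data streams.
        return data_retention
    # The default global retention fills in when no data retention is configured.
    retention = data_retention if data_retention is not None else default_global
    # Assumption: an unset retention counts as exceeding the max global retention.
    if max_global is not None and (retention is None or retention > max_global):
        return max_global
    return retention


print(effective_retention(None, 7, 30))  # default global retention applies
print(effective_retention(60, 7, 30))    # capped by max global retention
print(effective_retention(10, 7, 30))    # data retention is within the max
```

Comparing the result with `data_retention` per data stream would give the "affected data streams" counters for the default and max global retention.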