-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
query: Make group a/b minPenalty to be configurable #8061
base: main
Are you sure you want to change the base?
Conversation
Change-Id: If991daccf49d03cc116a7e0413e023ea4cb45643 Signed-off-by: Magiceses <[email protected]>
We should probably pipe it through commandline flags at least! |
The current patches are just a simple way of implementation. I'm not sure if it will have other effects to modify this value. If you think that this parameter to be configurable is reasonable, I can pipe it through commandline flags. |
Yeah j think it would be cool to be able to configure this and initial penalty through command line options! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good luck trying to understand what the "correct" value is and to explain to regular users who aren't intimately familiar with the inner workings 😄 please focus instead on how to fix the dedup algorithm. I think a proposal is needed here instead which details everything and then we can make the correct decision
I have lately wondered why we don't dedup on chunk level. That should sort out restarts but not scrape misses bit could be super easy to do. |
Change-Id: If991daccf49d03cc116a7e0413e023ea4cb45643
Changes
What happened
When the THANOS QURY connects multiple Prometheus instances, if the pull time of the two Prometheus instances is too close, it may cause the data to fall, such as:
At this time, when you query the metrics through the Thanos Query's dedup param interface, the data of the Counter data type will decline (this situation is not reset to the counter), just like the following:
This situation is incorrect for the counter type, which will cause the Rate func calculate abnormality, just like:
Solution
I noticed that the logic that affects this part is in the iter.go file.
Therefore, can we force the THANOS Query by modifying the penalty time of group A/B to solve this problem that we can skip two particularly similar points (for example, the collection time of these two points is only 1s or less).
After testing, after modifying this penalty time from 0 to 999, the fall of data will not appear. Just like following:
Verification
After modifying, THANOS Query can run normally, but I am not sure if there will be other scenarios that will affect it, or there are other considerations for this penalty value, must to be set to 0?
Can anyone help it or other suggestions?