Some customers are reporting that their p50, p75 and p95 values are non-deterministic. ClickHouse provides the quantile and quantileDeterministic aggregate functions for this case. Both use reservoir sampling to compute approximate quantiles. The only difference is that quantile uses a random number generator to pick the samples, which makes the results non-deterministic, while quantileDeterministic requires the caller to pass in a determinator, and that determinator is used to pick the samples instead.
The main task is to pick a suitable determinator that the customer can pass in. The lack of determinism comes from the random seed, so a non-random determinator should give the result we want. The ClickHouse documentation can be useful in this context. Specifically, this section of the documentation is a bit unclear:
If the same determinator value occurs too often, the function works incorrectly.
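To make the difference concrete, here is a minimal Python sketch of the two sampling strategies. It is not ClickHouse's implementation: the function names, the reservoir size of 100, the SHA-256 based hash and the "keep the rows with the smallest hash" rule are all assumptions chosen for illustration. It only shows why a hash of a determinator removes the randomness, and why a determinator that repeats too often leaves the sampler with nothing to distinguish rows by.

```python
import hashlib
import random


def hash01(determinator) -> float:
    """Map a determinator value to a repeatable float in [0, 1). (Illustrative hash.)"""
    digest = hashlib.sha256(repr(determinator).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def quantile_of(sample, level):
    """Nearest-rank quantile of an in-memory, non-empty sample."""
    ordered = sorted(sample)
    return ordered[min(len(ordered) - 1, int(level * len(ordered)))]


def quantile_random(values, level, size=100, rng=None):
    """Classic reservoir sampling: an RNG decides which values stay in the reservoir."""
    rng = rng or random.Random()
    reservoir = []
    for i, value in enumerate(values):
        if i < size:
            reservoir.append(value)
        else:
            j = rng.randrange(i + 1)  # a different draw on every run unless seeded
            if j < size:
                reservoir[j] = value
    return quantile_of(reservoir, level)


def quantile_deterministic(rows, level, size=100):
    """Hypothetical deterministic variant: rows are (value, determinator) pairs.

    The sample is the `size` rows whose determinator hashes are smallest, so the
    same rows are chosen on every run, whatever order they arrive in, as long as
    the determinators are (mostly) distinct.
    """
    kept = sorted(rows, key=lambda row: hash01(row[1]))[:size]
    return quantile_of([value for value, _ in kept], level)
```

In this sketch, passing a distinct determinator per row (for example a unique row identifier, which is an assumption about the available columns) gives the same p95 even after the rows are shuffled. Passing a constant determinator makes every hash tie, so the sample is simply the first 100 rows seen and the result depends on row order again, which is one plausible reading of the documentation's warning.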
Expected Result
The fluctuation in the p50, p75 and p95 values does not occur.
Actual Result
p50, p75 and p95 values are non-deterministic.
Environment
Snuba SaaS