-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
approx_percentile: Does not enforce constant percentage between input rows within an aggregation #24592
Comments
Java behavior is correct and we have seen people use non-const second arg. |
It's weird behavior to just use whichever value happens to come last for the whole aggregation. Even allowing different values only for different groups is a bit surprising, that each group is effectively running a different aggregation. IMO throwing an exception is the better thing to do. |
Well we can error out if you get different values for different rows but it doesn't have to be constant. I guess that's what the title is saying? |
Ah okay, so i think we agree. It doesn't have to be a literal value, but it should be the same value for all rows. |
@kaikalur Apologies for being unclear, I've updated with more examples and the title to be (hopefully more clear) |
In Meta production environment, we noticed failures in the aggregate function approx_percentile with Presto-cpp and not Presto-Java. Specifically, certain queries in Presto-cpp throws an exception like
VeloxUserError: base->equalValueAt(base, baseRow, baseFirstRow) Percentile argument must be constant for all input rows
.According to the Presto documentation, this is actually the desired behavior.
However, in Presto-Java, it seems like the last percentile value amongst the rows is respected and all of the previous percentiles are ignored. This goes against the intended behavior described by the documentation.
Presto-Java Behavior
Presto-C++ Behavior
The behavior in Java actually seems unintentional and may produce results that the user is not knowingly expecting. We are raising this issue to the community to solicit ideas for how we can move forward.
Thanks!
cc @spershin @rschlussel @kgpai @amitkdutta @Yuhta @kagamiori
The text was updated successfully, but these errors were encountered: