Workaround groupby aggregate thrust::copy_if overflow #12079
Codecov Report
Base: 87.47% // Head: 88.10% // Increases project coverage by +0.62% (per the coverage diff below).
Coverage diff, branch-22.12 vs. #12079:
            branch-22.12   #12079    +/-
Coverage    87.47%         88.10%    +0.62%
Files       133            135       +2
Lines       21826          22072     +246
Hits        19093          19447     +354
Misses      2733           2625      -108
This LGTM, I will be running this today (building things) but I do not think my run is a blocker. I'll just report here or on the original issue.
Update: I ran with this patch on an 80GB GPU, and while on the aggregate I see a |
OK, @davidwendt, by itself this seems to be working fine on the 80GB GPU. I believe I am hitting a second issue, unrelated to this one, in the join that comes later. I see issues with |
Could this bug manifest in other places where `thrust::copy_if` is called?
Yes, I found a few other places where this could be a problem. I will open PRs for those as well.
Does it make sense for someone to start working on the underlying fix as well (as you linked NVIDIA/cccl#747)?
Do we have a test exercising this overflow issue?
A test would not be practical because it would require too much memory for the CI machines. Even with the fix, this runs out of memory on my 48GB GPU. This is really a thrust bug, so I feel the test should go there.
@gpucibot merge
Workaround in nvtext's wordpiece-tokenizer due to a limitation in `thrust::copy_if`, which fails if the input iterator spans more than int-max.
Found existing thrust issue: https://github.com/NVIDIA/thrust/issues/1302
This calls `thrust::copy_if` in chunks if the iterator can span more than int-max.
Found while working on #12079
Authors:
- David Wendt (https://github.com/davidwendt)
Approvers:
- Nghia Truong (https://github.com/ttnghia)
- Vyas Ramasubramani (https://github.com/vyasr)
URL: #12168
Workaround in json's get_tree_representation due to a limitation in `thrust::copy_if`, which fails if the input iterator spans more than int-max.
Found existing thrust issue: https://github.com/NVIDIA/thrust/issues/1302
This calls `thrust::copy_if` in chunks if the iterator can span more than int-max.
Found while working on #12079
Authors:
- David Wendt (https://github.com/davidwendt)
Approvers:
- Elias Stehle (https://github.com/elstehle)
- Mike Wilson (https://github.com/hyperbolic2346)
URL: #12190
This extracts `thrust_copy_if` out of `json_tree.cu` and puts it into `cudf/detail/utilities/algorithm.cuh` under the new name `cudf::detail::copy_if_safe`. Such a utility is useful not just in `json_tree.cu` but potentially in many other places. The next immediate use case will be implementing `from_json` in NVIDIA/spark-rapids-jni#844.
This also changes the work in #12079 to adopt the new function `cudf::detail::copy_if_safe`.
Authors:
- Nghia Truong (https://github.com/ttnghia)
Approvers:
- Yunsong Wang (https://github.com/PointKernel)
- David Wendt (https://github.com/davidwendt)
URL: #12455
Description
Workaround for a limitation in `thrust::copy_if`, which fails if the input iterator spans more than int-max.
`thrust::copy_if` hardcodes the iterator distance type to be an int: https://github.com/NVIDIA/thrust/blob/dbd144ed543b60c4ff9d456edd19869e82fe8873/thrust/system/cuda/detail/copy_if.h#L699-L708
Found existing thrust issue: NVIDIA/cccl#747
This calls `copy_if` in chunks if the iterator can span more than int-max.
Closes #12058
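For readers unfamiliar with the chunking approach, here is a minimal host-side sketch of the idea. It is not the code from this PR: it uses `std::copy_if` from the C++ standard library in place of `thrust::copy_if`, and the names `chunked_copy_if` and `max_chunk` are hypothetical. The point is only that each call sees a range no longer than an int can index, while the output iterator is carried from one chunk to the next.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <limits>
#include <vector>

// Hypothetical helper (an illustration, not the cudf implementation).
// Applies copy_if to successive chunks of at most `max_chunk` elements so
// that no single call receives a range longer than int-max. `max_chunk` is
// a parameter only so the chunking can be exercised with small inputs.
template <typename InputIt, typename OutputIt, typename Predicate>
OutputIt chunked_copy_if(InputIt first,
                         InputIt last,
                         OutputIt result,
                         Predicate pred,
                         std::int64_t max_chunk = std::numeric_limits<int>::max())
{
  assert(max_chunk > 0);  // a non-positive chunk size would never make progress

  // Track the remaining length in 64 bits so the total may exceed int-max.
  auto remaining = static_cast<std::int64_t>(std::distance(first, last));
  while (remaining > 0) {
    auto const count     = std::min(remaining, max_chunk);
    auto const chunk_end = std::next(first, count);
    // Each call handles one chunk; the returned iterator marks where the
    // next chunk's output should begin.
    result = std::copy_if(first, chunk_end, result, pred);
    first  = chunk_end;
    remaining -= count;
  }
  return result;
}
```

Calling it with a small `max_chunk` (for example 3 on a 10-element vector) exercises several chunks and should give the same result as a single `std::copy_if` over the whole range, since the chunks are processed in order and the predicate is applied per element.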