Workaround thrust-copy-if limit in json get_tree_representation #12190
Codecov Report
Base: 88.37% // Head: 88.18% // Decreases project coverage by
Additional details and impacted files
@@            Coverage Diff             @@
##        branch-23.02   #12190    +/-  ##
=========================================
- Coverage      88.37%   88.18%   -0.19%
=========================================
  Files            137      137
  Lines          22657    22657
=========================================
- Hits           20022    19981      -41
- Misses          2635     2676      +41
Thanks for the workaround. This looks good. There should be effectively "zero" overhead when we're running on fewer than INT_MAX items.
nit: would we want to put this somewhere common for other places to use, where we could run into problem sizes that exceed INT_MAX (e.g., the wordpiece tokenizer)?
I was considering that. I think the other 2 places where this occurs will be fixed differently in the future (so they will not require the copy-if).
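For reference, the "zero overhead" remark can be checked with a little arithmetic. The sketch below is not part of the PR; `num_copy_if_calls` is a hypothetical helper that only counts how many times a loop with chunks of at most `std::numeric_limits<int>::max()` elements would invoke `thrust::copy_if` for a given input size.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not in the PR): number of copy_if invocations made by a
// loop that processes n elements in chunks of at most INT_MAX elements.
constexpr std::uint64_t num_copy_if_calls(std::uint64_t n)
{
  constexpr std::uint64_t int_max = std::numeric_limits<int>::max();
  // Ceiling division; an empty range never enters the loop, so it yields 0.
  return (n + int_max - 1) / int_max;
}
```

Any input of up to INT_MAX elements maps to a single call, which is the same work as calling `thrust::copy_if` directly; only larger inputs pay for extra launches.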
Small note on a comment typo, but otherwise looks good.
 *
 * Workaround for thrust::copy_if bug (https://github.com/NVIDIA/thrust/issues/1302)
 * where it cannot iterate over int-max values `distance(first,last) > int-max`
 * This calls thrust::copy_if in 2B chunks instead.
2G
OutputIterator thrust_copy_if(rmm::exec_policy policy,
                              InputIterator first,
                              InputIterator last,
                              StencilIterator stencil,
                              OutputIterator result,
                              Predicate pred)
{
  auto const copy_size = std::min(static_cast<std::size_t>(std::distance(first, last)),
                                  static_cast<std::size_t>(std::numeric_limits<int>::max()));

  auto itr = first;
  while (itr != last) {
    auto const copy_end =
      static_cast<std::size_t>(std::distance(itr, last)) <= copy_size ? last : itr + copy_size;
    result = thrust::copy_if(policy, itr, copy_end, stencil, result, pred);
    stencil += std::distance(itr, copy_end);
    itr = copy_end;
  }
  return result;
}

template <typename InputIterator, typename OutputIterator, typename Predicate>
OutputIterator thrust_copy_if(rmm::exec_policy policy,
                              InputIterator first,
                              InputIterator last,
                              OutputIterator result,
                              Predicate pred)
{
  return thrust_copy_if(policy, first, last, first, result, pred);
Wow, I assume that you're going to extract this out into a common utility file and call it across many places (in a future follow-up PR).
I'll look into moving this in a follow-up PR as suggested.
@gpucibot merge
Description
Workaround in json's get_tree_representation for a limitation in thrust::copy_if, which fails if the input iterator spans more than int-max elements. Found existing thrust issue: NVIDIA/cccl#747
This calls thrust::copy_if in chunks if the iterator can span more than int-max elements.
Found while working on #12079
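To illustrate the chunking idea without a GPU, here is a minimal host-only sketch. It is not the code in this PR: `stencil_copy_if` is a hypothetical stand-in for the stencil form of `thrust::copy_if`, and `chunked_copy_if` takes a hypothetical `chunk_size` parameter (the PR uses int-max) so the behavior can be exercised with small inputs. It assumes random-access iterators.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical host stand-in for the stencil form of thrust::copy_if:
// copies *first to result whenever pred(*stencil) is true.
template <typename InputIt, typename StencilIt, typename OutputIt, typename Pred>
OutputIt stencil_copy_if(InputIt first, InputIt last, StencilIt stencil, OutputIt result, Pred pred)
{
  for (; first != last; ++first, ++stencil) {
    if (pred(*stencil)) { *result++ = *first; }
  }
  return result;
}

// Hypothetical helper: processes [first, last) in pieces of at most chunk_size
// elements, advancing the stencil in step with the input and carrying the
// output position from one piece to the next.
template <typename InputIt, typename StencilIt, typename OutputIt, typename Pred>
OutputIt chunked_copy_if(InputIt first,
                         InputIt last,
                         StencilIt stencil,
                         OutputIt result,
                         Pred pred,
                         std::size_t chunk_size)
{
  using diff_t = typename std::iterator_traits<InputIt>::difference_type;
  assert(chunk_size > 0);
  auto itr = first;
  while (itr != last) {
    auto const remaining = static_cast<std::size_t>(std::distance(itr, last));
    auto const step      = static_cast<diff_t>(std::min(remaining, chunk_size));
    auto const copy_end  = itr + step;
    result = stencil_copy_if(itr, copy_end, stencil, result, pred);
    stencil += step;
    itr = copy_end;
  }
  return result;
}
```

Passing the input range itself as the stencil gives the plain predicate form. For example, calling `chunked_copy_if(in.begin(), in.end(), in.begin(), std::back_inserter(out), is_even, 3)` on a seven-element vector runs three pieces of sizes 3, 3 and 1, and should give the same output as one unchunked pass.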