Is your feature request related to a problem? Please describe.
Thrust/CUB have tests that explicitly exercise "CUDA Dynamic Parallelism (CDP)", i.e., launching a kernel from within another kernel.
These tests were added at a time when cudaDeviceSynchronize() from device code was supported (i.e., CDPv1).
However, CDPv1 has been replaced by CDPv2, which no longer supports device-side synchronization. Furthermore, newer architectures (e.g., sm90) no longer support CDPv1. As a result, Thrust/CUB implemented cmake logic to filter out architectures that don't support CDPv1, which has led to issues.
With the introduction of CDPv2, CUB/Thrust were refactored to no longer rely on device-side synchronization.
CUB was refactored to remove synchronization and use tail kernel launches instead.
Synchronization could not be removed from Thrust algorithms, so Thrust was updated such that thrust::device in device code falls back to thrust::seq.
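For illustration, here is a minimal sketch of the tail-launch pattern described above. It is not taken from the CUB or Thrust sources: the kernel names (`fill`, `finalize`, `parent`), the buffer size, and the build line are assumptions made for this example. It relies on `cudaStreamTailLaunch`, the CDPv2 named stream whose work starts only after the parent grid and the children it launched have completed.

```cuda
// Assumed build line (not from the issue):
//   nvcc -rdc=true cdp2_sketch.cu -o cdp2_sketch -lcudadevrt
#include <cstdio>
#include <cuda_runtime.h>
#include <thrust/execution_policy.h>
#include <thrust/reduce.h>

// Child grid (hypothetical): writes 1 into every element.
__global__ void fill(int* data, int n)
{
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n)
  {
    data[i] = 1;
  }
}

// Tail grid (hypothetical): consumes the child's output.
// Per the issue, thrust::device used in device code falls back to
// thrust::seq, so this reduction runs sequentially in one thread.
__global__ void finalize(const int* data, int n, int* result)
{
  *result = thrust::reduce(thrust::device, data, data + n, 0);
}

// Parent grid: launches the child, then schedules the dependent work.
__global__ void parent(int* data, int n, int* result)
{
  fill<<<(n + 127) / 128, 128>>>(data, n);

  // CDPv1 would have called cudaDeviceSynchronize() here and then read
  // data[] directly. Under CDPv2 the dependent work is instead queued
  // as a tail launch, which runs after the parent grid and its child
  // grids have finished.
  finalize<<<1, 1, 0, cudaStreamTailLaunch>>>(data, n, result);
}

int main()
{
  const int n = 256;
  int* data   = nullptr;
  int* result = nullptr;
  cudaMalloc(&data, n * sizeof(int));
  cudaMalloc(&result, sizeof(int));

  parent<<<1, 1>>>(data, n, result);
  cudaDeviceSynchronize(); // host-side synchronization is still supported

  int sum = 0;
  cudaMemcpy(&sum, result, sizeof(int), cudaMemcpyDeviceToHost);
  std::printf("sum = %d\n", sum);

  cudaFree(data);
  cudaFree(result);
  return sum == n ? 0 : 1;
}
```

Because nothing in this pattern calls `cudaDeviceSynchronize()` from device code, it does not depend on CDPv1 support in the target architecture.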
As a result, we no longer have tests that exercise CDPv1. Therefore, we no longer need to filter out the architecture list.
Describe the solution you'd like
Remove the architecture filtering logic from Thrust/CUB cmake files.
Thrust/CUB tests should all work, but there may be lingering issues to clean up as a result of removing the filtering logic.
Tasks
Remove CDP filtering/target logic from CUB cmake
Remove CDP filtering/target logic from Thrust cmake
Describe alternatives you've considered
No response
Additional context
No response
Area
General CCCL