thrust::inclusive_scan produces incorrect results for certain vector sizes: the scan appears to stop partway through, e.g. [1,2,3,x,x,...,x], where x is the value the result vector was initialized with.
After some experiments, it seems that -rdc=true causes this behaviour. Switching from Debug to Release gives exactly the same results.
After some testing, I got the same results with 12.5 Patch 1 and 12.3 Patch 2 (I didn't check 12.4).
The same behaviour can also be triggered without -rdc=true by using -ewp instead.
The issue does not appear if I set SM>=60, compute>=60. If left empty, these default to SM_52, compute_52.
The issue is also present for SM_53, compute_53.
I've tested all configurations of 52<=SM<=89, 52<=compute<=89.
Is this a duplicate?
Type of Bug
Silent Failure
Component
Thrust
Describe the bug
thrust::inclusive_scan produces incorrect results for certain vector sizes: the scan appears to stop partway through, e.g. [1,2,3,x,x,...,x], where x is the value the result vector was initialized with.
After some experiments, it seems that -rdc=true causes this behaviour. Switching from Debug to Release gives exactly the same results.
Visual Studio Community 2022, version 17.11.4
CUDA 12.6 Update 1
How to Reproduce
Log generated on my system:
Expected behavior
inclusive_scan should produce correct results regardless of the GenerateRelocatableDeviceCode (-rdc) setting.
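To pin down where a scan "stops midway", the device result can be compared against a host-side reference. The following is a minimal sketch and is not taken from the linked repository: first_scan_mismatch is a hypothetical helper name, and it assumes int elements and that the scan output has already been copied back into a std::vector on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Thrust or the linked repository).
// Returns the index of the first element where `result` differs from a
// host-computed inclusive scan of `input`, or input.size() if they agree.
std::size_t first_scan_mismatch(const std::vector<int>& input,
                                const std::vector<int>& result)
{
    assert(input.size() == result.size());

    // Host reference: expected[i] = input[0] + ... + input[i].
    std::vector<int> expected(input.size());
    std::partial_sum(input.begin(), input.end(), expected.begin());

    for (std::size_t i = 0; i < expected.size(); ++i) {
        if (expected[i] != result[i]) {
            return i;  // first position where the scan output is wrong
        }
    }
    return expected.size();  // no mismatch found
}
```

For an all-ones input and a result of the form [1,2,3,x,x,...,x], this returns 3, the index at which the scan stopped. With Thrust, the output of a thrust::device_vector could be copied into a host vector with thrust::copy before calling the helper.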
Reproduction link
https://github.com/Olli1080/Cuda-Bug.git
Operating System
Windows 11 Pro 10.0.22631
nvidia-smi output
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.94                 Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090      WDDM  |   00000000:01:00.0  On |                  Off |
|  0%   38C    P8             22W /  450W |    1460MiB /  24564MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
NVCC version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Aug_14_10:26:51_Pacific_Daylight_Time_2024
Cuda compilation tools, release 12.6, V12.6.68
Build cuda_12.6.r12.6/compiler.34714021_0