[BUG]: mismatch creates an object on a device after #3591
Comments
I can reproduce the issue in a slightly different form (calling |
Using nvc++ 25.1 and your
shows successful compilation. Using nvc++ 24.5, I got:
Can you please give me the exact CCCL git commit that you used to encounter this error?
I'm using a development build of nvc++, which tracks the CCCL main branch. Release versions of nvc++ use CCCL release branches instead, so they don't include this commit. The exact CCCL commit causing the failure is:
Before this commit, everything worked fine with:
I confirm that commit 35df3a9 causes the failure. After cloning the CCCL main branch and checking out this commit, I can reproduce the failure with nvc++ 25.1. However, when switching to cebb54c, the failure no longer occurs.
Is this a duplicate?
Type of Bug
Compile-time Error
Component
Thrust
Describe the bug
For this example:
The NVC++ stdpar fails with:
This is a new failure, first observed after commit 35df3a9.
The mismatch algorithm should only inspect the existing elements of a container; it should not create new elements. In this test, Wrapper has a constructor and destructor that access a static storage object. After this commit, a Wrapper object gets created on the device, which causes NVC++ stdpar to fail because device code cannot access that static object.
The likely cause is the change to the header file mismatch.h in commit 35df3a9, which appears to create a cuda::std::tuple object on the device. In the smaller test case, one of the types inside that tuple is Wrapper<int32_t>.
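The expectation that mismatch only inspects existing elements can be illustrated on the host. The following is a minimal sketch, not the reporter's test case: the Wrapper definition, the `constructions` counter, `MismatchReport`, and `run_mismatch_check` are all hypothetical names introduced here, and the sketch uses the sequential `std::mismatch` rather than the stdpar or Thrust path. It counts how many Wrapper objects are constructed while the algorithm runs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical stand-in for the reporter's Wrapper<T>: constructors and the
// destructor touch static storage, which device code cannot access.
template <typename T>
struct Wrapper {
    static inline int my_count = 0;       // live objects (name taken from the error message)
    static inline int constructions = 0;  // total constructions (added for this sketch)

    T value;

    explicit Wrapper(T v = T()) : value(v) { ++my_count; ++constructions; }
    Wrapper(const Wrapper& other) : value(other.value) { ++my_count; ++constructions; }
    Wrapper& operator=(const Wrapper&) = default;
    ~Wrapper() { --my_count; }

    friend bool operator==(const Wrapper& a, const Wrapper& b) { return a.value == b.value; }
};

struct MismatchReport {
    std::ptrdiff_t index;         // position of the first differing element
    int constructed_during_call;  // Wrapper objects created while mismatch ran
};

// Runs the sequential host std::mismatch and records how many Wrapper
// objects were constructed during the call.
inline MismatchReport run_mismatch_check() {
    using W = Wrapper<std::int32_t>;
    std::vector<W> a, b;
    a.reserve(8);
    b.reserve(8);
    for (std::int32_t i = 0; i < 8; ++i) {
        a.emplace_back(i);
        b.emplace_back(i == 5 ? -1 : i);  // the sequences differ at index 5
    }
    assert(W::my_count == 16);  // only the 16 elements stored in the vectors are alive

    const int before = W::constructions;
    const auto result = std::mismatch(a.begin(), a.end(), b.begin());
    const int after = W::constructions;

    return {std::distance(a.begin(), result.first), after - before};
}
```

Calling `run_mismatch_check()` from a test or from `main` should report a construction count of zero on the host, because the algorithm compares elements by reference and returns iterators. The report above describes the device path departing from this: a Wrapper ends up being constructed inside a cuda::std::tuple.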
How to Reproduce
nvc++ -stdpar -Ofast --c++17 -c test.cpp
Expected behavior
"test.cpp", line 22: error: global or namespace scope variables such as "Wrapper::my_count [with T=int32_t]" (declared at line 26) cannot be accessed from device code
function "Wrapper::Wrapper [with T=int32_t]" is implicitly a device function because it is called from device function "cuda::std::__4::__tuple_leaf<_Ip, _Hp, cuda::std::__4::__tuple_leaf_specialization::__default>::__tuple_leaf [with _Ip=0UL, _Hp=Wrapper<int32_t>]" (declared implicitly)
--my_count;
^
Reproduction link
No response
Operating System
No response
nvidia-smi output
No response
NVCC version
No response