[BUG]: CuPy cannot be built with CCCL v2.3.1 #1493

Closed
1 task done
leofang opened this issue Mar 6, 2024 · 9 comments
Labels
bug Something isn't working right.

Comments

@leofang
Member

leofang commented Mar 6, 2024

Is this a duplicate?

Type of Bug

Compile-time Error

Component

Thrust

Describe the bug

Build time failure:

    Error limit reached.
    100 errors detected in the compilation of "cupy/cuda/cupy_thrust.cu".
    Compilation terminated.
    /home/leof/dev/cupy_cuda122/cupy/_core/include/cupy/_cccl/thrust/thrust/detail/type_traits.h(553): error: "thrust::detail" is ambiguous
          : thrust::detail::eval_if<

The regression was introduced between v2.2.0 and v2.3.1. According to git bisect, the offending commit is e21f700:

$ git bisect bad
e21f700c44d6d4c8af8f7f6b38d5b2f650f25764 is the first bad commit
commit e21f700c44d6d4c8af8f7f6b38d5b2f650f25764
Author: Georgy Evtushenko <[email protected]>
Date:   Mon Sep 11 22:10:56 2023 +0000

    Use inline arch namespace in Thrust

 thrust/thrust/detail/config/namespace.h            | 87 +++++++++++++++++++++-
 .../system/cuda/detail/core/agent_launcher.h       |  6 +-
 2 files changed, 88 insertions(+), 5 deletions(-)

cc: @gevtushenko for vis

How to Reproduce

Build CuPy from source, with the CCCL submodule changed to v2.3.1

  1. git clone --recursive https://github.com/cupy/cupy.git
  2. cd cupy; git submodule update --init --recursive
  3. cd third_party/cccl
  4. git fetch --tags origin
  5. git checkout v2.3.1
  6. cd ../..
  7. pip install -v -e .

Expected behavior

Build success

Reproduction link

No response

Operating System

No response

nvidia-smi output

No response

NVCC version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0
@leofang leofang added the bug Something isn't working right. label Mar 6, 2024
@github-project-automation github-project-automation bot moved this to Todo in CCCL Mar 6, 2024
@leofang
Member Author

leofang commented Mar 6, 2024

From the build log, I think this was the full command invoked (obtained at the parent commit of the offending one):

    Command: ['/usr/local/cuda-12.2.2/bin/nvcc', '-D_FORCE_INLINES=1', '-DCUPY_CACHE_KEY=ab177d42e646c2aa932d5d5fac58c69106a9cac1', '-DCUPY_CUB_VERSION_CODE=200200', '-DCUPY_JITIFY_VERSION_CODE=1774a3ba5', '-I/home/leof/dev/cupy_cuda122/cupy/_core/include/cupy/_cccl/libcudacxx', '-I/home/leof/dev/cupy_cuda122/cupy/_core/include/cupy/_cccl/thrust', '-I/home/leof/dev/cupy_cuda122/cupy/_core/include/cupy/_cccl/cub', '-I/home/leof/dev/cupy_cuda122/cupy/_core/include', '-I/usr/local/cuda-12.2.2/include', '-c', 'cupy/cuda/cupy_thrust.cu', '-o', 'build/temp.device_objects/cupy/cuda/cupy_thrust.cu.o', '--generate-code=arch=compute_86,code=sm_86', '--generate-code=arch=compute_89,code=sm_89', '-Xfatbin=-compress-all', '-O2', '--compiler-options="-fPIC"', '--std=c++14', '-t2', '-Xcompiler=-fno-gnu-unique']

@miscco
Collaborator

miscco commented Mar 6, 2024

It looks like an additional thrust::detail namespace is defined somewhere. Could that be in your project?

@leofang
Member Author

leofang commented Mar 6, 2024

I did suspect it could be the case, but I don't think I've seen it done, nor did I find anything via grep. Not sure if I missed something.

Also, the offending commit looks innocent to me, and it's unclear to me why it would trigger this error... I've been scratching my head over it.

@miscco
Collaborator

miscco commented Mar 6, 2024

Oh the commit is definitely causing the issue.

The problem is that nvcc has certain issues with nested namespaces.

For nvcc there is a difference between "inline namespace meow { namespace thrust { namespace detail" and "namespace thrust { namespace detail".

The commit introduced a versioning namespace that ensures that we do not accidentally mix kernels from different cccl versions.
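
To make the point about the versioning namespace concrete, here is a minimal sketch in plain C++. The names demo and v2_3 are hypothetical stand-ins, not Thrust's actual namespaces, and the nesting order is only an assumption chosen for illustration. It shows that an inline namespace keeps the short spelling working, and the trailing comment describes how a second, non-versioned declaration could lead to the ambiguity reported above.

```cpp
#include <type_traits>

namespace demo {             // hypothetical stand-in for the library namespace
inline namespace v2_3 {      // hypothetical stand-in for the versioning namespace
namespace detail {
struct tag {};
}  // namespace detail
}  // inline namespace v2_3
}  // namespace demo

// Because v2_3 is inline, the short spelling and the fully versioned
// spelling name the same type.
static_assert(std::is_same<demo::detail::tag, demo::v2_3::detail::tag>::value,
              "both spellings refer to the same entity");

// Assumption for illustration: if an older header also declared
//   namespace demo { namespace detail { struct old_tag {}; } }
// and the front end treated that as a separate namespace, the name
// demo::detail would have two candidates and could be rejected as ambiguous.
```

The conflicting declaration is left as a comment on purpose, so the sketch compiles as written.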

@miscco
Collaborator

miscco commented Mar 6, 2024

Now that I think about it, my guess is that you have conflicting versions of thrust in your build. One will have the inline namespace, the other will not.

That would create exactly this kind of issue.

@miscco
Collaborator

miscco commented Mar 6, 2024

So I tried building according to your reproducer, and it does indeed fail.

I then tried to replace the offending thrust::detail qualifier with THRUST_NS_QUALIFIER::detail, which should include the inline namespace identifier.

However, when actually building cupy, it expanded to plain ::thrust::detail.

That suggests that there are different versions of thrust at play.
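
As a rough illustration of why the expansion of such a qualifier macro is a useful diagnostic, the sketch below defines a qualifier that includes the inline namespace only when a version macro is set. All SKETCH_* macros and the sketch and v2 names are hypothetical; this is not Thrust's actual configuration header, only a simplified model of the idea.

```cpp
#include <type_traits>

// Hypothetical version macro; an older configuration header would not define it.
#define SKETCH_VERSION_NS v2

#ifdef SKETCH_VERSION_NS
#  define SKETCH_NS_BEGIN     namespace sketch { inline namespace SKETCH_VERSION_NS {
#  define SKETCH_NS_END       } }
#  define SKETCH_NS_QUALIFIER ::sketch::SKETCH_VERSION_NS
#else
#  define SKETCH_NS_BEGIN     namespace sketch {
#  define SKETCH_NS_END       }
#  define SKETCH_NS_QUALIFIER ::sketch
#endif

SKETCH_NS_BEGIN
namespace detail {
struct tag {};
}  // namespace detail
SKETCH_NS_END

// With the version macro defined, the qualifier names the versioned namespace
// explicitly, so it does not depend on lookup through the inline namespace.
using qualified_tag = SKETCH_NS_QUALIFIER::detail::tag;
```

In this model, a qualifier that expands to the plain outer namespace means the #else branch was taken, that is, a configuration header without the version macro was picked up first.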

@leofang
Member Author

leofang commented Mar 6, 2024

I might have a clue about this. CuPy vendored very old thrust::complex headers which do come with a detail namespace. Getting rid of it is a long-term goal; in the meantime, is there any suggestion for a quick workaround?

@miscco
Collaborator

miscco commented Mar 6, 2024

@leofang I have opened cupy/cupy#8221

I believe that should address your issue.

@leofang
Member Author

leofang commented Mar 6, 2024

> CuPy vendored very old thrust::complex headers which do come with a detail namespace. Getting rid of it is a long-term goal

Tracked here: cupy/cupy#8222

@leofang leofang closed this as completed Dec 4, 2024
@github-project-automation github-project-automation bot moved this from Todo to Done in CCCL Dec 4, 2024