Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]: The CUDA SDK defines the reserved identifier __noinline__, breaking Clang and GCC interoperation #1235

Open
1 task done
ldionne opened this issue Dec 20, 2023 · 8 comments
Labels
bug Something isn't working right.

Comments

@ldionne
Copy link

ldionne commented Dec 20, 2023

Is this a duplicate?

Type of Bug

Compile-time Error

Component

Not sure

Describe the bug

I maintain libc++, the C++ Standard Library shipped with LLVM / Clang. We recently received a bug report explaining that using Clang (and libc++ in particular) with the CUDA SDK didn't work anymore, because the CUDA SDK is defining __noinline__ to __attribute__((__noinline__)) for convenience and that conflicts with libc++'s usage of __attribute__((__noinline__)).

This is both non-standard and poor practice on the CUDA SDK's side -- underscore names are reserved for the programming language implementation. It seems like this was reported in the past as NVIDIA/thrust#1703 but I'm not certain the problem was taken seriously.

I would like to gauge whether there is interest for migrating away from that macro and restoring proper interoperability between the CUDA SDK and Clang, GCC and their standard library implementations. If you can establish a migration path away from the macro, libc++ can work around the issue in the meantime to avoid leaving users stranded. However, we would like to have a commitment from CUDA that a migration path will be created to fix the problem in the long term -- otherwise libc++ would just be bending backwards to work around arbitrary vendor's decisions forever, and that is not workable for us.

Note that while this problem is not widespread yet, it will start hitting anyone who updates to LLVM 18 because libc++ introduced new uses of __attribute__((__noinline__)). We expect this is going to become a fairly widespread problem if nothing is done.

Note: If this is not the right place to file a bug against the CUDA SDK, please let me know where I can do so. I am not a CUDA SDK user myself, but I am reaching out because I believe our two implementations working together well is important for the ecosystem.

How to Reproduce

#include <cuda/something>
#include <string>

Expected behavior

It compiles

Reproduction link

No response

Operating System

No response

nvidia-smi output

No response

NVCC version

No response

@ldionne ldionne added the bug Something isn't working right. label Dec 20, 2023
@github-project-automation github-project-automation bot moved this to Todo in CCCL Dec 20, 2023
@ldionne
Copy link
Author

ldionne commented Dec 20, 2023

CC @jrhemstad and @gevtushenko Since I think you chimed in on the Thrust bug referenced here.

@miscco
Copy link
Contributor

miscco commented Dec 20, 2023

HI @ldionne, thanks for reaching out.

I fear that this is the wrong place for the issue, as we have unfortunately nothing to do with the cuda runtime header.

I have reached out internally to find out who to contact regarding this issue. We really want to keep building with libc++

@ldionne
Copy link
Author

ldionne commented Dec 20, 2023

Ah, that's kind of what I expected. Thanks, I'll wait to hear back!

@miscco
Copy link
Contributor

miscco commented Dec 21, 2023

Hey @ldionne,

unfortunately the feedback that I got is that those identifiers will stay. From the viewpoint of the CUDA SDK it is the implementation of the CUDA programming language, so those identifiers are good to use.

As I understand it there are plans to address this on either within clang or the on the libc++ side.

@philnik777
Copy link

@miscco (I realize that arguing through a third party isn't great, but I don't think I have much of a choice. Sorry.) Even if the CUDA SDK is part of the implementation, why is it so hard to make things easier for other parts of the implementation? Since this is internal to the CUDA SDK, it should be trivial to find-and-replace __noinline__ with something like __cuda_noinline__. libc++ prefixes almost all internal macros with _LIBCPP to avoid clashing with other parts of the implementation, and I'm pretty sure we'd rename stuff if they interfere with other things. We've done that a lot with the type traits.

@ldionne
Copy link
Author

ldionne commented Dec 21, 2023

As I understand it there are plans to address this on either within clang or the on the libc++ side.

There would be plans to work around this issue until the CUDA SDK stops defining the identifier, yes. But since there seems to be no desire from the CUDA SDK to be compatible with other implementations (Clang and GCC, both of which allow the use of __attribute__((__noinline__))), that seriously raises the question of how hard we should try to bend backwards on our side.

Basically, we're willing to work around the issue temporarily while CUDA fixes the underlying problem to make our users lives better. But if CUDA itself doesn't care about those users, I don't think it's viable for us to start working around every wrench that might be thrown at us unilaterally. Frankly, that stance from the CUDA SDK seems hostile to other implementations and to the ecosystem as a whole.

Is there nobody from the CUDA SDK we can have a direct conversation with? It seems a bit awkward to have this here, through you.

@jrhemstad
Copy link
Collaborator

Hey @ldionne. CCCL doesn't own the header in question, but we've been aware of the problem for a while.

I've been working on making noise about this internally for a while. Lots of people out on holiday for the next week or two, but I'll report back as soon as I can.

@jrhemstad
Copy link
Collaborator

Uneventful update: I am still working on escalating this internally.

ldionne pushed a commit to llvm/llvm-project that referenced this issue Jan 22, 2024
…__ macro (#73838)

The CUDA SDK contains an unfortunate definition for the `__noinline__`
macro. This patch works around it by using `__attribute__((noinline))`
instead of `__attribute__((__noinline__))` on CUDA. We are still waiting
for a long-term resolution to this issue in NVIDIA/cccl#1235.
blueboxd pushed a commit to blueboxd/libcxx that referenced this issue Feb 7, 2024
…__ macro (#73838)

The CUDA SDK contains an unfortunate definition for the `__noinline__`
macro. This patch works around it by using `__attribute__((noinline))`
instead of `__attribute__((__noinline__))` on CUDA. We are still waiting
for a long-term resolution to this issue in NVIDIA/cccl#1235.

NOKEYCHECK=True
GitOrigin-RevId: 7378fb30645ad5398491acea3960a8115d1b171c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working right.
Projects
Status: Todo
Development

No branches or pull requests

4 participants