Do not include arm_neon.h when running under nvcc #1028

Merged 1 commit on Nov 26, 2024

frankier (Contributor)

I'm having trouble building llama.cpp with ARM and CUDA. The build ends up compiling a file with nvcc that includes the arm_neon.h header, and that header uses a number of NEON intrinsics that nvcc doesn't know about.

See here: https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=1090459&view=logs&jobId=d93976e3-6ed1-588b-ebf8-e8a19c2becc4&j=d93976e3-6ed1-588b-ebf8-e8a19c2becc4&t=9243320e-a173-5427-7ff1-0cd57d39b91c

Following abseil/abseil-cpp#1665 I've changed the code to conditionally include the header.

src/ggml-impl.h Outdated
@@ -14,7 +14,7 @@
 #include <arm_sve.h>
 #endif // __ARM_FEATURE_SVE
 
-#if defined(__ARM_NEON)
+#if defined(__ARM_NEON) && !(defined(__NVCC__) && defined(__CUDACC__))

Collaborator

Is there any case where __CUDACC__ is defined and __NVCC__ is not?

frankier (Contributor, Author), Nov 26, 2024

This is following the recommendation of abseil/abseil-cpp#1665 (comment)

I guess per https://llvm.org/docs/CompileCudaWithLLVM.html#detecting-clang-vs-nvcc-from-code we will have defined(__CUDACC__) && !defined(__NVCC__) when compiling with Clang. I'm not really sure whether we want to avoid including the header or not in this case. Do we ever want the header when compiling CUDA code? Probably not? Shall we just go with !defined(__CUDACC__) then?

frankier (Contributor, Author)

Okay, I've changed it so it won't include the header when compiling CUDA code under either GCC or Clang.
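A guard of this kind might look like the sketch below, assuming the condition is `defined(__ARM_NEON) && !defined(__CUDACC__)` as proposed earlier in the thread. The macro `EXAMPLE_HAVE_NEON` and the function `example_sum` are hypothetical, introduced only to show that code depending on the header needs a fallback path when the header is skipped.

```cpp
#include <cstddef>

// Include NEON intrinsics only for ARM builds that are not CUDA compilations.
#if defined(__ARM_NEON) && !defined(__CUDACC__)
#include <arm_neon.h>
#define EXAMPLE_HAVE_NEON 1  // hypothetical macro, for this sketch only
#else
#define EXAMPLE_HAVE_NEON 0
#endif

// Hypothetical helper: sums n floats, using NEON only when the header was included.
inline float example_sum(const float * x, std::size_t n) {
    float total = 0.0f;
    std::size_t i = 0;
#if EXAMPLE_HAVE_NEON
    // Accumulate four lanes at a time.
    float32x4_t acc = vdupq_n_f32(0.0f);
    for (; i + 4 <= n; i += 4) {
        acc = vaddq_f32(acc, vld1q_f32(x + i));
    }
    // Horizontal add of the four lanes.
    total = vgetq_lane_f32(acc, 0) + vgetq_lane_f32(acc, 1) +
            vgetq_lane_f32(acc, 2) + vgetq_lane_f32(acc, 3);
#endif
    // Scalar loop: handles the tail, or the whole array when NEON is unavailable.
    for (; i < n; ++i) {
        total += x[i];
    }
    return total;
}
```

Under nvcc or Clang in CUDA mode, the intrinsics never appear in the translation unit, so the scalar loop handles the whole array.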

Collaborator

I am not sure if it is necessary for Clang, but in any case this code is not actually used in the CUDA backend, so disabling it entirely for all CUDA compilation is OK. In the future I will probably move this to the CPU backend.
