Do not include arm_neon.h when running under nvcc #1028

frankier · 2024-11-26T09:22:18Z

I'm having trouble building llama.cpp for ARM with CUDA. The build ends up compiling a file with nvcc that includes the arm_neon.h header, and that header uses a number of NEON intrinsics that nvcc doesn't know about.

See here: https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=1090459&view=logs&jobId=d93976e3-6ed1-588b-ebf8-e8a19c2becc4&j=d93976e3-6ed1-588b-ebf8-e8a19c2becc4&t=9243320e-a173-5427-7ff1-0cd57d39b91c

Following abseil/abseil-cpp#1665 I've changed the code to conditionally include the header.

slaren · 2024-11-26T12:21:25Z

src/ggml-impl.h

@@ -14,7 +14,7 @@
 #include <arm_sve.h>
 #endif // __ARM_FEATURE_SVE

-#if defined(__ARM_NEON)
+#if defined(__ARM_NEON) && !(defined(__NVCC__) && defined(__CUDACC__))


Is there any case where __CUDACC__ is defined and __NVCC__ is not?

This is following the recommendation of abseil/abseil-cpp#1665 (comment)

I guess per https://llvm.org/docs/CompileCudaWithLLVM.html#detecting-clang-vs-nvcc-from-code we will have defined(__CUDACC__) && !defined(__NVCC__) when compiling with Clang. I'm not really sure whether we want to avoid including the header or not in this case. Do we ever want the header when compiling CUDA code? Probably not? Shall we just go with !defined(__CUDACC__) then?

Okay, I've changed it so it won't include the header when compiling CUDA code under either nvcc or Clang.

I am not sure if it is necessary for Clang, but in any case this code is not actually used in the CUDA backend, so disabling it entirely for all CUDA compilation is OK. In the future I will probably move this to the CPU backend.
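For reference, the guard discussed in this thread can be sketched as below. This is a minimal illustration and not the merged code: `HAVE_NEON_INTRINSICS`, `neon_intrinsics_enabled()` and `cuda_compiler_kind()` are hypothetical names introduced here. It assumes, per the LLVM documentation linked above, that `__CUDACC__` is defined by both nvcc and Clang when compiling CUDA sources, while `__NVCC__` is defined only by nvcc.

```c
// Sketch only: HAVE_NEON_INTRINSICS, neon_intrinsics_enabled() and
// cuda_compiler_kind() are hypothetical names, not part of ggml.

// Include arm_neon.h only when the target has NEON and the translation
// unit is not being compiled as CUDA code (by nvcc or by Clang).
#if defined(__ARM_NEON) && !defined(__CUDACC__)
#include <arm_neon.h>
#define HAVE_NEON_INTRINSICS 1
#else
#define HAVE_NEON_INTRINSICS 0
#endif

// Reports whether the NEON header was pulled in for this translation unit.
static int neon_intrinsics_enabled(void) {
    return HAVE_NEON_INTRINSICS;
}

// Classifies the compilation mode using the macros discussed in the review.
static const char *cuda_compiler_kind(void) {
#if defined(__CUDACC__) && defined(__NVCC__)
    return "nvcc";        // CUDA source compiled by nvcc
#elif defined(__CUDACC__)
    return "clang-cuda";  // CUDA source compiled by Clang
#else
    return "not-cuda";    // ordinary C/C++ compilation
#endif
}
```

Checking `__CUDACC__` alone covers both CUDA compilers, which is why the stricter `defined(__NVCC__) && defined(__CUDACC__)` test from the first revision of the diff is not needed if the header is unwanted in all CUDA code.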

frankier mentioned this pull request Nov 26, 2024

Update 4153 conda-forge/llama.cpp-feedstock#32

Open

5 tasks

slaren reviewed Nov 26, 2024

Do not include arm_neon.h when compiling CUDA code

808733f

frankier force-pushed the no-neon-header-nvcc branch from 88b306f to 808733f Compare November 26, 2024 13:39

slaren merged commit 0f3203c into ggerganov:master Nov 26, 2024

slaren mentioned this pull request Nov 26, 2024

Compile bug: ARM neon instructions when compiling cuda ggerganov/llama.cpp#10531

Closed
