-
Notifications
You must be signed in to change notification settings - Fork 188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework builtin handling #2461
Rework builtin handling #2461
Conversation
a08dc63
to
0b20884
Compare
🟨 CI finished in 2h 31m: Pass: 93%/364 | Total: 6d 13h | Avg: 26m 00s | Max: 1h 27m | Hits: 13%/17361
|
Project | |
---|---|
CCCL Infrastructure | |
+/- | libcu++ |
CUB | |
Thrust | |
+/- | CUDA Experimental |
pycuda | |
CUDA C Core Library |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
+/- | libcu++ |
+/- | CUB |
+/- | Thrust |
+/- | CUDA Experimental |
+/- | pycuda |
+/- | CUDA C Core Library |
🏃 Runner counts (total jobs: 364)
# | Runner |
---|---|
297 | linux-amd64-cpu16 |
28 | linux-arm64-cpu16 |
24 | linux-amd64-gpu-v100-latest-1 |
15 | windows-amd64-cpu16 |
d3f48b7
to
de72a0f
Compare
🟩 CI finished in 2h 18m: Pass: 100%/364 | Total: 1d 23h | Avg: 7m 53s | Max: 1h 47m | Hits: 79%/25671
|
Project | |
---|---|
CCCL Infrastructure | |
+/- | libcu++ |
CUB | |
Thrust | |
+/- | CUDA Experimental |
pycuda | |
CUDA C Core Library |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
+/- | libcu++ |
+/- | CUB |
+/- | Thrust |
+/- | CUDA Experimental |
+/- | pycuda |
+/- | CUDA C Core Library |
🏃 Runner counts (total jobs: 364)
# | Runner |
---|---|
297 | linux-amd64-cpu16 |
28 | linux-arm64-cpu16 |
24 | linux-amd64-gpu-v100-latest-1 |
15 | windows-amd64-cpu16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vast improvement, thanks! just a few questions.
#if defined(_LIBCUDACXX_IS_BASE_OF) && !defined(_LIBCUDACXX_USE_IS_BASE_OF_FALLBACK) | ||
|
||
template <class _Set, class... _Ty> | ||
_CCCL_INLINE_VAR constexpr bool __mset_contains = (_LIBCUDACXX_IS_BASE_OF(__mtype<_Ty>, _Set) && ...); | ||
|
||
#else | ||
|
||
template <class _Set, class... _Ty> | ||
_CCCL_INLINE_VAR constexpr bool __mset_contains = (_CUDA_VSTD::is_base_of_v<__mtype<_Ty>, _Set> && ...); | ||
|
||
#endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as the comment above says, this is a particularly hot piece of meta-programming, so the direct use of the intrinsic here is intentional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no difference between what was there and what is now.
is_base_of_v is directly given by the compiler builtin without going through is_base::value, so both conditionals are exactly the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instantiating a variable template is not free. this instantiates N extra variable templates than using the intrinsic directly.
# define _CCCL_BUILTIN_OPERATOR_NEW(...) __builtin_operator_new(__VA_ARGS__) | ||
#endif // __check_builtin(__builtin_operator_new) && __check_builtin(__builtin_operator_delete) | ||
|
||
#if __has_builtin(__decay) && defined(_CCCL_CUDA_COMPILER_CLANG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why __has_builtin
here but __check_builtin
elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote that at the top.
The __check_builtin
facility is a clutch to work with older GCC / clang that do not support __has_builtin
However, that has some drawbacks, because the __has_keyword
or __has_feature
checks can lead to false positives.
So for all newer builtins that are only supported by new compilers anyhow, I am moving towards just using __has_builtin
@@ -62,6 +48,20 @@ struct __libcpp_is_member_pointer<_Tp _Up::*> | |||
}; | |||
}; | |||
|
|||
#if defined(_CCCL_BUILTIN_IS_MEMBER_FUNCTION_POINTER) && !defined(_LIBCUDACXX_USE_IS_MEMBER_FUNCTION_POINTER_FALLBACK) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should change the _LIBCUDACXX_USE_[...]_FALLBACK
macros to _CCCL_USE_[...]_FALLBACK
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was contemplating that as well, however this is explicitly for a type trait in libcu++ and not something that is generically available everywhere, so I decided to keep it as is.
Not a hill to die on though
# endif | ||
|
||
#else | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is __libcpp_is_member_pointer
defined unconditionally now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is just github formatting being bad.
We need the __libcpp_is_member_pointer
in other places, so I moved that to the top
# ifdef _LIBCUDACXX_IS_CONSTANT_EVALUATED // is_constant_evaluated only exists since GCC 9 | ||
# ifdef _CCCL_BUILTIN_IS_CONSTANT_EVALUATED // is_constant_evaluated only exists since GCC 9 | ||
if (__libcpp_is_constant_evaluated()) | ||
# endif // defined(_LIBCUDACXX_IS_CONSTANT_EVALUATED) | ||
# endif // defined(_CCCL_BUILTIN_IS_CONSTANT_EVALUATED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we using __libcpp_is_constant_evaluated()
instead of using _CCCL_BUILTIN_IS_CONSTANT_EVALUATED()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually in that case we could use it directly, most often __libcpp_is_constant_evaluated
is used without conditionals
@@ -40,7 +40,7 @@ _CCCL_NORETURN _LIBCUDACXX_ATTRIBUTE_FORMAT(__printf__, 1, 2) | |||
_LIBCUDACXX_HIDE_FROM_ABI void __libcpp_verbose_abort(const char*, ...) | |||
{ | |||
::abort(); | |||
__builtin_unreachable(); // never reached, but needed to tell the compiler that the function never returns | |||
_CCCL_UNREACHABLE(); // never reached, but needed to tell the compiler that the function never returns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this macro be called _CCCL_BUILTIN_UNREACHABLE()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unfortunately not a builtin but a pile of hacks that sometimes is a builtin
58caca7
to
c85690a
Compare
🟨 CI finished in 2h 30m: Pass: 97%/364 | Total: 6d 16h | Avg: 26m 22s | Max: 1h 22m | Hits: 10%/25671
|
Project | |
---|---|
CCCL Infrastructure | |
+/- | libcu++ |
CUB | |
Thrust | |
+/- | CUDA Experimental |
pycuda | |
CUDA C Core Library |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
+/- | libcu++ |
+/- | CUB |
+/- | Thrust |
+/- | CUDA Experimental |
+/- | pycuda |
+/- | CUDA C Core Library |
🏃 Runner counts (total jobs: 364)
# | Runner |
---|---|
297 | linux-amd64-cpu16 |
28 | linux-arm64-cpu16 |
24 | linux-amd64-gpu-v100-latest-1 |
15 | windows-amd64-cpu16 |
c85690a
to
d56b494
Compare
🟩 CI finished in 3h 07m: Pass: 100%/364 | Total: 7d 18h | Avg: 30m 46s | Max: 1h 21m | Hits: 10%/25691
|
Project | |
---|---|
CCCL Infrastructure | |
+/- | libcu++ |
CUB | |
Thrust | |
+/- | CUDA Experimental |
pycuda | |
CUDA C Core Library |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
+/- | libcu++ |
+/- | CUB |
+/- | Thrust |
+/- | CUDA Experimental |
+/- | pycuda |
+/- | CUDA C Core Library |
🏃 Runner counts (total jobs: 364)
# | Runner |
---|---|
297 | linux-amd64-cpu16 |
28 | linux-arm64-cpu16 |
24 | linux-amd64-gpu-v100-latest-1 |
15 | windows-amd64-cpu16 |
* Move builtin detection to its own file * Try to reenable more builtins * Address review comments
We are relying a lot on compiler builtins for libcu++, but there are a lot more and also it is all over the place.
This moves almost all of the builtin detection and fixes to a central CCCL header that provides it everywhere and also replaces most open coded uses with that.