Move remaining CUB policy hubs to tuning headers #3141

Merged: 5 commits merged into NVIDIA:main from tune_headers on Dec 12, 2024

Conversation

bernhardmgruber
Contributor

Fixes: #3097

miscco (Collaborator) left a comment

Minor question about SelectedPolicy vs PolicyHub

bernhardmgruber enabled auto-merge (squash) on December 12, 2024, 18:45

🟩 CI finished in 2h 52m: Pass: 100%/94 | Total: 2d 15h | Avg: 40m 37s | Max: 1h 30m | Hits: 69%/12384
  • 🟩 thrust: Pass: 100%/46 | Total: 1d 01h | Avg: 33m 07s | Max: 1h 05m | Hits: 70%/9260

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 44m 57s | Avg: 22m 28s | Max: 29m 58s
    🟩 cpu
      🟩 amd64              Pass: 100%/44  | Total:  1d 00h | Avg: 33m 11s | Max:  1h 05m | Hits:  70%/9260  
      🟩 arm64              Pass: 100%/2   | Total:  1h 03m | Avg: 31m 39s | Max: 33m 48s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  3h 34m | Avg: 30m 42s | Max: 55m 25s | Hits:  63%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  1h 50m | Avg: 55m 02s | Max: 57m 39s
      🟩 12.6               Pass: 100%/37  | Total: 19h 58m | Avg: 32m 23s | Max:  1h 05m | Hits:  72%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 59m 45s | Avg: 29m 52s | Max: 29m 53s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  3h 34m | Avg: 30m 42s | Max: 55m 25s | Hits:  63%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 50m | Avg: 55m 02s | Max: 57m 39s
      🟩 nvcc12.6           Pass: 100%/35  | Total: 18h 59m | Avg: 32m 32s | Max:  1h 05m | Hits:  72%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 59m 45s | Avg: 29m 52s | Max: 29m 53s
      🟩 nvcc               Pass: 100%/44  | Total:  1d 00h | Avg: 33m 16s | Max:  1h 05m | Hits:  70%/9260  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 56m | Avg: 29m 03s | Max: 34m 49s
      🟩 Clang10            Pass: 100%/1   | Total: 35m 01s | Avg: 35m 01s | Max: 35m 01s
      🟩 Clang11            Pass: 100%/1   | Total: 32m 44s | Avg: 32m 44s | Max: 32m 44s
      🟩 Clang12            Pass: 100%/1   | Total: 31m 29s | Avg: 31m 29s | Max: 31m 29s
      🟩 Clang13            Pass: 100%/1   | Total: 30m 16s | Avg: 30m 16s | Max: 30m 16s
      🟩 Clang14            Pass: 100%/1   | Total: 32m 28s | Avg: 32m 28s | Max: 32m 28s
      🟩 Clang15            Pass: 100%/1   | Total: 35m 22s | Avg: 35m 22s | Max: 35m 22s
      🟩 Clang16            Pass: 100%/1   | Total: 31m 16s | Avg: 31m 16s | Max: 31m 16s
      🟩 Clang17            Pass: 100%/1   | Total: 36m 12s | Avg: 36m 12s | Max: 36m 12s
      🟩 Clang18            Pass: 100%/7   | Total:  3h 04m | Avg: 26m 20s | Max: 36m 07s
      🟩 GCC6               Pass: 100%/2   | Total: 51m 06s | Avg: 25m 33s | Max: 27m 04s
      🟩 GCC7               Pass: 100%/2   | Total: 57m 29s | Avg: 28m 44s | Max: 32m 07s
      🟩 GCC8               Pass: 100%/1   | Total: 35m 39s | Avg: 35m 39s | Max: 35m 39s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 29m | Avg: 29m 59s | Max: 33m 56s
      🟩 GCC10              Pass: 100%/1   | Total: 36m 04s | Avg: 36m 04s | Max: 36m 04s
      🟩 GCC11              Pass: 100%/1   | Total: 34m 21s | Avg: 34m 21s | Max: 34m 21s
      🟩 GCC12              Pass: 100%/1   | Total: 35m 43s | Avg: 35m 43s | Max: 35m 43s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 17m | Avg: 24m 38s | Max: 40m 28s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 44m 26s | Avg: 44m 26s | Max: 44m 26s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 55m 25s | Avg: 55m 25s | Max: 55m 25s | Hits:  63%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m | Hits:  63%/1852  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 28m | Avg: 49m 36s | Max:  1h 05m | Hits:  75%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 50m | Avg: 55m 02s | Max: 57m 39s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  9h 25m | Avg: 29m 45s | Max: 36m 12s
      🟩 GCC                Pass: 100%/19  | Total:  8h 57m | Avg: 28m 17s | Max: 40m 28s
      🟩 Intel              Pass: 100%/1   | Total: 44m 26s | Avg: 44m 26s | Max: 44m 26s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 26m | Avg: 53m 18s | Max:  1h 05m | Hits:  70%/9260  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 50m | Avg: 55m 02s | Max: 57m 39s
    🟩 gpu
      🟩 v100               Pass: 100%/46  | Total:  1d 01h | Avg: 33m 07s | Max:  1h 05m | Hits:  70%/9260  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total: 23h 56m | Avg: 35m 54s | Max:  1h 05m | Hits:  63%/7408  
      🟩 TestCPU            Pass: 100%/3   | Total: 36m 52s | Avg: 12m 17s | Max: 21m 35s | Hits:  99%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total: 50m 19s | Avg: 16m 46s | Max: 17m 52s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 19m 52s | Avg: 19m 52s | Max: 19m 52s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  2h 05m | Avg: 25m 10s | Max: 28m 58s
      🟩 14                 Pass: 100%/4   | Total:  2h 29m | Avg: 37m 21s | Max: 55m 25s | Hits:  63%/1852  
      🟩 17                 Pass: 100%/12  | Total:  8h 02m | Avg: 40m 11s | Max:  1h 02m | Hits:  63%/3704  
      🟩 20                 Pass: 100%/23  | Total: 12h 01m | Avg: 31m 21s | Max:  1h 05m | Hits:  81%/3704  
    
  • 🟩 cub: Pass: 100%/45 | Total: 1d 13h | Avg: 50m 05s | Max: 1h 30m | Hits: 65%/3124

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 11h | Avg: 49m 49s | Max:  1h 30m | Hits:  65%/3124  
      🟩 arm64              Pass: 100%/2   | Total:  1h 51m | Avg: 55m 46s | Max: 56m 26s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  5h 30m | Avg: 47m 14s | Max: 55m 17s | Hits:  65%/781   
      🟩 12.5               Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 06m
      🟩 12.6               Pass: 100%/36  | Total:  1d 05h | Avg: 49m 53s | Max:  1h 30m | Hits:  65%/2343  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 54m | Avg: 57m 07s | Max: 57m 51s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  5h 30m | Avg: 47m 14s | Max: 55m 17s | Hits:  65%/781   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 06m
      🟩 nvcc12.6           Pass: 100%/34  | Total:  1d 04h | Avg: 49m 27s | Max:  1h 30m | Hits:  65%/2343  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 54m | Avg: 57m 07s | Max: 57m 51s
      🟩 nvcc               Pass: 100%/43  | Total:  1d 11h | Avg: 49m 46s | Max:  1h 30m | Hits:  65%/3124  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  3h 10m | Avg: 47m 30s | Max: 50m 56s
      🟩 Clang10            Pass: 100%/1   | Total: 53m 31s | Avg: 53m 31s | Max: 53m 31s
      🟩 Clang11            Pass: 100%/1   | Total: 52m 51s | Avg: 52m 51s | Max: 52m 51s
      🟩 Clang12            Pass: 100%/1   | Total: 50m 08s | Avg: 50m 08s | Max: 50m 08s
      🟩 Clang13            Pass: 100%/1   | Total: 52m 29s | Avg: 52m 29s | Max: 52m 29s
      🟩 Clang14            Pass: 100%/1   | Total: 52m 48s | Avg: 52m 48s | Max: 52m 48s
      🟩 Clang15            Pass: 100%/1   | Total: 50m 01s | Avg: 50m 01s | Max: 50m 01s
      🟩 Clang16            Pass: 100%/1   | Total: 55m 14s | Avg: 55m 14s | Max: 55m 14s
      🟩 Clang17            Pass: 100%/1   | Total: 54m 39s | Avg: 54m 39s | Max: 54m 39s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 21m | Avg: 45m 52s | Max: 57m 51s
      🟩 GCC6               Pass: 100%/2   | Total:  1h 32m | Avg: 46m 02s | Max: 46m 06s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 40m | Avg: 50m 00s | Max: 50m 31s
      🟩 GCC8               Pass: 100%/1   | Total: 51m 09s | Avg: 51m 09s | Max: 51m 09s
      🟩 GCC9               Pass: 100%/3   | Total:  2h 32m | Avg: 50m 56s | Max: 58m 19s
      🟩 GCC10              Pass: 100%/1   | Total: 54m 47s | Avg: 54m 47s | Max: 54m 47s
      🟩 GCC11              Pass: 100%/1   | Total: 54m 14s | Avg: 54m 14s | Max: 54m 14s
      🟩 GCC12              Pass: 100%/1   | Total: 52m 15s | Avg: 52m 15s | Max: 52m 15s
      🟩 GCC13              Pass: 100%/8   | Total:  5h 40m | Avg: 42m 37s | Max:  1h 30m
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 58m 02s | Avg: 58m 02s | Max: 58m 02s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 55m 17s | Avg: 55m 17s | Max: 55m 17s | Hits:  65%/781   
      🟩 MSVC14.29          Pass: 100%/1   | Total: 58m 06s | Avg: 58m 06s | Max: 58m 06s | Hits:  65%/781   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 05m | Hits:  65%/1562  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 06m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 15h 32m | Avg: 49m 05s | Max: 57m 51s
      🟩 GCC                Pass: 100%/19  | Total: 14h 58m | Avg: 47m 16s | Max:  1h 30m
      🟩 Intel              Pass: 100%/1   | Total: 58m 02s | Avg: 58m 02s | Max: 58m 02s
      🟩 MSVC               Pass: 100%/4   | Total:  3h 57m | Avg: 59m 21s | Max:  1h 05m | Hits:  65%/3124  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 06m
    🟩 gpu
      🟩 v100               Pass: 100%/45  | Total:  1d 13h | Avg: 50m 05s | Max:  1h 30m | Hits:  65%/3124  
    🟩 jobs
      🟩 Build              Pass: 100%/39  | Total:  1d 10h | Avg: 52m 43s | Max:  1h 06m | Hits:  65%/3124  
      🟩 DeviceLaunch       Pass: 100%/1   | Total:  1h 30m | Avg:  1h 30m | Max:  1h 30m
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 57s | Avg: 16m 57s | Max: 16m 57s
      🟩 HostLaunch         Pass: 100%/2   | Total: 41m 17s | Avg: 20m 38s | Max: 21m 57s
      🟩 TestGPU            Pass: 100%/2   | Total: 48m 45s | Avg: 24m 22s | Max: 26m 10s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 24m 04s | Avg: 24m 04s | Max: 24m 04s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  3h 55m | Avg: 47m 07s | Max: 50m 14s
      🟩 14                 Pass: 100%/4   | Total:  3h 22m | Avg: 50m 40s | Max: 55m 17s | Hits:  65%/781   
      🟩 17                 Pass: 100%/12  | Total: 11h 03m | Avg: 55m 16s | Max:  1h 06m | Hits:  65%/1562  
      🟩 20                 Pass: 100%/24  | Total: 19h 12m | Avg: 48m 01s | Max:  1h 30m | Hits:  65%/781   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 05s | Avg: 4m 32s | Max: 7m 09s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 05s | Avg:  4m 32s | Max:  7m 09s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 05s | Avg:  4m 32s | Max:  7m 09s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 05s | Avg:  4m 32s | Max:  7m 09s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 05s | Avg:  4m 32s | Max:  7m 09s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 05s | Avg:  4m 32s | Max:  7m 09s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 05s | Avg:  4m 32s | Max:  7m 09s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 05s | Avg:  4m 32s | Max:  7m 09s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  1m 56s | Avg:  1m 56s | Max:  1m 56s
      🟩 Test               Pass: 100%/1   | Total:  7m 09s | Avg:  7m 09s | Max:  7m 09s
    
  • 🟩 python: Pass: 100%/1 | Total: 31m 54s | Avg: 31m 54s | Max: 31m 54s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 31m 54s | Avg: 31m 54s | Max: 31m 54s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 31m 54s | Avg: 31m 54s | Max: 31m 54s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 31m 54s | Avg: 31m 54s | Max: 31m 54s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 31m 54s | Avg: 31m 54s | Max: 31m 54s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 31m 54s | Avg: 31m 54s | Max: 31m 54s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 31m 54s | Avg: 31m 54s | Max: 31m 54s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 31m 54s | Avg: 31m 54s | Max: 31m 54s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 31m 54s | Avg: 31m 54s | Max: 31m 54s
    

👃 Inspect Changes

Modifications in project?

      Project
      CCCL Infrastructure
      libcu++
  +/- CUB
      Thrust
      CUDA Experimental
      python
      CCCL C Parallel Library
      Catch2Helper

Modifications in project or dependencies?

      Project
      CCCL Infrastructure
      libcu++
  +/- CUB
  +/- Thrust
      CUDA Experimental
  +/- python
  +/- CCCL C Parallel Library
  +/- Catch2Helper

🏃‍ Runner counts (total jobs: 94)

# Runner
70 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16

bernhardmgruber merged commit 5141553 into NVIDIA:main on Dec 12, 2024
112 checks passed
bernhardmgruber deleted the tune_headers branch on December 12, 2024, 20:08

Successfully merging this pull request may close these issues.

Move CUB tunings to dedicated headers
2 participants