Internalize cub::KernelConfig #3683

Merged
merged 3 commits into from
Feb 5, 2025

Conversation


@fbusato fbusato commented Feb 4, 2025

Fixes #3682

Description

Move `cub::KernelConfig` into the `detail` namespace and deprecate external usage.

Can be backported to 2.8
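
For readers unfamiliar with the pattern, here is a minimal sketch of one common way to internalize a type while keeping existing code compiling: move the definition into a `detail` namespace and leave a deprecated alias under the old name. The `mylib` namespace, the struct members, and `internal_tile_size` are hypothetical and chosen only for illustration; this is not the actual diff of this PR, and the real change may use different macros or mechanisms.

```cpp
// Hypothetical library namespace; it stands in for a real one for illustration only.
namespace mylib {
namespace detail {

// Internal home of the type after the move. The members are made up for this sketch.
struct KernelConfig {
  int block_threads;
  int items_per_thread;

  constexpr int tile_size() const { return block_threads * items_per_thread; }
};

} // namespace detail

// The old public name is kept as a deprecated alias, so external code that still
// uses it compiles but gets a warning at each use.
using KernelConfig [[deprecated("KernelConfig is internal; do not use it directly")]] =
    detail::KernelConfig;

// Library-internal code refers to the detail name and triggers no warning.
constexpr int internal_tile_size(int block_threads, int items_per_thread) {
  return detail::KernelConfig{block_threads, items_per_thread}.tile_size();
}

} // namespace mylib
```

With this layout, internal call sites switch to `mylib::detail::KernelConfig`, while any remaining external use of `mylib::KernelConfig` produces a deprecation warning until the alias is removed in a later release.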

@fbusato fbusato added 3.0 Targeted for 3.0 release backport branch/2.8.x labels Feb 4, 2025
@fbusato fbusato self-assigned this Feb 4, 2025
@fbusato fbusato requested a review from a team as a code owner February 4, 2025 22:16
@fbusato fbusato enabled auto-merge (squash) February 4, 2025 23:09

github-actions bot commented Feb 5, 2025

🟩 CI finished in 1h 53m: Pass: 100%/90 | Total: 2d 15h | Avg: 42m 39s | Max: 1h 17m | Hits: 290%/12730
  • 🟩 cub: Pass: 100%/44 | Total: 1d 15h | Avg: 54m 10s | Max: 1h 17m | Hits: 355%/3500

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 13h | Avg: 54m 01s | Max:  1h 17m | Hits: 355%/3500  
      🟩 arm64              Pass: 100%/2   | Total:  1h 55m | Avg: 57m 31s | Max: 58m 42s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  5h 12m | Avg:  1h 02m | Max:  1h 13m | Hits: 356%/875   
      🟩 12.5               Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 17m
      🟩 12.8               Pass: 100%/37  | Total:  1d 08h | Avg: 52m 04s | Max:  1h 12m | Hits: 355%/2625  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  5h 12m | Avg:  1h 02m | Max:  1h 13m | Hits: 356%/875   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 17m
      🟩 nvcc12.8           Pass: 100%/35  | Total:  1d 06h | Avg: 51m 35s | Max:  1h 12m | Hits: 355%/2625  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m
      🟩 nvcc               Pass: 100%/42  | Total:  1d 13h | Avg: 53m 53s | Max:  1h 17m | Hits: 355%/3500  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 53m | Avg: 58m 24s | Max:  1h 04m
      🟩 Clang15            Pass: 100%/2   | Total:  1h 59m | Avg: 59m 58s | Max:  1h 00m
      🟩 Clang16            Pass: 100%/2   | Total:  1h 48m | Avg: 54m 19s | Max: 55m 47s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 54m | Avg: 57m 16s | Max: 58m 45s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 37m | Avg: 48m 14s | Max:  1h 01m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 57m | Avg: 58m 42s | Max:  1h 00m
      🟩 GCC8               Pass: 100%/1   | Total: 55m 45s | Avg: 55m 45s | Max: 55m 45s
      🟩 GCC9               Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 13m
      🟩 GCC10              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m
      🟩 GCC11              Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 07m
      🟩 GCC12              Pass: 100%/2   | Total:  1h 59m | Avg: 59m 41s | Max:  1h 04m
      🟩 GCC13              Pass: 100%/10  | Total:  6h 16m | Avg: 37m 41s | Max:  1h 06m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 12m | Hits: 356%/1750  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 21m | Avg:  1h 10m | Max:  1h 12m | Hits: 355%/1750  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 17m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 15h 14m | Avg: 53m 47s | Max:  1h 04m
      🟩 GCC                Pass: 100%/21  | Total: 17h 28m | Avg: 49m 54s | Max:  1h 13m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 36m | Avg:  1h 09m | Max:  1h 12m | Hits: 355%/3500  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 17m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 50m 03s | Avg: 25m 01s | Max: 25m 37s
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 10h | Avg:  1h 01m | Max:  1h 17m | Hits: 355%/3500  
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 09m | Avg: 31m 10s | Max:  1h 02m
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 13h | Avg:  1h 00m | Max:  1h 17m | Hits: 355%/3500  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 45s | Avg: 20m 45s | Max: 20m 45s
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 23s | Avg: 17m 23s | Max: 17m 23s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 15m | Avg: 25m 01s | Max: 26m 01s
      🟩 TestGPU            Pass: 100%/2   | Total: 41m 16s | Avg: 20m 38s | Max: 21m 47s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 50m 03s | Avg: 25m 01s | Max: 25m 37s
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 20h 31m | Avg:  1h 01m | Max:  1h 17m | Hits: 356%/2625  
      🟩 20                 Pass: 100%/24  | Total: 19h 12m | Avg: 48m 00s | Max:  1h 12m | Hits: 354%/875   
    
  • 🟩 thrust: Pass: 100%/43 | Total: 23h 40m | Avg: 33m 01s | Max: 1h 00m | Hits: 266%/9230

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 37m 07s | Avg: 18m 33s | Max: 26m 00s
    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total: 22h 42m | Avg: 33m 14s | Max:  1h 00m | Hits: 266%/9230  
      🟩 arm64              Pass: 100%/2   | Total: 57m 23s | Avg: 28m 41s | Max: 30m 28s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 09m | Avg: 37m 52s | Max: 58m 12s | Hits: 241%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  1h 43m | Avg: 51m 52s | Max: 52m 42s
      🟩 12.8               Pass: 100%/36  | Total: 18h 47m | Avg: 31m 18s | Max:  1h 00m | Hits: 272%/7384  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 59m 50s | Avg: 29m 55s | Max: 29m 57s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 09m | Avg: 37m 52s | Max: 58m 12s | Hits: 241%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 43m | Avg: 51m 52s | Max: 52m 42s
      🟩 nvcc12.8           Pass: 100%/34  | Total: 17h 47m | Avg: 31m 23s | Max:  1h 00m | Hits: 272%/7384  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 59m 50s | Avg: 29m 55s | Max: 29m 57s
      🟩 nvcc               Pass: 100%/41  | Total: 22h 40m | Avg: 33m 11s | Max:  1h 00m | Hits: 266%/9230  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 07m | Avg: 31m 50s | Max: 33m 11s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 06m | Avg: 33m 04s | Max: 33m 31s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 05m | Avg: 32m 31s | Max: 34m 13s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 09m | Avg: 34m 38s | Max: 35m 36s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 48m | Avg: 24m 00s | Max: 33m 47s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 02m | Avg: 31m 05s | Max: 32m 19s
      🟩 GCC8               Pass: 100%/1   | Total: 31m 33s | Avg: 31m 33s | Max: 31m 33s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 09m | Avg: 34m 34s | Max: 36m 01s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 05m | Avg: 32m 40s | Max: 33m 15s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 06m | Avg: 33m 15s | Max: 34m 11s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 05m | Avg: 32m 59s | Max: 34m 16s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 10m | Avg: 23m 45s | Max: 34m 34s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 53m | Avg: 56m 36s | Max: 58m 12s | Hits: 241%/3692  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 36m | Avg: 52m 14s | Max:  1h 00m | Hits: 282%/5538  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 43m | Avg: 51m 52s | Max: 52m 42s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  8h 15m | Avg: 29m 10s | Max: 35m 36s
      🟩 GCC                Pass: 100%/19  | Total:  9h 10m | Avg: 28m 59s | Max: 36m 01s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 29m | Avg: 53m 59s | Max:  1h 00m | Hits: 266%/9230  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 43m | Avg: 51m 52s | Max: 52m 42s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/33  | Total: 19h 40m | Avg: 35m 46s | Max:  1h 00m | Hits: 241%/5538  
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 59m | Avg: 23m 57s | Max:  1h 00m | Hits: 303%/3692  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 22h 15m | Avg: 36m 05s | Max:  1h 00m | Hits: 241%/7384  
      🟩 TestCPU            Pass: 100%/3   | Total: 51m 41s | Avg: 17m 13s | Max: 35m 28s | Hits: 365%/1846  
      🟩 TestGPU            Pass: 100%/3   | Total: 33m 02s | Avg: 11m 00s | Max: 12m 07s
    🟩 sm
      🟩 90;90a;100         Pass: 100%/1   | Total: 33m 17s | Avg: 33m 17s | Max: 33m 17s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 12h 23m | Avg: 37m 09s | Max:  1h 00m | Hits: 241%/5538  
      🟩 20                 Pass: 100%/21  | Total: 10h 40m | Avg: 30m 29s | Max:  1h 00m | Hits: 303%/3692  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 8m 39s | Avg: 4m 19s | Max: 6m 14s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  8m 39s | Avg:  4m 19s | Max:  6m 14s
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total:  8m 39s | Avg:  4m 19s | Max:  6m 14s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total:  8m 39s | Avg:  4m 19s | Max:  6m 14s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  8m 39s | Avg:  4m 19s | Max:  6m 14s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  8m 39s | Avg:  4m 19s | Max:  6m 14s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  8m 39s | Avg:  4m 19s | Max:  6m 14s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  8m 39s | Avg:  4m 19s | Max:  6m 14s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 25s | Avg:  2m 25s | Max:  2m 25s
      🟩 Test               Pass: 100%/1   | Total:  6m 14s | Avg:  6m 14s | Max:  6m 14s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 25s | Avg: 26m 25s | Max: 26m 25s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 25s | Avg: 26m 25s | Max: 26m 25s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 26m 25s | Avg: 26m 25s | Max: 26m 25s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 26m 25s | Avg: 26m 25s | Max: 26m 25s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 25s | Avg: 26m 25s | Max: 26m 25s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 25s | Avg: 26m 25s | Max: 26m 25s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 25s | Avg: 26m 25s | Max: 26m 25s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 26m 25s | Avg: 26m 25s | Max: 26m 25s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 25s | Avg: 26m 25s | Max: 26m 25s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 90)

# Runner
65 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-gpu-h100-latest-1

@fbusato fbusato merged commit 32d05c7 into NVIDIA:main Feb 5, 2025
101 of 104 checks passed

github-actions bot commented Feb 5, 2025

Successfully created backport PR for branch/2.8.x:

github-actions bot pushed a commit that referenced this pull request Feb 5, 2025
miscco pushed a commit that referenced this pull request Feb 5, 2025
(cherry picked from commit 32d05c7)

Co-authored-by: Federico Busato
@fbusato fbusato deleted the internalize-cub-kernel-config branch February 11, 2025 18:23