Tune cub::DeviceTransform for Blackwell #3543

Merged: 1 commit, Jan 28, 2025

Conversation

bernhardmgruber
Contributor

No description provided.

@bernhardmgruber bernhardmgruber requested a review from a team as a code owner January 27, 2025 20:09
@bernhardmgruber bernhardmgruber changed the title Tune cub::DeviceTransform for B200 Tune cub::DeviceTransform for Blackwell Jan 27, 2025

🟨 CI finished in 7h 29m: Pass: 98%/90 | Total: 2d 17h | Avg: 43m 27s | Max: 1h 31m | Hits: 232%/10928
  • 🟨 thrust: Pass: 97%/43 | Total: 1d 00h | Avg: 34m 42s | Max: 1h 07m | Hits: 167%/7376

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/41  | Total: 23h 50m | Avg: 34m 53s | Max:  1h 07m | Hits: 167%/7376  
      🟩 arm64              Pass: 100%/2   | Total:  1h 02m | Avg: 31m 06s | Max: 33m 45s
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  3h 17m | Avg: 39m 31s | Max:  1h 00m | Hits: 162%/1844  
      🟩 12.5               Pass: 100%/2   | Total:  1h 57m | Avg: 58m 39s | Max: 59m 38s
      🔍 12.6               Pass:  97%/36  | Total: 19h 37m | Avg: 32m 43s | Max:  1h 07m | Hits: 169%/5532  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 53m 45s | Avg: 26m 52s | Max: 28m 17s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 17m | Avg: 39m 31s | Max:  1h 00m | Hits: 162%/1844  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 57m | Avg: 58m 39s | Max: 59m 38s
      🔍 nvcc12.6           Pass:  97%/34  | Total: 18h 44m | Avg: 33m 03s | Max:  1h 07m | Hits: 169%/5532  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 53m 45s | Avg: 26m 52s | Max: 28m 17s
      🔍 nvcc               Pass:  97%/41  | Total: 23h 59m | Avg: 35m 05s | Max:  1h 07m | Hits: 167%/7376  
    🔍 cxx: MSVC14.39 🔍
      🟩 Clang14            Pass: 100%/4   | Total:  2h 15m | Avg: 33m 48s | Max: 35m 18s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 05m | Avg: 32m 33s | Max: 35m 13s
      🟩 Clang16            Pass: 100%/2   | Total: 59m 10s | Avg: 29m 35s | Max: 29m 52s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 05m | Avg: 32m 36s | Max: 33m 00s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 47m | Avg: 23m 52s | Max: 34m 57s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 12m | Avg: 36m 08s | Max: 37m 39s
      🟩 GCC8               Pass: 100%/1   | Total: 31m 43s | Avg: 31m 43s | Max: 31m 43s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 07m | Avg: 33m 37s | Max: 35m 49s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 12m | Avg: 36m 28s | Max: 37m 03s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 12m | Avg: 36m 17s | Max: 36m 56s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 10m | Avg: 35m 24s | Max: 35m 53s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 31m | Avg: 26m 25s | Max: 42m 56s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 02m | Hits: 170%/3688  
      🔍 MSVC14.39          Pass:  66%/3   | Total:  2h 41m | Avg: 53m 59s | Max:  1h 07m | Hits: 164%/3688  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 57m | Avg: 58m 39s | Max: 59m 38s
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  8h 11m | Avg: 28m 55s | Max: 35m 18s
      🟩 GCC                Pass: 100%/19  | Total:  9h 58m | Avg: 31m 31s | Max: 42m 56s
      🔍 MSVC               Pass:  80%/5   | Total:  4h 44m | Avg: 56m 57s | Max:  1h 07m | Hits: 167%/7376  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 57m | Avg: 58m 39s | Max: 59m 38s
    🔍 jobs: TestCPU 🔍
      🟩 Build              Pass: 100%/37  | Total: 23h 07m | Avg: 37m 29s | Max:  1h 07m | Hits: 167%/7376  
      🔍 TestCPU            Pass:  66%/3   | Total: 49m 44s | Avg: 16m 34s | Max: 33m 50s
      🟩 TestGPU            Pass: 100%/3   | Total: 55m 50s | Avg: 18m 36s | Max: 31m 26s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 12h 56m | Avg: 38m 49s | Max:  1h 02m | Hits: 169%/5532  
      🔍 20                 Pass:  95%/21  | Total: 10h 55m | Avg: 31m 12s | Max:  1h 07m | Hits: 162%/1844  
    🟨 gpu
      🟨 v100               Pass:  97%/43  | Total:  1d 00h | Avg: 34m 42s | Max:  1h 07m | Hits: 167%/7376  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total:  1h 00m | Avg: 30m 29s | Max: 31m 26s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 19m 40s | Avg: 19m 40s | Max: 19m 40s
    
  • 🟩 cub: Pass: 100%/44 | Total: 1d 15h | Avg: 53m 40s | Max: 1h 31m | Hits: 368%/3552

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 13h | Avg: 53m 28s | Max:  1h 31m | Hits: 368%/3552  
      🟩 arm64              Pass: 100%/2   | Total:  1h 55m | Avg: 57m 44s | Max:  1h 00m
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 48m | Avg: 57m 45s | Max: 59m 47s | Hits: 368%/888   
      🟩 12.5               Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 10m
      🟩 12.6               Pass: 100%/37  | Total:  1d 08h | Avg: 52m 13s | Max:  1h 31m | Hits: 368%/2664  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 54m | Avg: 57m 23s | Max: 59m 03s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 48m | Avg: 57m 45s | Max: 59m 47s | Hits: 368%/888   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 10m
      🟩 nvcc12.6           Pass: 100%/35  | Total:  1d 06h | Avg: 51m 55s | Max:  1h 31m | Hits: 368%/2664  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 54m | Avg: 57m 23s | Max: 59m 03s
      🟩 nvcc               Pass: 100%/42  | Total:  1d 13h | Avg: 53m 29s | Max:  1h 31m | Hits: 368%/3552  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 51m | Avg: 57m 58s | Max:  1h 01m
      🟩 Clang15            Pass: 100%/2   | Total:  1h 54m | Avg: 57m 01s | Max: 59m 56s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 50m | Avg: 55m 11s | Max: 55m 24s
      🟩 Clang17            Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 01m
      🟩 Clang18            Pass: 100%/7   | Total:  5h 30m | Avg: 47m 16s | Max: 59m 24s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 54m | Avg: 57m 29s | Max: 58m 58s
      🟩 GCC8               Pass: 100%/1   | Total: 58m 32s | Avg: 58m 32s | Max: 58m 32s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 48m | Avg: 54m 13s | Max: 54m 44s
      🟩 GCC10              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 08m
      🟩 GCC11              Pass: 100%/2   | Total:  1h 54m | Avg: 57m 03s | Max: 59m 29s
      🟩 GCC12              Pass: 100%/4   | Total:  2h 42m | Avg: 40m 43s | Max: 59m 36s
      🟩 GCC13              Pass: 100%/8   | Total:  5h 59m | Avg: 44m 55s | Max:  1h 31m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 10m | Hits: 369%/1776  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 22m | Avg:  1h 11m | Max:  1h 15m | Hits: 368%/1776  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 10m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 15h 09m | Avg: 53m 31s | Max:  1h 01m
      🟩 GCC                Pass: 100%/21  | Total: 17h 20m | Avg: 49m 31s | Max:  1h 31m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 31m | Avg:  1h 07m | Max:  1h 15m | Hits: 368%/3552  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 10m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 45m 39s | Avg: 22m 49s | Max: 26m 17s
      🟩 v100               Pass: 100%/42  | Total:  1d 14h | Avg: 55m 08s | Max:  1h 31m | Hits: 368%/3552  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 11h | Avg: 57m 48s | Max:  1h 15m | Hits: 368%/3552  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 47s | Avg: 20m 47s | Max: 20m 47s
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 30s | Avg: 17m 30s | Max: 17m 30s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 05m | Avg: 21m 57s | Max: 23m 49s
      🟩 TestGPU            Pass: 100%/2   | Total:  1h 58m | Avg: 59m 09s | Max:  1h 31m
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 45m 39s | Avg: 22m 49s | Max: 26m 17s
      🟩 90a                Pass: 100%/1   | Total: 23m 36s | Avg: 23m 36s | Max: 23m 36s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 19h 42m | Avg: 59m 06s | Max:  1h 10m | Hits: 369%/2664  
      🟩 20                 Pass: 100%/24  | Total: 19h 39m | Avg: 49m 08s | Max:  1h 31m | Hits: 367%/888   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 11m 05s | Avg: 5m 32s | Max: 8m 59s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 06s | Avg:  2m 06s | Max:  2m 06s
      🟩 Test               Pass: 100%/1   | Total:  8m 59s | Avg:  8m 59s | Max:  8m 59s
    
  • 🟩 python: Pass: 100%/1 | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 90)

# Runner
65 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing


🟨 CI finished in 10h 13m: Pass: 98%/90 | Total: 2d 17h | Avg: 43m 26s | Max: 1h 31m | Hits: 232%/10928
  • 🟨 thrust: Pass: 97%/43 | Total: 1d 00h | Avg: 34m 41s | Max: 1h 07m | Hits: 167%/7376

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/41  | Total: 23h 49m | Avg: 34m 51s | Max:  1h 07m | Hits: 167%/7376  
      🟩 arm64              Pass: 100%/2   | Total:  1h 02m | Avg: 31m 06s | Max: 33m 45s
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  3h 17m | Avg: 39m 31s | Max:  1h 00m | Hits: 162%/1844  
      🟩 12.5               Pass: 100%/2   | Total:  1h 57m | Avg: 58m 39s | Max: 59m 38s
      🔍 12.6               Pass:  97%/36  | Total: 19h 36m | Avg: 32m 41s | Max:  1h 07m | Hits: 169%/5532  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 53m 45s | Avg: 26m 52s | Max: 28m 17s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 17m | Avg: 39m 31s | Max:  1h 00m | Hits: 162%/1844  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 57m | Avg: 58m 39s | Max: 59m 38s
      🔍 nvcc12.6           Pass:  97%/34  | Total: 18h 42m | Avg: 33m 01s | Max:  1h 07m | Hits: 169%/5532  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 53m 45s | Avg: 26m 52s | Max: 28m 17s
      🔍 nvcc               Pass:  97%/41  | Total: 23h 57m | Avg: 35m 04s | Max:  1h 07m | Hits: 167%/7376  
    🔍 cxx: MSVC14.39 🔍
      🟩 Clang14            Pass: 100%/4   | Total:  2h 15m | Avg: 33m 48s | Max: 35m 18s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 05m | Avg: 32m 33s | Max: 35m 13s
      🟩 Clang16            Pass: 100%/2   | Total: 59m 10s | Avg: 29m 35s | Max: 29m 52s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 05m | Avg: 32m 36s | Max: 33m 00s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 47m | Avg: 23m 52s | Max: 34m 57s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 12m | Avg: 36m 08s | Max: 37m 39s
      🟩 GCC8               Pass: 100%/1   | Total: 31m 43s | Avg: 31m 43s | Max: 31m 43s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 07m | Avg: 33m 37s | Max: 35m 49s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 12m | Avg: 36m 28s | Max: 37m 03s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 12m | Avg: 36m 17s | Max: 36m 56s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 10m | Avg: 35m 24s | Max: 35m 53s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 31m | Avg: 26m 25s | Max: 42m 56s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 02m | Hits: 170%/3688  
      🔍 MSVC14.39          Pass:  66%/3   | Total:  2h 40m | Avg: 53m 34s | Max:  1h 07m | Hits: 164%/3688  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 57m | Avg: 58m 39s | Max: 59m 38s
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  8h 11m | Avg: 28m 55s | Max: 35m 18s
      🟩 GCC                Pass: 100%/19  | Total:  9h 58m | Avg: 31m 31s | Max: 42m 56s
      🔍 MSVC               Pass:  80%/5   | Total:  4h 43m | Avg: 56m 42s | Max:  1h 07m | Hits: 167%/7376  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 57m | Avg: 58m 39s | Max: 59m 38s
    🔍 jobs: TestCPU 🔍
      🟩 Build              Pass: 100%/37  | Total: 23h 07m | Avg: 37m 29s | Max:  1h 07m | Hits: 167%/7376  
      🔍 TestCPU            Pass:  66%/3   | Total: 48m 29s | Avg: 16m 09s | Max: 32m 35s
      🟩 TestGPU            Pass: 100%/3   | Total: 55m 50s | Avg: 18m 36s | Max: 31m 26s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 12h 56m | Avg: 38m 49s | Max:  1h 02m | Hits: 169%/5532  
      🔍 20                 Pass:  95%/21  | Total: 10h 54m | Avg: 31m 09s | Max:  1h 07m | Hits: 162%/1844  
    🟨 gpu
      🟨 v100               Pass:  97%/43  | Total:  1d 00h | Avg: 34m 41s | Max:  1h 07m | Hits: 167%/7376  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total:  1h 00m | Avg: 30m 29s | Max: 31m 26s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 19m 40s | Avg: 19m 40s | Max: 19m 40s
    
  • 🟩 cub: Pass: 100%/44 | Total: 1d 15h | Avg: 53m 40s | Max: 1h 31m | Hits: 368%/3552

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 13h | Avg: 53m 28s | Max:  1h 31m | Hits: 368%/3552  
      🟩 arm64              Pass: 100%/2   | Total:  1h 55m | Avg: 57m 44s | Max:  1h 00m
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 48m | Avg: 57m 45s | Max: 59m 47s | Hits: 368%/888   
      🟩 12.5               Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 10m
      🟩 12.6               Pass: 100%/37  | Total:  1d 08h | Avg: 52m 13s | Max:  1h 31m | Hits: 368%/2664  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 54m | Avg: 57m 23s | Max: 59m 03s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 48m | Avg: 57m 45s | Max: 59m 47s | Hits: 368%/888   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 10m
      🟩 nvcc12.6           Pass: 100%/35  | Total:  1d 06h | Avg: 51m 55s | Max:  1h 31m | Hits: 368%/2664  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 54m | Avg: 57m 23s | Max: 59m 03s
      🟩 nvcc               Pass: 100%/42  | Total:  1d 13h | Avg: 53m 29s | Max:  1h 31m | Hits: 368%/3552  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 51m | Avg: 57m 58s | Max:  1h 01m
      🟩 Clang15            Pass: 100%/2   | Total:  1h 54m | Avg: 57m 01s | Max: 59m 56s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 50m | Avg: 55m 11s | Max: 55m 24s
      🟩 Clang17            Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 01m
      🟩 Clang18            Pass: 100%/7   | Total:  5h 30m | Avg: 47m 16s | Max: 59m 24s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 54m | Avg: 57m 29s | Max: 58m 58s
      🟩 GCC8               Pass: 100%/1   | Total: 58m 32s | Avg: 58m 32s | Max: 58m 32s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 48m | Avg: 54m 13s | Max: 54m 44s
      🟩 GCC10              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 08m
      🟩 GCC11              Pass: 100%/2   | Total:  1h 54m | Avg: 57m 03s | Max: 59m 29s
      🟩 GCC12              Pass: 100%/4   | Total:  2h 42m | Avg: 40m 43s | Max: 59m 36s
      🟩 GCC13              Pass: 100%/8   | Total:  5h 59m | Avg: 44m 55s | Max:  1h 31m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 10m | Hits: 369%/1776  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 22m | Avg:  1h 11m | Max:  1h 15m | Hits: 368%/1776  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 10m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 15h 09m | Avg: 53m 31s | Max:  1h 01m
      🟩 GCC                Pass: 100%/21  | Total: 17h 20m | Avg: 49m 31s | Max:  1h 31m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 31m | Avg:  1h 07m | Max:  1h 15m | Hits: 368%/3552  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 10m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 45m 39s | Avg: 22m 49s | Max: 26m 17s
      🟩 v100               Pass: 100%/42  | Total:  1d 14h | Avg: 55m 08s | Max:  1h 31m | Hits: 368%/3552  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 11h | Avg: 57m 48s | Max:  1h 15m | Hits: 368%/3552  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 47s | Avg: 20m 47s | Max: 20m 47s
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 30s | Avg: 17m 30s | Max: 17m 30s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 05m | Avg: 21m 57s | Max: 23m 49s
      🟩 TestGPU            Pass: 100%/2   | Total:  1h 58m | Avg: 59m 09s | Max:  1h 31m
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 45m 39s | Avg: 22m 49s | Max: 26m 17s
      🟩 90a                Pass: 100%/1   | Total: 23m 36s | Avg: 23m 36s | Max: 23m 36s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 19h 42m | Avg: 59m 06s | Max:  1h 10m | Hits: 369%/2664  
      🟩 20                 Pass: 100%/24  | Total: 19h 39m | Avg: 49m 08s | Max:  1h 31m | Hits: 367%/888   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 11m 05s | Avg: 5m 32s | Max: 8m 59s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 11m 05s | Avg:  5m 32s | Max:  8m 59s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 06s | Avg:  2m 06s | Max:  2m 06s
      🟩 Test               Pass: 100%/1   | Total:  8m 59s | Avg:  8m 59s | Max:  8m 59s
    
  • 🟩 python: Pass: 100%/1 | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 45m 54s | Avg: 45m 54s | Max: 45m 54s
    



🟩 CI finished in 2h 16m: Pass: 100%/89 | Total: 16h 01m | Avg: 10m 48s | Max: 1h 01m | Hits: 422%/10928
  • 🟩 cub: Pass: 100%/44 | Total: 8h 15m | Avg: 11m 16s | Max: 32m 06s | Hits: 540%/3552

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  8h 05m | Avg: 11m 34s | Max: 32m 06s | Hits: 540%/3552  
      🟩 arm64              Pass: 100%/2   | Total:  9m 53s | Avg:  4m 56s | Max:  5m 08s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 46m 54s | Avg:  9m 22s | Max: 25m 28s | Hits: 540%/888   
      🟩 12.5               Pass: 100%/2   | Total: 22m 03s | Avg: 11m 01s | Max: 11m 30s
      🟩 12.6               Pass: 100%/37  | Total:  7h 06m | Avg: 11m 32s | Max: 32m 06s | Hits: 540%/2664  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 12s | Avg:  4m 36s | Max:  4m 38s
      🟩 nvcc12.0           Pass: 100%/5   | Total: 46m 54s | Avg:  9m 22s | Max: 25m 28s | Hits: 540%/888   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 22m 03s | Avg: 11m 01s | Max: 11m 30s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  6h 57m | Avg: 11m 55s | Max: 32m 06s | Hits: 540%/2664  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 12s | Avg:  4m 36s | Max:  4m 38s
      🟩 nvcc               Pass: 100%/42  | Total:  8h 06m | Avg: 11m 35s | Max: 32m 06s | Hits: 540%/3552  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 21m 55s | Avg:  5m 28s | Max:  5m 54s
      🟩 Clang15            Pass: 100%/2   | Total: 11m 52s | Avg:  5m 56s | Max:  6m 14s
      🟩 Clang16            Pass: 100%/2   | Total: 11m 46s | Avg:  5m 53s | Max:  6m 14s
      🟩 Clang17            Pass: 100%/2   | Total: 11m 17s | Avg:  5m 38s | Max:  5m 49s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 24m | Avg: 12m 01s | Max: 32m 00s
      🟩 GCC7               Pass: 100%/2   | Total: 11m 37s | Avg:  5m 48s | Max:  6m 03s
      🟩 GCC8               Pass: 100%/1   | Total:  6m 09s | Avg:  6m 09s | Max:  6m 09s
      🟩 GCC9               Pass: 100%/2   | Total: 11m 21s | Avg:  5m 40s | Max:  5m 48s
      🟩 GCC10              Pass: 100%/2   | Total: 11m 49s | Avg:  5m 54s | Max:  6m 07s
      🟩 GCC11              Pass: 100%/2   | Total: 11m 52s | Avg:  5m 56s | Max:  5m 57s
      🟩 GCC12              Pass: 100%/4   | Total: 37m 34s | Avg:  9m 23s | Max: 19m 30s
      🟩 GCC13              Pass: 100%/8   | Total:  2h 06m | Avg: 15m 49s | Max: 31m 21s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 52m 44s | Avg: 26m 22s | Max: 27m 16s | Hits: 540%/1776  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 02m | Avg: 31m 28s | Max: 32m 06s | Hits: 540%/1776  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 22m 03s | Avg: 11m 01s | Max: 11m 30s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  2h 21m | Avg:  8m 17s | Max: 32m 00s
      🟩 GCC                Pass: 100%/21  | Total:  3h 37m | Avg: 10m 20s | Max: 31m 21s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 55m | Avg: 28m 55s | Max: 32m 06s | Hits: 540%/3552  
      🟩 NVHPC              Pass: 100%/2   | Total: 22m 03s | Avg: 11m 01s | Max: 11m 30s
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 24m 22s | Avg: 12m 11s | Max: 19m 30s
      🟩 v100               Pass: 100%/42  | Total:  7h 51m | Avg: 11m 13s | Max: 32m 06s | Hits: 540%/3552  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 12m | Avg:  8m 27s | Max: 32m 06s | Hits: 540%/3552  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 31m 21s | Avg: 31m 21s | Max: 31m 21s
      🟩 GraphCapture       Pass: 100%/1   | Total: 25m 43s | Avg: 25m 43s | Max: 25m 43s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 04m | Avg: 21m 37s | Max: 26m 51s
      🟩 TestGPU            Pass: 100%/2   | Total:  1h 01m | Avg: 30m 34s | Max: 32m 00s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 24m 22s | Avg: 12m 11s | Max: 19m 30s
      🟩 90a                Pass: 100%/1   | Total:  4m 58s | Avg:  4m 58s | Max:  4m 58s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 06m | Avg:  9m 20s | Max: 30m 51s | Hits: 540%/2664  
      🟩 20                 Pass: 100%/24  | Total:  5h 09m | Avg: 12m 52s | Max: 32m 06s | Hits: 540%/888   
    
  • 🟩 thrust: Pass: 100%/42 | Total: 6h 33m | Avg: 9m 22s | Max: 32m 25s | Hits: 365%/7376

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 22m 39s | Avg: 11m 19s | Max: 15m 40s
    🟩 cpu
      🟩 amd64              Pass: 100%/40  | Total:  6h 24m | Avg:  9m 36s | Max: 32m 25s | Hits: 365%/7376  
      🟩 arm64              Pass: 100%/2   | Total:  9m 48s | Avg:  4m 54s | Max:  5m 00s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 46m 38s | Avg:  9m 19s | Max: 25m 19s | Hits: 365%/1844  
      🟩 12.5               Pass: 100%/2   | Total: 33m 36s | Avg: 16m 48s | Max: 16m 56s
      🟩 12.6               Pass: 100%/35  | Total:  5h 13m | Avg:  8m 57s | Max: 32m 25s | Hits: 365%/5532  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 13s | Avg:  5m 06s | Max:  5m 10s
      🟩 nvcc12.0           Pass: 100%/5   | Total: 46m 38s | Avg:  9m 19s | Max: 25m 19s | Hits: 365%/1844  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 33m 36s | Avg: 16m 48s | Max: 16m 56s
      🟩 nvcc12.6           Pass: 100%/33  | Total:  5h 03m | Avg:  9m 11s | Max: 32m 25s | Hits: 365%/5532  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 13s | Avg:  5m 06s | Max:  5m 10s
      🟩 nvcc               Pass: 100%/40  | Total:  6h 23m | Avg:  9m 35s | Max: 32m 25s | Hits: 365%/7376  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 22m 49s | Avg:  5m 42s | Max:  6m 29s
      🟩 Clang15            Pass: 100%/2   | Total: 11m 37s | Avg:  5m 48s | Max:  6m 02s
      🟩 Clang16            Pass: 100%/2   | Total: 10m 59s | Avg:  5m 29s | Max:  5m 39s
      🟩 Clang17            Pass: 100%/2   | Total: 12m 15s | Avg:  6m 07s | Max:  6m 16s
      🟩 Clang18            Pass: 100%/7   | Total: 51m 41s | Avg:  7m 23s | Max: 17m 21s
      🟩 GCC7               Pass: 100%/2   | Total: 10m 59s | Avg:  5m 29s | Max:  5m 54s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 39s | Avg:  5m 39s | Max:  5m 39s
      🟩 GCC9               Pass: 100%/2   | Total: 11m 44s | Avg:  5m 52s | Max:  6m 04s
      🟩 GCC10              Pass: 100%/2   | Total: 11m 34s | Avg:  5m 47s | Max:  6m 01s
      🟩 GCC11              Pass: 100%/2   | Total: 11m 31s | Avg:  5m 45s | Max:  6m 07s
      🟩 GCC12              Pass: 100%/2   | Total: 12m 45s | Avg:  6m 22s | Max:  6m 27s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 11m | Avg:  8m 53s | Max: 17m 53s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 54m 08s | Avg: 27m 04s | Max: 28m 49s | Hits: 365%/3688  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 01m | Avg: 30m 48s | Max: 32m 25s | Hits: 365%/3688  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 33m 36s | Avg: 16m 48s | Max: 16m 56s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 49m | Avg:  6m 25s | Max: 17m 21s
      🟩 GCC                Pass: 100%/19  | Total:  2h 15m | Avg:  7m 07s | Max: 17m 53s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 55m | Avg: 28m 56s | Max: 32m 25s | Hits: 365%/7376  
      🟩 NVHPC              Pass: 100%/2   | Total: 33m 36s | Avg: 16m 48s | Max: 16m 56s
    🟩 gpu
      🟩 v100               Pass: 100%/42  | Total:  6h 33m | Avg:  9m 22s | Max: 32m 25s | Hits: 365%/7376  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 27m | Avg:  8m 51s | Max: 32m 25s | Hits: 365%/7376  
      🟩 TestCPU            Pass: 100%/2   | Total: 15m 30s | Avg:  7m 45s | Max:  7m 49s
      🟩 TestGPU            Pass: 100%/3   | Total: 50m 54s | Avg: 16m 58s | Max: 17m 53s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  4m 33s | Avg:  4m 33s | Max:  4m 33s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 13m | Avg:  9m 40s | Max: 29m 11s | Hits: 365%/5532  
      🟩 20                 Pass: 100%/20  | Total:  2h 57m | Avg:  8m 53s | Max: 32m 25s | Hits: 365%/1844  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 45s | Avg: 4m 52s | Max: 7m 28s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 45s | Avg:  4m 52s | Max:  7m 28s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 45s | Avg:  4m 52s | Max:  7m 28s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 45s | Avg:  4m 52s | Max:  7m 28s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 45s | Avg:  4m 52s | Max:  7m 28s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 45s | Avg:  4m 52s | Max:  7m 28s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 45s | Avg:  4m 52s | Max:  7m 28s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 45s | Avg:  4m 52s | Max:  7m 28s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 17s | Avg:  2m 17s | Max:  2m 17s
      🟩 Test               Pass: 100%/1   | Total:  7m 28s | Avg:  7m 28s | Max:  7m 28s
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 01m | Avg: 1h 01m | Max: 1h 01m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 89)

# Runner
65 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
8 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@bernhardmgruber bernhardmgruber enabled auto-merge (squash) January 28, 2025 16:08
@bernhardmgruber bernhardmgruber merged commit eaf3051 into NVIDIA:main Jan 28, 2025
101 of 104 checks passed

Git push to origin failed for branch/2.8.x with exitcode 128

@bernhardmgruber bernhardmgruber deleted the tune_transform branch January 28, 2025 16:40
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this pull request Jan 28, 2025
davebayer pushed a commit to davebayer/cccl that referenced this pull request Jan 29, 2025
bernhardmgruber added a commit that referenced this pull request Jan 31, 2025
Projects
Status: Done

3 participants