Add b200 policies for cub.device.run_length_encode.encode,non_trivialruns #3546

Merged (3 commits) on Feb 5, 2025

Conversation

bernhardmgruber
Contributor

No description provided.

Contributor

🟨 CI finished in 5h 18m: Pass: 98%/90 | Total: 16h 37m | Avg: 11m 05s | Max: 48m 35s | Hits: 418%/10928
  • 🟨 thrust: Pass: 97%/43 | Total: 6h 46m | Avg: 9m 27s | Max: 33m 35s | Hits: 365%/7376

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/41  | Total:  6h 37m | Avg:  9m 41s | Max: 33m 35s | Hits: 365%/7376  
      🟩 arm64              Pass: 100%/2   | Total:  9m 50s | Avg:  4m 55s | Max:  5m 18s
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total: 44m 58s | Avg:  8m 59s | Max: 24m 39s | Hits: 365%/1844  
      🟩 12.5               Pass: 100%/2   | Total: 30m 29s | Avg: 15m 14s | Max: 15m 55s
      🔍 12.6               Pass:  97%/36  | Total:  5h 31m | Avg:  9m 12s | Max: 33m 35s | Hits: 365%/5532  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 58s | Avg:  4m 59s | Max:  5m 02s
      🟩 nvcc12.0           Pass: 100%/5   | Total: 44m 58s | Avg:  8m 59s | Max: 24m 39s | Hits: 365%/1844  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 30m 29s | Avg: 15m 14s | Max: 15m 55s
      🔍 nvcc12.6           Pass:  97%/34  | Total:  5h 21m | Avg:  9m 27s | Max: 33m 35s | Hits: 365%/5532  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 58s | Avg:  4m 59s | Max:  5m 02s
      🔍 nvcc               Pass:  97%/41  | Total:  6h 36m | Avg:  9m 40s | Max: 33m 35s | Hits: 365%/7376  
    🔍 cxx: MSVC14.39 🔍
      🟩 Clang14            Pass: 100%/4   | Total: 21m 36s | Avg:  5m 24s | Max:  5m 55s
      🟩 Clang15            Pass: 100%/2   | Total: 11m 03s | Avg:  5m 31s | Max:  5m 42s
      🟩 Clang16            Pass: 100%/2   | Total: 11m 53s | Avg:  5m 56s | Max:  5m 58s
      🟩 Clang17            Pass: 100%/2   | Total: 11m 24s | Avg:  5m 42s | Max:  5m 58s
      🟩 Clang18            Pass: 100%/7   | Total: 44m 28s | Avg:  6m 21s | Max: 11m 36s
      🟩 GCC7               Pass: 100%/2   | Total: 10m 21s | Avg:  5m 10s | Max:  5m 17s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 43s | Avg:  5m 43s | Max:  5m 43s
      🟩 GCC9               Pass: 100%/2   | Total: 10m 38s | Avg:  5m 19s | Max:  5m 21s
      🟩 GCC10              Pass: 100%/2   | Total: 11m 32s | Avg:  5m 46s | Max:  5m 57s
      🟩 GCC11              Pass: 100%/2   | Total: 12m 21s | Avg:  6m 10s | Max:  6m 12s
      🟩 GCC12              Pass: 100%/2   | Total: 12m 41s | Avg:  6m 20s | Max:  6m 21s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 09m | Avg:  8m 39s | Max: 16m 58s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 53m 53s | Avg: 26m 56s | Max: 29m 14s | Hits: 365%/3688  
      🔍 MSVC14.39          Pass:  66%/3   | Total:  1h 29m | Avg: 29m 54s | Max: 33m 35s | Hits: 365%/3688  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 30m 29s | Avg: 15m 14s | Max: 15m 55s
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  1h 40m | Avg:  5m 54s | Max: 11m 36s
      🟩 GCC                Pass: 100%/19  | Total:  2h 12m | Avg:  6m 58s | Max: 16m 58s
      🔍 MSVC               Pass:  80%/5   | Total:  2h 23m | Avg: 28m 43s | Max: 33m 35s | Hits: 365%/7376  
      🟩 NVHPC              Pass: 100%/2   | Total: 30m 29s | Avg: 15m 14s | Max: 15m 55s
    🔍 jobs: TestCPU 🔍
      🟩 Build              Pass: 100%/37  | Total:  5h 13m | Avg:  8m 28s | Max: 29m 14s | Hits: 365%/7376  
      🔍 TestCPU            Pass:  66%/3   | Total: 48m 36s | Avg: 16m 12s | Max: 33m 35s
      🟩 TestGPU            Pass: 100%/3   | Total: 44m 43s | Avg: 14m 54s | Max: 16m 58s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total:  3h 06m | Avg:  9m 18s | Max: 29m 14s | Hits: 365%/5532  
      🔍 20                 Pass:  95%/21  | Total:  3h 18m | Avg:  9m 26s | Max: 33m 35s | Hits: 365%/1844  
    🟨 gpu
      🟨 v100               Pass:  97%/43  | Total:  6h 46m | Avg:  9m 27s | Max: 33m 35s | Hits: 365%/7376  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 22m 35s | Avg: 11m 17s | Max: 16m 58s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  4m 46s | Avg:  4m 46s | Max:  4m 46s
    
  • 🟩 cub: Pass: 100%/44 | Total: 8h 51m | Avg: 12m 04s | Max: 37m 19s | Hits: 528%/3552

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  8h 40m | Avg: 12m 23s | Max: 37m 19s | Hits: 528%/3552  
      🟩 arm64              Pass: 100%/2   | Total: 11m 16s | Avg:  5m 38s | Max:  5m 57s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 53m 29s | Avg: 10m 41s | Max: 27m 52s | Hits: 528%/888   
      🟩 12.5               Pass: 100%/2   | Total: 21m 03s | Avg: 10m 31s | Max: 10m 32s
      🟩 12.6               Pass: 100%/37  | Total:  7h 36m | Avg: 12m 21s | Max: 37m 19s | Hits: 528%/2664  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 57s | Avg:  4m 58s | Max:  5m 04s
      🟩 nvcc12.0           Pass: 100%/5   | Total: 53m 29s | Avg: 10m 41s | Max: 27m 52s | Hits: 528%/888   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 21m 03s | Avg: 10m 31s | Max: 10m 32s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  7h 27m | Avg: 12m 46s | Max: 37m 19s | Hits: 528%/2664  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 57s | Avg:  4m 58s | Max:  5m 04s
      🟩 nvcc               Pass: 100%/42  | Total:  8h 41m | Avg: 12m 25s | Max: 37m 19s | Hits: 528%/3552  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 24m 58s | Avg:  6m 14s | Max:  6m 17s
      🟩 Clang15            Pass: 100%/2   | Total: 13m 31s | Avg:  6m 45s | Max:  6m 55s
      🟩 Clang16            Pass: 100%/2   | Total: 13m 16s | Avg:  6m 38s | Max:  6m 41s
      🟩 Clang17            Pass: 100%/2   | Total: 13m 06s | Avg:  6m 33s | Max:  6m 37s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 29m | Avg: 12m 44s | Max: 31m 57s
      🟩 GCC7               Pass: 100%/2   | Total: 12m 43s | Avg:  6m 21s | Max:  6m 39s
      🟩 GCC8               Pass: 100%/1   | Total:  6m 03s | Avg:  6m 03s | Max:  6m 03s
      🟩 GCC9               Pass: 100%/2   | Total: 13m 33s | Avg:  6m 46s | Max:  7m 05s
      🟩 GCC10              Pass: 100%/2   | Total: 13m 30s | Avg:  6m 45s | Max:  6m 51s
      🟩 GCC11              Pass: 100%/2   | Total: 13m 48s | Avg:  6m 54s | Max:  7m 03s
      🟩 GCC12              Pass: 100%/4   | Total: 38m 06s | Avg:  9m 31s | Max: 19m 16s
      🟩 GCC13              Pass: 100%/8   | Total:  2h 18m | Avg: 17m 19s | Max: 37m 19s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 56m 13s | Avg: 28m 06s | Max: 28m 21s | Hits: 528%/1776  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 03m | Avg: 31m 55s | Max: 32m 20s | Hits: 528%/1776  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 21m 03s | Avg: 10m 31s | Max: 10m 32s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  2h 34m | Avg:  9m 03s | Max: 31m 57s
      🟩 GCC                Pass: 100%/21  | Total:  3h 56m | Avg: 11m 15s | Max: 37m 19s
      🟩 MSVC               Pass: 100%/4   | Total:  2h 00m | Avg: 30m 01s | Max: 32m 20s | Hits: 528%/3552  
      🟩 NVHPC              Pass: 100%/2   | Total: 21m 03s | Avg: 10m 31s | Max: 10m 32s
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 24m 12s | Avg: 12m 06s | Max: 19m 16s
      🟩 v100               Pass: 100%/42  | Total:  8h 27m | Avg: 12m 04s | Max: 37m 19s | Hits: 528%/3552  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 37m | Avg:  9m 07s | Max: 32m 20s | Hits: 528%/3552  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 25m 46s | Avg: 25m 46s | Max: 25m 46s
      🟩 GraphCapture       Pass: 100%/1   | Total: 26m 30s | Avg: 26m 30s | Max: 26m 30s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 15m | Avg: 25m 09s | Max: 31m 57s
      🟩 TestGPU            Pass: 100%/2   | Total:  1h 06m | Avg: 33m 10s | Max: 37m 19s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 24m 12s | Avg: 12m 06s | Max: 19m 16s
      🟩 90a                Pass: 100%/1   | Total:  4m 59s | Avg:  4m 59s | Max:  4m 59s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 22m | Avg: 10m 06s | Max: 32m 20s | Hits: 528%/2664  
      🟩 20                 Pass: 100%/24  | Total:  5h 29m | Avg: 13m 43s | Max: 37m 19s | Hits: 528%/888   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 10m 51s | Avg: 5m 25s | Max: 8m 42s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 10m 51s | Avg:  5m 25s | Max:  8m 42s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 10m 51s | Avg:  5m 25s | Max:  8m 42s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 10m 51s | Avg:  5m 25s | Max:  8m 42s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 10m 51s | Avg:  5m 25s | Max:  8m 42s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 10m 51s | Avg:  5m 25s | Max:  8m 42s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 10m 51s | Avg:  5m 25s | Max:  8m 42s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 10m 51s | Avg:  5m 25s | Max:  8m 42s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 09s | Avg:  2m 09s | Max:  2m 09s
      🟩 Test               Pass: 100%/1   | Total:  8m 42s | Avg:  8m 42s | Max:  8m 42s
    
  • 🟩 python: Pass: 100%/1 | Total: 48m 35s | Avg: 48m 35s | Max: 48m 35s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 48m 35s | Avg: 48m 35s | Max: 48m 35s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 48m 35s | Avg: 48m 35s | Max: 48m 35s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 48m 35s | Avg: 48m 35s | Max: 48m 35s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 48m 35s | Avg: 48m 35s | Max: 48m 35s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 48m 35s | Avg: 48m 35s | Max: 48m 35s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 48m 35s | Avg: 48m 35s | Max: 48m 35s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 48m 35s | Avg: 48m 35s | Max: 48m 35s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 48m 35s | Avg: 48m 35s | Max: 48m 35s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 90)

# Runner
65 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@bernhardmgruber force-pushed the tune_rle branch 2 times, most recently from cae7c17 to 6737746, on January 28, 2025 08:13
Contributor

🟨 CI finished in 2h 02m: Pass: 98%/90 | Total: 15h 28m | Avg: 10m 19s | Max: 41m 33s | Hits: 420%/10928
  • 🟨 thrust: Pass: 97%/43 | Total: 6h 50m | Avg: 9m 33s | Max: 33m 33s | Hits: 365%/7376

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/41  | Total:  6h 39m | Avg:  9m 45s | Max: 33m 33s | Hits: 365%/7376  
      🟩 arm64              Pass: 100%/2   | Total: 10m 49s | Avg:  5m 24s | Max:  6m 08s
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total: 46m 14s | Avg:  9m 14s | Max: 25m 01s | Hits: 365%/1844  
      🟩 12.5               Pass: 100%/2   | Total: 28m 14s | Avg: 14m 07s | Max: 14m 29s
      🔍 12.6               Pass:  97%/36  | Total:  5h 36m | Avg:  9m 20s | Max: 33m 33s | Hits: 365%/5532  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 39s | Avg:  5m 19s | Max:  5m 33s
      🟩 nvcc12.0           Pass: 100%/5   | Total: 46m 14s | Avg:  9m 14s | Max: 25m 01s | Hits: 365%/1844  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 28m 14s | Avg: 14m 07s | Max: 14m 29s
      🔍 nvcc12.6           Pass:  97%/34  | Total:  5h 25m | Avg:  9m 34s | Max: 33m 33s | Hits: 365%/5532  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 39s | Avg:  5m 19s | Max:  5m 33s
      🔍 nvcc               Pass:  97%/41  | Total:  6h 40m | Avg:  9m 45s | Max: 33m 33s | Hits: 365%/7376  
    🔍 cxx: MSVC14.39 🔍
      🟩 Clang14            Pass: 100%/4   | Total: 21m 12s | Avg:  5m 18s | Max:  5m 36s
      🟩 Clang15            Pass: 100%/2   | Total: 11m 28s | Avg:  5m 44s | Max:  6m 03s
      🟩 Clang16            Pass: 100%/2   | Total: 11m 23s | Avg:  5m 41s | Max:  5m 50s
      🟩 Clang17            Pass: 100%/2   | Total: 11m 44s | Avg:  5m 52s | Max:  6m 18s
      🟩 Clang18            Pass: 100%/7   | Total: 49m 28s | Avg:  7m 04s | Max: 15m 04s
      🟩 GCC7               Pass: 100%/2   | Total: 14m 06s | Avg:  7m 03s | Max:  8m 47s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 21s | Avg:  5m 21s | Max:  5m 21s
      🟩 GCC9               Pass: 100%/2   | Total: 11m 46s | Avg:  5m 53s | Max:  5m 58s
      🟩 GCC10              Pass: 100%/2   | Total: 11m 22s | Avg:  5m 41s | Max:  5m 57s
      🟩 GCC11              Pass: 100%/2   | Total: 11m 51s | Avg:  5m 55s | Max:  6m 03s
      🟩 GCC12              Pass: 100%/2   | Total: 12m 15s | Avg:  6m 07s | Max:  6m 11s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 03m | Avg:  7m 54s | Max: 13m 58s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 53m 17s | Avg: 26m 38s | Max: 28m 16s | Hits: 365%/3688  
      🔍 MSVC14.39          Pass:  66%/3   | Total:  1h 34m | Avg: 31m 22s | Max: 33m 33s | Hits: 365%/3688  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 28m 14s | Avg: 14m 07s | Max: 14m 29s
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  1h 45m | Avg:  6m 11s | Max: 15m 04s
      🟩 GCC                Pass: 100%/19  | Total:  2h 09m | Avg:  6m 50s | Max: 13m 58s
      🔍 MSVC               Pass:  80%/5   | Total:  2h 27m | Avg: 29m 28s | Max: 33m 33s | Hits: 365%/7376  
      🟩 NVHPC              Pass: 100%/2   | Total: 28m 14s | Avg: 14m 07s | Max: 14m 29s
    🔍 jobs: TestCPU 🔍
      🟩 Build              Pass: 100%/37  | Total:  5h 21m | Avg:  8m 40s | Max: 30m 35s | Hits: 365%/7376  
      🔍 TestCPU            Pass:  66%/3   | Total: 48m 56s | Avg: 16m 18s | Max: 33m 33s
      🟩 TestGPU            Pass: 100%/3   | Total: 40m 45s | Avg: 13m 35s | Max: 15m 04s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total:  3h 13m | Avg:  9m 39s | Max: 29m 59s | Hits: 365%/5532  
      🔍 20                 Pass:  95%/21  | Total:  3h 19m | Avg:  9m 30s | Max: 33m 33s | Hits: 365%/1844  
    🟨 gpu
      🟨 v100               Pass:  97%/43  | Total:  6h 50m | Avg:  9m 33s | Max: 33m 33s | Hits: 365%/7376  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 17m 56s | Avg:  8m 58s | Max: 11m 43s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  4m 54s | Avg:  4m 54s | Max:  4m 54s
    
  • 🟩 cub: Pass: 100%/44 | Total: 7h 46m | Avg: 10m 35s | Max: 30m 50s | Hits: 534%/3552

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  7h 36m | Avg: 10m 51s | Max: 30m 50s | Hits: 534%/3552  
      🟩 arm64              Pass: 100%/2   | Total:  9m 42s | Avg:  4m 51s | Max:  4m 58s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 45m 14s | Avg:  9m 02s | Max: 24m 52s | Hits: 536%/888   
      🟩 12.5               Pass: 100%/2   | Total: 19m 28s | Avg:  9m 44s | Max:  9m 56s
      🟩 12.6               Pass: 100%/37  | Total:  6h 41m | Avg: 10m 50s | Max: 30m 50s | Hits: 533%/2664  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  8m 51s | Avg:  4m 25s | Max:  4m 32s
      🟩 nvcc12.0           Pass: 100%/5   | Total: 45m 14s | Avg:  9m 02s | Max: 24m 52s | Hits: 536%/888   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 19m 28s | Avg:  9m 44s | Max:  9m 56s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  6h 32m | Avg: 11m 12s | Max: 30m 50s | Hits: 533%/2664  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  8m 51s | Avg:  4m 25s | Max:  4m 32s
      🟩 nvcc               Pass: 100%/42  | Total:  7h 37m | Avg: 10m 53s | Max: 30m 50s | Hits: 534%/3552  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 21m 19s | Avg:  5m 19s | Max:  5m 48s
      🟩 Clang15            Pass: 100%/2   | Total: 11m 45s | Avg:  5m 52s | Max:  5m 53s
      🟩 Clang16            Pass: 100%/2   | Total: 11m 20s | Avg:  5m 40s | Max:  5m 47s
      🟩 Clang17            Pass: 100%/2   | Total: 11m 13s | Avg:  5m 36s | Max:  5m 56s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 18m | Avg: 11m 16s | Max: 29m 30s
      🟩 GCC7               Pass: 100%/2   | Total: 11m 07s | Avg:  5m 33s | Max:  5m 52s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 57s | Avg:  5m 57s | Max:  5m 57s
      🟩 GCC9               Pass: 100%/2   | Total: 10m 44s | Avg:  5m 22s | Max:  5m 41s
      🟩 GCC10              Pass: 100%/2   | Total: 11m 46s | Avg:  5m 53s | Max:  6m 18s
      🟩 GCC11              Pass: 100%/2   | Total: 11m 35s | Avg:  5m 47s | Max:  5m 49s
      🟩 GCC12              Pass: 100%/4   | Total: 38m 31s | Avg:  9m 37s | Max: 22m 23s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 48m | Avg: 13m 33s | Max: 28m 59s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 53m 17s | Avg: 26m 38s | Max: 28m 25s | Hits: 535%/1776  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 00m | Avg: 30m 19s | Max: 30m 50s | Hits: 533%/1776  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 19m 28s | Avg:  9m 44s | Max:  9m 56s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  2h 14m | Avg:  7m 54s | Max: 29m 30s
      🟩 GCC                Pass: 100%/21  | Total:  3h 18m | Avg:  9m 26s | Max: 28m 59s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 53m | Avg: 28m 28s | Max: 30m 50s | Hits: 534%/3552  
      🟩 NVHPC              Pass: 100%/2   | Total: 19m 28s | Avg:  9m 44s | Max:  9m 56s
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 26m 42s | Avg: 13m 21s | Max: 22m 23s
      🟩 v100               Pass: 100%/42  | Total:  7h 19m | Avg: 10m 27s | Max: 30m 50s | Hits: 534%/3552  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 02m | Avg:  8m 11s | Max: 30m 50s | Hits: 534%/3552  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 22m 25s | Avg: 22m 25s | Max: 22m 25s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 35s | Avg: 14m 35s | Max: 14m 35s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 15m | Avg: 25m 11s | Max: 28m 59s
      🟩 TestGPU            Pass: 100%/2   | Total: 50m 28s | Avg: 25m 14s | Max: 29m 30s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 26m 42s | Avg: 13m 21s | Max: 22m 23s
      🟩 90a                Pass: 100%/1   | Total:  4m 17s | Avg:  4m 17s | Max:  4m 17s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 02m | Avg:  9m 08s | Max: 29m 48s | Hits: 535%/2664  
      🟩 20                 Pass: 100%/24  | Total:  4h 43m | Avg: 11m 47s | Max: 30m 50s | Hits: 532%/888   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 10m 27s | Avg: 5m 13s | Max: 8m 21s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 10m 27s | Avg:  5m 13s | Max:  8m 21s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 10m 27s | Avg:  5m 13s | Max:  8m 21s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 10m 27s | Avg:  5m 13s | Max:  8m 21s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 10m 27s | Avg:  5m 13s | Max:  8m 21s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 10m 27s | Avg:  5m 13s | Max:  8m 21s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 10m 27s | Avg:  5m 13s | Max:  8m 21s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 10m 27s | Avg:  5m 13s | Max:  8m 21s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 06s | Avg:  2m 06s | Max:  2m 06s
      🟩 Test               Pass: 100%/1   | Total:  8m 21s | Avg:  8m 21s | Max:  8m 21s
    
  • 🟩 python: Pass: 100%/1 | Total: 41m 33s | Avg: 41m 33s | Max: 41m 33s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 41m 33s | Avg: 41m 33s | Max: 41m 33s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 41m 33s | Avg: 41m 33s | Max: 41m 33s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 41m 33s | Avg: 41m 33s | Max: 41m 33s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 41m 33s | Avg: 41m 33s | Max: 41m 33s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 41m 33s | Avg: 41m 33s | Max: 41m 33s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 41m 33s | Avg: 41m 33s | Max: 41m 33s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 41m 33s | Avg: 41m 33s | Max: 41m 33s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 41m 33s | Avg: 41m 33s | Max: 41m 33s
    


@gonidelis
Member

run_length_encode.encode
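
As a reading aid for the table that follows, here is a minimal Python sketch of how the Diff, %Diff and Status columns appear to relate to the Ref and Cmp columns. Diff is Cmp Time minus Ref Time, and %Diff is Diff relative to Ref Time. The Status rule used here (a row is SAME unless |%Diff| exceeds the smaller of the two noise figures) is an assumption inferred from the rows in this comment, not taken from the comparison tool itself. The function name `classify` and its parameters are hypothetical.

```python
def classify(ref_time_us, ref_noise_pct, cmp_time_us, cmp_noise_pct):
    """Return (diff_us, pct_diff, status) for one benchmark row.

    Times are in microseconds, noise values in percent.
    """
    diff_us = cmp_time_us - ref_time_us
    pct_diff = 100.0 * diff_us / ref_time_us

    # Assumption: a change only counts if it is larger than the
    # smaller of the two measured noise levels.
    threshold_pct = min(ref_noise_pct, cmp_noise_pct)

    if abs(pct_diff) <= threshold_pct:
        status = "SAME"
    elif diff_us < 0:
        status = "FAST"  # the compared build is quicker than the reference
    else:
        status = "SLOW"  # the compared build is slower than the reference
    return diff_us, pct_diff, status


# Three rows copied from the table (I8 2^24/2^1, I8 2^20/2^1, F32 2^24/2^1).
for row in [(74.461, 1.49, 64.545, 1.94),
            (17.518, 8.52, 16.065, 9.27),
            (74.401, 1.16, 75.491, 1.41)]:
    diff_us, pct_diff, status = classify(*row)
    print(f"{diff_us:8.3f} us  {pct_diff:7.2f}%  {status}")
```

Under this reading, the small input sizes (2^16 elements) mostly land in SAME because their noise of several percent is larger than the measured difference, while the large inputs have noise well below 1% and so even differences of 1-3% are classified.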

|  T{ct}  |  OffsetT{ct}  |  Elements{io}  |  MaxSegSize  |   Ref Time |   Ref Noise |   Cmp Time |   Cmp Noise |        Diff |   %Diff |  Status  |
|---------|---------------|----------------|--------------|------------|-------------|------------|-------------|-------------|---------|----------|
|   I8    |      I32      |      2^16      |     2^1      |  12.365 us |       8.80% |  12.313 us |       8.35% |   -0.052 us |  -0.42% |   SAME   |
|   I8    |      I32      |      2^20      |     2^1      |  17.518 us |       8.52% |  16.065 us |       9.27% |   -1.453 us |  -8.29% |   SAME   |
|   I8    |      I32      |      2^24      |     2^1      |  74.461 us |       1.49% |  64.545 us |       1.94% |   -9.916 us | -13.32% |   FAST   |
|   I8    |      I32      |      2^28      |     2^1      |   1.000 ms |       0.30% | 840.618 us |       0.30% | -159.740 us | -15.97% |   FAST   |
|   I8    |      I32      |      2^16      |     2^4      |  12.494 us |       8.24% |  12.396 us |       8.13% |   -0.098 us |  -0.79% |   SAME   |
|   I8    |      I32      |      2^20      |     2^4      |  16.644 us |       6.80% |  14.666 us |       8.58% |   -1.977 us | -11.88% |   FAST   |
|   I8    |      I32      |      2^24      |     2^4      |  61.968 us |       1.77% |  52.026 us |       1.24% |   -9.943 us | -16.04% |   FAST   |
|   I8    |      I32      |      2^28      |     2^4      | 791.149 us |       0.27% | 630.145 us |       0.23% | -161.003 us | -20.35% |   FAST   |
|   I8    |      I32      |      2^16      |     2^8      |  12.596 us |       7.92% |  12.629 us |       7.88% |    0.033 us |   0.26% |   SAME   |
|   I8    |      I32      |      2^20      |     2^8      |  17.064 us |       5.09% |  15.222 us |       6.19% |   -1.842 us | -10.79% |   FAST   |
|   I8    |      I32      |      2^24      |     2^8      |  60.547 us |       1.51% |  50.225 us |       1.02% |  -10.322 us | -17.05% |   FAST   |
|   I8    |      I32      |      2^28      |     2^8      | 769.089 us |       0.30% | 605.555 us |       0.22% | -163.534 us | -21.26% |   FAST   |
|   I16   |      I32      |      2^16      |     2^1      |  14.434 us |       8.85% |  14.290 us |       8.80% |   -0.144 us |  -1.00% |   SAME   |
|   I16   |      I32      |      2^20      |     2^1      |  19.081 us |       4.64% |  17.029 us |       4.81% |   -2.052 us | -10.76% |   FAST   |
|   I16   |      I32      |      2^24      |     2^1      |  75.078 us |       1.07% |  69.225 us |       1.44% |   -5.854 us |  -7.80% |   FAST   |
|   I16   |      I32      |      2^28      |     2^1      | 992.142 us |       0.31% | 908.468 us |       0.42% |  -83.674 us |  -8.43% |   FAST   |
|   I16   |      I32      |      2^16      |     2^4      |  13.887 us |       8.57% |  13.302 us |       6.36% |   -0.585 us |  -4.21% |   SAME   |
|   I16   |      I32      |      2^20      |     2^4      |  18.220 us |       7.83% |  15.458 us |       9.28% |   -2.762 us | -15.16% |   FAST   |
|   I16   |      I32      |      2^24      |     2^4      |  63.226 us |       1.80% |  56.649 us |       1.65% |   -6.577 us | -10.40% |   FAST   |
|   I16   |      I32      |      2^28      |     2^4      | 801.585 us |       0.26% | 708.052 us |       0.22% |  -93.533 us | -11.67% |   FAST   |
|   I16   |      I32      |      2^16      |     2^8      |  13.772 us |       9.03% |  13.029 us |       6.46% |   -0.743 us |  -5.39% |   SAME   |
|   I16   |      I32      |      2^20      |     2^8      |  17.844 us |       8.13% |  14.799 us |       8.63% |   -3.045 us | -17.06% |   FAST   |
|   I16   |      I32      |      2^24      |     2^8      |  62.162 us |       1.44% |  55.599 us |       1.92% |   -6.563 us | -10.56% |   FAST   |
|   I16   |      I32      |      2^28      |     2^8      | 781.244 us |       0.28% | 685.730 us |       0.25% |  -95.514 us | -12.23% |   FAST   |
|   I32   |      I32      |      2^16      |     2^1      |  12.543 us |       8.07% |  12.505 us |       8.25% |   -0.038 us |  -0.30% |   SAME   |
|   I32   |      I32      |      2^20      |     2^1      |  19.079 us |       4.46% |  16.965 us |       5.04% |   -2.114 us | -11.08% |   FAST   |
|   I32   |      I32      |      2^24      |     2^1      |  74.520 us |       0.98% |  68.756 us |       0.83% |   -5.764 us |  -7.73% |   FAST   |
|   I32   |      I32      |      2^28      |     2^1      | 971.392 us |       0.49% | 894.222 us |       0.62% |  -77.170 us |  -7.94% |   FAST   |
|   I32   |      I32      |      2^16      |     2^4      |  12.485 us |       8.02% |  12.511 us |       8.01% |    0.026 us |   0.21% |   SAME   |
|   I32   |      I32      |      2^20      |     2^4      |  18.081 us |       7.87% |  16.236 us |       9.08% |   -1.845 us | -10.20% |   FAST   |
|   I32   |      I32      |      2^24      |     2^4      |  64.394 us |       0.86% |  56.646 us |       1.47% |   -7.748 us | -12.03% |   FAST   |
|   I32   |      I32      |      2^28      |     2^4      | 798.258 us |       0.16% | 693.319 us |       0.23% | -104.939 us | -13.15% |   FAST   |
|   I32   |      I32      |      2^16      |     2^8      |  12.844 us |       6.70% |  12.939 us |       6.10% |    0.095 us |   0.74% |   SAME   |
|   I32   |      I32      |      2^20      |     2^8      |  17.775 us |       8.47% |  15.822 us |      10.90% |   -1.953 us | -10.99% |   FAST   |
|   I32   |      I32      |      2^24      |     2^8      |  62.818 us |       1.43% |  55.452 us |       2.02% |   -7.366 us | -11.73% |   FAST   |
|   I32   |      I32      |      2^28      |     2^8      | 779.564 us |       0.17% | 664.655 us |       0.25% | -114.909 us | -14.74% |   FAST   |
|   I64   |      I32      |      2^16      |     2^1      |  14.508 us |       7.25% |  13.797 us |      12.52% |   -0.711 us |  -4.90% |   SAME   |
|   I64   |      I32      |      2^20      |     2^1      |  20.682 us |       5.24% |  18.752 us |       6.05% |   -1.930 us |  -9.33% |   FAST   |
|   I64   |      I32      |      2^24      |     2^1      | 100.379 us |       1.20% |  97.429 us |       0.78% |   -2.950 us |  -2.94% |   FAST   |
|   I64   |      I32      |      2^28      |     2^1      |   1.374 ms |       0.45% |   1.327 ms |       0.16% |  -47.151 us |  -3.43% |   FAST   |
|   I64   |      I32      |      2^16      |     2^4      |  13.468 us |       9.23% |  12.879 us |       6.42% |   -0.589 us |  -4.37% |   SAME   |
|   I64   |      I32      |      2^20      |     2^4      |  18.784 us |       6.50% |  17.547 us |      10.57% |   -1.237 us |  -6.59% |   FAST   |
|   I64   |      I32      |      2^24      |     2^4      |  78.364 us |       2.28% |  74.801 us |       0.62% |   -3.564 us |  -4.55% |   FAST   |
|   I64   |      I32      |      2^28      |     2^4      | 996.418 us |       0.41% | 963.574 us |       0.15% |  -32.844 us |  -3.30% |   FAST   |
|   I64   |      I32      |      2^16      |     2^8      |  13.390 us |       8.87% |  12.818 us |       6.80% |   -0.573 us |  -4.28% |   SAME   |
|   I64   |      I32      |      2^20      |     2^8      |  18.867 us |       6.36% |  17.181 us |       9.61% |   -1.687 us |  -8.94% |   FAST   |
|   I64   |      I32      |      2^24      |     2^8      |  74.540 us |       2.37% |  72.270 us |       1.21% |   -2.270 us |  -3.04% |   FAST   |
|   I64   |      I32      |      2^28      |     2^8      | 931.800 us |       0.48% | 912.641 us |       0.14% |  -19.158 us |  -2.06% |   FAST   |
|  I128   |      I32      |      2^16      |     2^1      |  14.880 us |       5.88% |  14.819 us |       6.21% |   -0.061 us |  -0.41% |   SAME   |
|  I128   |      I32      |      2^20      |     2^1      |  24.752 us |       4.61% |  24.827 us |       4.28% |    0.075 us |   0.30% |   SAME   |
|  I128   |      I32      |      2^24      |     2^1      | 165.281 us |       0.52% | 165.371 us |       0.55% |    0.090 us |   0.05% |   SAME   |
|  I128   |      I32      |      2^28      |     2^1      |   2.409 ms |       0.09% |   2.409 ms |       0.10% |   -0.031 us |  -0.00% |   SAME   |
|  I128   |      I32      |      2^16      |     2^4      |  15.271 us |       3.70% |  15.267 us |       4.12% |   -0.003 us |  -0.02% |   SAME   |
|  I128   |      I32      |      2^20      |     2^4      |  21.340 us |       7.56% |  21.380 us |       7.49% |    0.040 us |   0.19% |   SAME   |
|  I128   |      I32      |      2^24      |     2^4      | 122.091 us |       0.59% | 122.061 us |       0.50% |   -0.030 us |  -0.02% |   SAME   |
|  I128   |      I32      |      2^28      |     2^4      |   1.709 ms |       0.11% |   1.709 ms |       0.10% |   -0.009 us |  -0.00% |   SAME   |
|  I128   |      I32      |      2^16      |     2^8      |  15.234 us |       3.92% |  15.268 us |       4.26% |    0.034 us |   0.22% |   SAME   |
|  I128   |      I32      |      2^20      |     2^8      |  21.295 us |       4.97% |  21.331 us |       5.22% |    0.036 us |   0.17% |   SAME   |
|  I128   |      I32      |      2^24      |     2^8      | 116.908 us |       0.89% | 116.948 us |       0.88% |    0.041 us |   0.03% |   SAME   |
|  I128   |      I32      |      2^28      |     2^8      |   1.618 ms |       0.10% |   1.618 ms |       0.10% |   -0.024 us |  -0.00% |   SAME   |
|   F32   |      I32      |      2^16      |     2^1      |  13.435 us |       6.44% |  13.252 us |       4.05% |   -0.183 us |  -1.36% |   SAME   |
|   F32   |      I32      |      2^20      |     2^1      |  18.601 us |       5.73% |  16.572 us |       6.22% |   -2.029 us | -10.91% |   FAST   |
|   F32   |      I32      |      2^24      |     2^1      |  74.401 us |       1.16% |  75.491 us |       1.41% |    1.090 us |   1.46% |   SLOW   |
|   F32   |      I32      |      2^28      |     2^1      | 820.784 us |       0.35% | 820.743 us |       0.49% |   -0.041 us |  -0.01% |   SAME   |
|   F32   |      I32      |      2^16      |     2^4      |  12.883 us |       6.40% |  12.739 us |       7.23% |   -0.144 us |  -1.12% |   SAME   |
|   F32   |      I32      |      2^20      |     2^4      |  17.962 us |       8.51% |  15.621 us |      11.63% |   -2.340 us | -13.03% |   FAST   |
|   F32   |      I32      |      2^24      |     2^4      |  64.491 us |       0.55% |  62.596 us |       1.00% |   -1.895 us |  -2.94% |   FAST   |
|   F32   |      I32      |      2^28      |     2^4      | 803.495 us |       0.17% | 795.002 us |       0.24% |   -8.493 us |  -1.06% |   FAST   |
|   F32   |      I32      |      2^16      |     2^8      |  12.446 us |       8.21% |  12.513 us |       8.12% |    0.066 us |   0.53% |   SAME   |
|   F32   |      I32      |      2^20      |     2^8      |  17.850 us |       8.51% |  15.550 us |      11.30% |   -2.300 us | -12.89% |   FAST   |
|   F32   |      I32      |      2^24      |     2^8      |  63.181 us |       1.71% |  61.453 us |       1.88% |   -1.728 us |  -2.73% |   FAST   |
|   F32   |      I32      |      2^28      |     2^8      | 779.752 us |       0.16% | 767.168 us |       0.21% |  -12.583 us |  -1.61% |   FAST   |
|   F64   |      I32      |      2^16      |     2^1      |  14.448 us |       7.45% |  13.653 us |      13.51% |   -0.795 us |  -5.50% |   SAME   |
|   F64   |      I32      |      2^20      |     2^1      |  20.709 us |       5.60% |  18.791 us |       5.17% |   -1.918 us |  -9.26% |   FAST   |
|   F64   |      I32      |      2^24      |     2^1      |  99.532 us |       1.18% |  97.430 us |       0.91% |   -2.102 us |  -2.11% |   FAST   |
|   F64   |      I32      |      2^28      |     2^1      |   1.358 ms |       0.39% |   1.324 ms |       0.17% |  -34.212 us |  -2.52% |   FAST   |
|   F64   |      I32      |      2^16      |     2^4      |  13.325 us |       8.17% |  12.854 us |       6.71% |   -0.471 us |  -3.54% |   SAME   |
|   F64   |      I32      |      2^20      |     2^4      |  18.887 us |       6.26% |  17.568 us |       9.82% |   -1.318 us |  -6.98% |   FAST   |
|   F64   |      I32      |      2^24      |     2^4      |  77.488 us |       2.16% |  74.767 us |       0.73% |   -2.721 us |  -3.51% |   FAST   |
|   F64   |      I32      |      2^28      |     2^4      | 981.028 us |       0.42% | 959.982 us |       0.16% |  -21.046 us |  -2.15% |   FAST   |
|   F64   |      I32      |      2^16      |     2^8      |  13.302 us |       8.58% |  12.886 us |       6.48% |   -0.416 us |  -3.13% |   SAME   |
|   F64   |      I32      |      2^20      |     2^8      |  18.853 us |       5.82% |  16.911 us |       7.55% |   -1.942 us | -10.30% |   FAST   |
|   F64   |      I32      |      2^24      |     2^8      |  74.201 us |       2.27% |  71.807 us |       1.45% |   -2.394 us |  -3.23% |   FAST   |
|   F64   |      I32      |      2^28      |     2^8      | 923.491 us |       0.46% | 908.278 us |       0.15% |  -15.213 us |  -1.65% |   FAST   |
|   C64   |      I32      |      2^16      |     2^1      |  14.772 us |       7.82% |  14.904 us |       7.75% |    0.132 us |   0.89% |   SAME   |
|   C64   |      I32      |      2^20      |     2^1      |  20.872 us |       4.94% |  20.765 us |       5.23% |   -0.107 us |  -0.51% |   SAME   |
|   C64   |      I32      |      2^24      |     2^1      | 117.767 us |       0.71% | 117.707 us |       0.68% |   -0.060 us |  -0.05% |   SAME   |
|   C64   |      I32      |      2^28      |     2^1      |   1.426 ms |       0.31% |   1.426 ms |       0.30% |    0.155 us |   0.01% |   SAME   |
|   C64   |      I32      |      2^16      |     2^4      |  14.711 us |       7.55% |  14.743 us |       7.36% |    0.031 us |   0.21% |   SAME   |
|   C64   |      I32      |      2^20      |     2^4      |  19.445 us |       8.24% |  19.567 us |       8.05% |    0.122 us |   0.63% |   SAME   |
|   C64   |      I32      |      2^24      |     2^4      | 102.540 us |       1.34% | 102.593 us |       1.28% |    0.054 us |   0.05% |   SAME   |
|   C64   |      I32      |      2^28      |     2^4      |   1.395 ms |       0.31% |   1.395 ms |       0.35% |    0.198 us |   0.01% |   SAME   |
|   C64   |      I32      |      2^16      |     2^8      |  14.678 us |       7.58% |  14.627 us |       7.88% |   -0.051 us |  -0.35% |   SAME   |
|   C64   |      I32      |      2^20      |     2^8      |  19.076 us |       7.54% |  19.141 us |       7.66% |    0.065 us |   0.34% |   SAME   |
|   C64   |      I32      |      2^24      |     2^8      |  99.041 us |       1.26% |  99.291 us |       1.32% |    0.250 us |   0.25% |   SAME   |
|   C64   |      I32      |      2^28      |     2^8      |   1.341 ms |       0.31% |   1.341 ms |       0.31% |    0.473 us |   0.04% |   SAME   |

@gonidelis
Member

run_length_encode.non_trivial_runs

|  T{ct}  |  OffsetT{ct}  |  Elements{io}  |  MaxSegSize  |   Ref Time |   Ref Noise |   Cmp Time |   Cmp Noise |       Diff |   %Diff |  Status  |
|---------|---------------|----------------|--------------|------------|-------------|------------|-------------|------------|---------|----------|
|   I8    |      I32      |      2^16      |     2^1      |  14.035 us |      11.46% |  14.195 us |      10.22% |   0.160 us |   1.14% |   SAME   |
|   I8    |      I32      |      2^20      |     2^1      |  16.802 us |       6.63% |  16.657 us |       6.74% |  -0.144 us |  -0.86% |   SAME   |
|   I8    |      I32      |      2^24      |     2^1      |  60.360 us |       1.39% |  55.410 us |       2.21% |  -4.949 us |  -8.20% |   FAST   |
|   I8    |      I32      |      2^28      |     2^1      | 760.875 us |       0.45% | 677.138 us |       0.62% | -83.738 us | -11.01% |   FAST   |
|   I8    |      I32      |      2^16      |     2^4      |  15.009 us |       5.68% |  14.408 us |       7.22% |  -0.601 us |  -4.01% |   SAME   |
|   I8    |      I32      |      2^20      |     2^4      |  16.580 us |       6.46% |  15.892 us |       7.98% |  -0.688 us |  -4.15% |   SAME   |
|   I8    |      I32      |      2^24      |     2^4      |  56.690 us |       1.60% |  52.158 us |       1.47% |  -4.532 us |  -7.99% |   FAST   |
|   I8    |      I32      |      2^28      |     2^4      | 705.786 us |       0.25% | 620.212 us |       0.23% | -85.575 us | -12.12% |   FAST   |
|   I8    |      I32      |      2^16      |     2^8      |  13.889 us |       7.92% |  14.652 us |       6.82% |   0.763 us |   5.49% |   SAME   |
|   I8    |      I32      |      2^20      |     2^8      |  17.061 us |       4.73% |  15.534 us |       4.54% |  -1.527 us |  -8.95% |   FAST   |
|   I8    |      I32      |      2^24      |     2^8      |  53.387 us |       2.22% |  49.738 us |       2.09% |  -3.649 us |  -6.84% |   FAST   |
|   I8    |      I32      |      2^28      |     2^8      | 648.751 us |       0.29% | 585.537 us |       0.26% | -63.214 us |  -9.74% |   FAST   |
|   I16   |      I32      |      2^16      |     2^1      |  12.835 us |      10.66% |  13.639 us |       7.52% |   0.804 us |   6.26% |   SAME   |
|   I16   |      I32      |      2^20      |     2^1      |  17.962 us |       8.61% |  16.529 us |       6.45% |  -1.434 us |  -7.98% |   FAST   |
|   I16   |      I32      |      2^24      |     2^1      |  61.379 us |       1.91% |  57.271 us |       1.89% |  -4.108 us |  -6.69% |   FAST   |
|   I16   |      I32      |      2^28      |     2^1      | 775.680 us |       0.46% | 697.976 us |       0.67% | -77.705 us | -10.02% |   FAST   |
|   I16   |      I32      |      2^16      |     2^4      |  12.685 us |       8.75% |  13.303 us |       2.84% |   0.618 us |   4.87% |   SLOW   |
|   I16   |      I32      |      2^20      |     2^4      |  17.848 us |       8.48% |  16.476 us |       6.44% |  -1.372 us |  -7.69% |   FAST   |
|   I16   |      I32      |      2^24      |     2^4      |  58.324 us |       1.39% |  54.074 us |       1.54% |  -4.250 us |  -7.29% |   FAST   |
|   I16   |      I32      |      2^28      |     2^4      | 726.266 us |       0.29% | 644.621 us |       0.31% | -81.645 us | -11.24% |   FAST   |
|   I16   |      I32      |      2^16      |     2^8      |  12.723 us |       9.56% |  13.161 us |       5.59% |   0.438 us |   3.44% |   SAME   |
|   I16   |      I32      |      2^20      |     2^8      |  17.493 us |       8.49% |  15.567 us |       6.80% |  -1.925 us | -11.01% |   FAST   |
|   I16   |      I32      |      2^24      |     2^8      |  56.481 us |       1.49% |  51.646 us |       2.25% |  -4.835 us |  -8.56% |   FAST   |
|   I16   |      I32      |      2^28      |     2^8      | 694.917 us |       0.30% | 602.971 us |       0.35% | -91.946 us | -13.23% |   FAST   |
|   I32   |      I32      |      2^16      |     2^1      |  14.110 us |      10.15% |  13.120 us |       5.00% |  -0.990 us |  -7.02% |   FAST   |
|   I32   |      I32      |      2^20      |     2^1      |  19.190 us |       6.05% |  17.393 us |       2.18% |  -1.797 us |  -9.36% |   FAST   |
|   I32   |      I32      |      2^24      |     2^1      |  67.734 us |       1.69% |  62.493 us |       0.58% |  -5.241 us |  -7.74% |   FAST   |
|   I32   |      I32      |      2^28      |     2^1      | 852.863 us |       0.44% | 784.594 us |       0.52% | -68.268 us |  -8.00% |   FAST   |
|   I32   |      I32      |      2^16      |     2^4      |  14.188 us |       7.55% |  13.216 us |       3.94% |  -0.972 us |  -6.85% |   FAST   |
|   I32   |      I32      |      2^20      |     2^4      |  18.881 us |       6.48% |  17.397 us |       1.75% |  -1.484 us |  -7.86% |   FAST   |
|   I32   |      I32      |      2^24      |     2^4      |  63.718 us |       1.76% |  58.872 us |       1.52% |  -4.847 us |  -7.61% |   FAST   |
|   I32   |      I32      |      2^28      |     2^4      | 787.231 us |       0.27% | 725.725 us |       0.17% | -61.505 us |  -7.81% |   FAST   |
|   I32   |      I32      |      2^16      |     2^8      |  13.926 us |       9.01% |  13.312 us |       1.66% |  -0.614 us |  -4.41% |   FAST   |
|   I32   |      I32      |      2^20      |     2^8      |  18.423 us |       6.74% |  17.317 us |       3.01% |  -1.106 us |  -6.00% |   FAST   |
|   I32   |      I32      |      2^24      |     2^8      |  60.399 us |       2.39% |  56.364 us |       0.54% |  -4.035 us |  -6.68% |   FAST   |
|   I32   |      I32      |      2^28      |     2^8      | 717.261 us |       0.49% | 676.313 us |       0.18% | -40.948 us |  -5.71% |   FAST   |
|   I64   |      I32      |      2^16      |     2^1      |  14.421 us |       7.94% |  15.375 us |       2.04% |   0.954 us |   6.61% |   SLOW   |
|   I64   |      I32      |      2^20      |     2^1      |  19.149 us |       7.05% |  17.484 us |       4.86% |  -1.665 us |  -8.69% |   FAST   |
|   I64   |      I32      |      2^24      |     2^1      |  73.338 us |       1.58% |  70.761 us |       1.32% |  -2.577 us |  -3.51% |   FAST   |
|   I64   |      I32      |      2^28      |     2^1      | 930.749 us |       0.54% | 872.018 us |       0.58% | -58.731 us |  -6.31% |   FAST   |
|   I64   |      I32      |      2^16      |     2^4      |  14.461 us |       7.19% |  14.919 us |       6.02% |   0.458 us |   3.17% |   SAME   |
|   I64   |      I32      |      2^20      |     2^4      |  18.882 us |       7.06% |  17.155 us |       4.87% |  -1.727 us |  -9.15% |   FAST   |
|   I64   |      I32      |      2^24      |     2^4      |  70.503 us |       1.81% |  67.358 us |       1.60% |  -3.145 us |  -4.46% |   FAST   |
|   I64   |      I32      |      2^28      |     2^4      | 876.183 us |       0.35% | 812.130 us |       0.28% | -64.053 us |  -7.31% |   FAST   |
|   I64   |      I32      |      2^16      |     2^8      |  14.532 us |       7.02% |  13.406 us |       1.94% |  -1.126 us |  -7.75% |   FAST   |
|   I64   |      I32      |      2^20      |     2^8      |  18.799 us |       6.37% |  17.145 us |       4.96% |  -1.654 us |  -8.80% |   FAST   |
|   I64   |      I32      |      2^24      |     2^8      |  68.328 us |       1.94% |  64.341 us |       1.91% |  -3.988 us |  -5.84% |   FAST   |
|   I64   |      I32      |      2^28      |     2^8      | 833.729 us |       0.42% | 753.181 us |       0.34% | -80.547 us |  -9.66% |   FAST   |
|  I128   |      I32      |      2^16      |     2^1      |  13.482 us |       8.88% |  14.087 us |       7.23% |   0.605 us |   4.49% |   SAME   |
|  I128   |      I32      |      2^20      |     2^1      |  20.686 us |       5.10% |  21.336 us |       3.01% |   0.649 us |   3.14% |   SLOW   |
|  I128   |      I32      |      2^24      |     2^1      |  98.081 us |       1.07% |  98.115 us |       1.04% |   0.034 us |   0.03% |   SAME   |
|  I128   |      I32      |      2^28      |     2^1      |   1.290 ms |       0.43% |   1.289 ms |       0.45% |  -0.259 us |  -0.02% |   SAME   |
|  I128   |      I32      |      2^16      |     2^4      |  13.429 us |      12.03% |  14.061 us |       7.91% |   0.632 us |   4.71% |   SAME   |
|  I128   |      I32      |      2^20      |     2^4      |  20.710 us |       5.20% |  21.349 us |       3.34% |   0.639 us |   3.09% |   SAME   |
|  I128   |      I32      |      2^24      |     2^4      |  95.632 us |       0.99% |  95.770 us |       0.96% |   0.137 us |   0.14% |   SAME   |
|  I128   |      I32      |      2^28      |     2^4      |   1.251 ms |       0.12% |   1.251 ms |       0.12% |  -0.059 us |  -0.00% |   SAME   |
|  I128   |      I32      |      2^16      |     2^8      |  13.027 us |      10.61% |  13.636 us |       7.05% |   0.609 us |   4.68% |   SAME   |
|  I128   |      I32      |      2^20      |     2^8      |  20.551 us |       6.82% |  21.060 us |       4.33% |   0.509 us |   2.48% |   SAME   |
|  I128   |      I32      |      2^24      |     2^8      |  92.431 us |       1.12% |  92.271 us |       1.12% |  -0.160 us |  -0.17% |   SAME   |
|  I128   |      I32      |      2^28      |     2^8      |   1.194 ms |       0.11% |   1.194 ms |       0.12% |   0.075 us |   0.01% |   SAME   |
|   F32   |      I32      |      2^16      |     2^1      |  14.120 us |      10.02% |  13.124 us |       5.07% |  -0.995 us |  -7.05% |   FAST   |
|   F32   |      I32      |      2^20      |     2^1      |  18.686 us |       6.25% |  17.225 us |       4.02% |  -1.461 us |  -7.82% |   FAST   |
|   F32   |      I32      |      2^24      |     2^1      |  67.670 us |       1.59% |  62.521 us |       0.54% |  -5.148 us |  -7.61% |   FAST   |
|   F32   |      I32      |      2^28      |     2^1      | 794.389 us |       0.41% | 733.871 us |       0.47% | -60.518 us |  -7.62% |   FAST   |
|   F32   |      I32      |      2^16      |     2^4      |  14.348 us |       7.60% |  13.327 us |       2.06% |  -1.021 us |  -7.11% |   FAST   |
|   F32   |      I32      |      2^20      |     2^4      |  18.517 us |       6.66% |  17.292 us |       3.22% |  -1.226 us |  -6.62% |   FAST   |
|   F32   |      I32      |      2^24      |     2^4      |  63.721 us |       1.75% |  58.781 us |       1.40% |  -4.940 us |  -7.75% |   FAST   |
|   F32   |      I32      |      2^28      |     2^4      | 783.656 us |       0.28% | 726.163 us |       0.16% | -57.493 us |  -7.34% |   FAST   |
|   F32   |      I32      |      2^16      |     2^8      |  13.417 us |      11.79% |  13.065 us |       5.66% |  -0.352 us |  -2.63% |   SAME   |
|   F32   |      I32      |      2^20      |     2^8      |  18.352 us |       7.06% |  17.257 us |       3.30% |  -1.095 us |  -5.96% |   FAST   |
|   F32   |      I32      |      2^24      |     2^8      |  60.193 us |       2.28% |  56.364 us |       0.54% |  -3.829 us |  -6.36% |   FAST   |
|   F32   |      I32      |      2^28      |     2^8      | 717.542 us |       0.49% | 676.385 us |       0.19% | -41.157 us |  -5.74% |   FAST   |
|   F64   |      I32      |      2^16      |     2^1      |  14.185 us |       9.61% |  14.405 us |       7.58% |   0.220 us |   1.55% |   SAME   |
|   F64   |      I32      |      2^20      |     2^1      |  19.105 us |       6.66% |  19.722 us |       3.60% |   0.617 us |   3.23% |   SAME   |
|   F64   |      I32      |      2^24      |     2^1      |  72.653 us |       2.22% |  72.483 us |       1.98% |  -0.170 us |  -0.23% |   SAME   |
|   F64   |      I32      |      2^28      |     2^1      | 908.685 us |       0.56% | 909.686 us |       0.61% |   1.001 us |   0.11% |   SAME   |
|   F64   |      I32      |      2^16      |     2^4      |  14.204 us |       7.30% |  14.308 us |       7.31% |   0.104 us |   0.73% |   SAME   |
|   F64   |      I32      |      2^20      |     2^4      |  18.919 us |       6.53% |  19.492 us |       3.98% |   0.573 us |   3.03% |   SAME   |
|   F64   |      I32      |      2^24      |     2^4      |  69.490 us |       2.13% |  69.289 us |       1.95% |  -0.201 us |  -0.29% |   SAME   |
|   F64   |      I32      |      2^28      |     2^4      | 851.832 us |       0.43% | 851.940 us |       0.42% |   0.108 us |   0.01% |   SAME   |
|   F64   |      I32      |      2^16      |     2^8      |  14.254 us |       7.38% |  14.300 us |       7.36% |   0.046 us |   0.32% |   SAME   |
|   F64   |      I32      |      2^20      |     2^8      |  18.924 us |       6.01% |  19.563 us |       3.54% |   0.639 us |   3.38% |   SAME   |
|   F64   |      I32      |      2^24      |     2^8      |  67.096 us |       2.69% |  67.116 us |       2.53% |   0.021 us |   0.03% |   SAME   |
|   F64   |      I32      |      2^28      |     2^8      | 802.087 us |       0.54% | 802.399 us |       0.55% |   0.312 us |   0.04% |   SAME   |
|   C64   |      I32      |      2^16      |     2^1      |  15.258 us |       3.70% |  15.427 us |       2.27% |   0.169 us |   1.11% |   SAME   |
|   C64   |      I32      |      2^20      |     2^1      |  20.823 us |       5.05% |  21.359 us |       3.01% |   0.536 us |   2.57% |   SAME   |
|   C64   |      I32      |      2^24      |     2^1      |  96.978 us |       1.36% |  96.970 us |       1.41% |  -0.008 us |  -0.01% |   SAME   |
|   C64   |      I32      |      2^28      |     2^1      |   1.263 ms |       0.42% |   1.263 ms |       0.46% |   0.374 us |   0.03% |   SAME   |
|   C64   |      I32      |      2^16      |     2^4      |  15.321 us |       3.32% |  15.393 us |       2.21% |   0.072 us |   0.47% |   SAME   |
|   C64   |      I32      |      2^20      |     2^4      |  20.443 us |       6.92% |  21.105 us |       4.96% |   0.662 us |   3.24% |   SAME   |
|   C64   |      I32      |      2^24      |     2^4      |  96.055 us |       1.68% |  95.705 us |       1.59% |  -0.350 us |  -0.36% |   SAME   |
|   C64   |      I32      |      2^28      |     2^4      |   1.260 ms |       0.37% |   1.261 ms |       0.36% |   0.535 us |   0.04% |   SAME   |
|   C64   |      I32      |      2^16      |     2^8      |  15.192 us |       4.19% |  15.372 us |       2.59% |   0.180 us |   1.19% |   SAME   |
|   C64   |      I32      |      2^20      |     2^8      |  20.112 us |       7.92% |  20.815 us |       5.58% |   0.703 us |   3.50% |   SAME   |
|   C64   |      I32      |      2^24      |     2^8      |  91.645 us |       1.39% |  91.565 us |       1.42% |  -0.081 us |  -0.09% |   SAME   |
|   C64   |      I32      |      2^28      |     2^8      |   1.201 ms |       0.31% |   1.202 ms |       0.31% |   0.994 us |   0.08% |   SAME   |
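
To re-check the `Diff` and `%Diff` columns above, here is a minimal sketch that parses one row of the comparison table and recomputes the relative difference from the `Ref Time` and `Cmp Time` cells. The helper names (`to_us`, `parse_row`, `pct_diff`) are hypothetical and not part of any benchmark tooling, and the column order is assumed to match the table header in this comment. The times are printed with three decimals, so recomputed values can differ slightly from the printed ones, most visibly on the `ms` rows.

```python
# Conversion factors to microseconds for the time units that appear in the table.
UNIT_TO_US = {"ns": 1e-3, "us": 1.0, "ms": 1e3, "s": 1e6}


def to_us(cell):
    """Convert a cell such as '760.875 us' or '1.290 ms' to microseconds."""
    value, unit = cell.split()
    return float(value) * UNIT_TO_US[unit]


def parse_row(line):
    """Parse one data row of the comparison table (assumed column order)."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    return {
        "type": cells[0],
        "offset": cells[1],
        "elements": cells[2],
        "max_seg_size": cells[3],
        "ref_us": to_us(cells[4]),
        "ref_noise": float(cells[5].rstrip("%")),
        "cmp_us": to_us(cells[6]),
        "cmp_noise": float(cells[7].rstrip("%")),
        "status": cells[10],
    }


def pct_diff(row):
    """Relative change of Cmp against Ref in percent (negative means faster)."""
    return 100.0 * (row["cmp_us"] - row["ref_us"]) / row["ref_us"]


if __name__ == "__main__":
    line = ("|   I8    |      I32      |      2^28      |     2^1      | 760.875 us "
            "|       0.45% | 677.138 us |       0.62% | -83.738 us | -11.01% |   FAST   |")
    row = parse_row(line)
    print(f"{row['type']} {row['elements']}: {pct_diff(row):+.2f}% ({row['status']})")
```

Normalizing everything to microseconds first keeps the `us` and `ms` rows comparable without special cases.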

@gonidelis gonidelis self-requested a review February 4, 2025 09:31
@bernhardmgruber
Contributor Author

I think we should revert that tuning for run_length_encode.non_trivial_runs:

|  I128   |      I32      |      2^16      |     2^1      |  13.482 us |       8.88% |  14.087 us |       7.23% |   0.605 us |   4.49% |   SAME   |
|  I128   |      I32      |      2^20      |     2^1      |  20.686 us |       5.10% |  21.336 us |       3.01% |   0.649 us |   3.14% |   SLOW   |
|  I128   |      I32      |      2^24      |     2^1      |  98.081 us |       1.07% |  98.115 us |       1.04% |   0.034 us |   0.03% |   SAME   |
|  I128   |      I32      |      2^28      |     2^1      |   1.290 ms |       0.43% |   1.289 ms |       0.45% |  -0.259 us |  -0.02% |   SAME   |
|  I128   |      I32      |      2^16      |     2^4      |  13.429 us |      12.03% |  14.061 us |       7.91% |   0.632 us |   4.71% |   SAME   |
|  I128   |      I32      |      2^20      |     2^4      |  20.710 us |       5.20% |  21.349 us |       3.34% |   0.639 us |   3.09% |   SAME   |
|  I128   |      I32      |      2^24      |     2^4      |  95.632 us |       0.99% |  95.770 us |       0.96% |   0.137 us |   0.14% |   SAME   |
|  I128   |      I32      |      2^28      |     2^4      |   1.251 ms |       0.12% |   1.251 ms |       0.12% |  -0.059 us |  -0.00% |   SAME   |
|  I128   |      I32      |      2^16      |     2^8      |  13.027 us |      10.61% |  13.636 us |       7.05% |   0.609 us |   4.68% |   SAME   |
|  I128   |      I32      |      2^20      |     2^8      |  20.551 us |       6.82% |  21.060 us |       4.33% |   0.509 us |   2.48% |   SAME   |
|  I128   |      I32      |      2^24      |     2^8      |  92.431 us |       1.12% |  92.271 us |       1.12% |  -0.160 us |  -0.17% |   SAME   |
|  I128   |      I32      |      2^28      |     2^8      |   1.194 ms |       0.11% |   1.194 ms |       0.12% |   0.075 us |   0.01% |   SAME   |

@gonidelis
Member

I think we should revert that tuning for run_length_encode.non_trivial_runs:

https://github.com/NVIDIA/cccl/pull/3546/files#diff-04fb86d848b1695abf1e977cc4b7ba7b55f0ce42f0ff50c37ccb9b62ebed27f3R579

There is no tuning for that workload. Something is getting picked up wrong maybe? Or a machine hiccup. @bernhardmgruber
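
For context on how a single `SLOW` row can show up among otherwise `SAME` rows: the statuses in the tables of this thread look consistent with a simple noise threshold, where a row is only flagged when `|%Diff|` exceeds the smaller of the two noise values. That rule is an assumption inferred from the rows shown here, not a confirmed description of the comparison script, and `classify` is a hypothetical helper. A minimal sketch:

```python
def classify(pct_diff, ref_noise, cmp_noise):
    """Label a benchmark row as FAST, SLOW, or SAME.

    Assumed rule (inferred from the tables in this thread, not confirmed):
    a difference counts only if it exceeds the smaller of the two noise values.
    All arguments are percentages, e.g. 3.14 for 3.14%.
    """
    threshold = min(ref_noise, cmp_noise)
    if abs(pct_diff) <= threshold:
        return "SAME"
    return "FAST" if pct_diff < 0 else "SLOW"


if __name__ == "__main__":
    # I128, 2^20 elements, MaxSegSize 2^1: 3.14% exceeds the 3.01% Cmp noise.
    print(classify(3.14, 5.10, 3.01))
    # I128, 2^20 elements, MaxSegSize 2^4: 3.09% stays below the 3.34% Cmp noise.
    print(classify(3.09, 5.20, 3.34))
```

Under this reading, the two I128 rows at 2^20 elements differ by well under a microsecond in absolute terms, and which side of the threshold each lands on depends mostly on the measured noise of that run.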

@bernhardmgruber
Contributor Author

bernhardmgruber commented Feb 5, 2025

I think we should revert that tuning for run_length_encode.non_trivial_runs:
There is no tuning for that workload. Something is getting picked up wrong maybe? Or a machine hiccup. @bernhardmgruber

Well done! The fastest fix is to prove me wrong! :D

@bernhardmgruber
Contributor Author

Alright, this tuning LGTM!

@bernhardmgruber bernhardmgruber enabled auto-merge (squash) February 5, 2025 16:35
@bernhardmgruber bernhardmgruber enabled auto-merge (squash) February 5, 2025 16:36
Contributor

github-actions bot commented Feb 5, 2025

🟨 CI finished in 1h 48m: Pass: 98%/90 | Total: 2d 03h | Avg: 34m 00s | Max: 1h 20m | Hits: 162%/13398
  • 🟨 cub: Pass: 97%/44 | Total: 1d 07h | Avg: 42m 42s | Max: 1h 20m | Hits: 92%/4168

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/42  | Total:  1d 06h | Avg: 43m 05s | Max:  1h 20m | Hits:  92%/4168  
      🟩 arm64              Pass: 100%/2   | Total:  1h 09m | Avg: 34m 31s | Max:  1h 03m
    🔍 ctk: 12.8 🔍
      🟩 12.0               Pass: 100%/5   | Total:  3h 31m | Avg: 42m 21s | Max:  1h 09m | Hits:  94%/1042  
      🟩 12.5               Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 16m
      🔍 12.8               Pass:  97%/37  | Total:  1d 01h | Avg: 41m 00s | Max:  1h 20m | Hits:  91%/3126  
    🔍 cudacxx: nvcc12.8 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 01m | Avg: 30m 59s | Max: 57m 15s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 31m | Avg: 42m 21s | Max:  1h 09m | Hits:  94%/1042  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 16m
      🔍 nvcc12.8           Pass:  97%/35  | Total:  1d 00h | Avg: 41m 34s | Max:  1h 20m | Hits:  91%/3126  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 01m | Avg: 30m 59s | Max: 57m 15s
      🔍 nvcc               Pass:  97%/42  | Total:  1d 06h | Avg: 43m 15s | Max:  1h 20m | Hits:  92%/4168  
    🔍 cxx: Clang18 🔍
      🟩 Clang14            Pass: 100%/4   | Total: 54m 18s | Avg: 13m 34s | Max: 36m 49s
      🟩 Clang15            Pass: 100%/2   | Total: 12m 41s | Avg:  6m 20s | Max:  6m 28s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 17m | Avg: 38m 31s | Max: 39m 36s
      🟩 Clang17            Pass: 100%/2   | Total: 46m 35s | Avg: 23m 17s | Max: 39m 54s
      🔍 Clang18            Pass:  85%/7   | Total:  2h 13m | Avg: 19m 07s | Max: 57m 15s
      🟩 GCC7               Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 05m
      🟩 GCC8               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 GCC9               Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 08m
      🟩 GCC10              Pass: 100%/2   | Total:  1h 59m | Avg: 59m 35s | Max:  1h 00m
      🟩 GCC11              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m
      🟩 GCC12              Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 07m
      🟩 GCC13              Pass: 100%/10  | Total:  6h 49m | Avg: 40m 57s | Max:  1h 16m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 17m | Hits:  94%/2084  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 34m | Avg:  1h 17m | Max:  1h 20m | Hits:  90%/2084  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 16m
    🔍 cxx_family: Clang 🔍
      🔍 Clang              Pass:  94%/17  | Total:  5h 24m | Avg: 19m 05s | Max: 57m 15s
      🟩 GCC                Pass: 100%/21  | Total: 18h 23m | Avg: 52m 32s | Max:  1h 16m
      🟩 MSVC               Pass: 100%/4   | Total:  5h 01m | Avg:  1h 15m | Max:  1h 20m | Hits:  92%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 16m
    🔍 gpu: rtxa6000 🔍
      🟩 h100               Pass: 100%/2   | Total: 56m 09s | Avg: 28m 04s | Max: 30m 12s
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 02h | Avg: 47m 28s | Max:  1h 20m | Hits:  92%/4168  
      🔍 rtxa6000           Pass:  87%/8   | Total:  3h 28m | Avg: 26m 04s | Max:  1h 07m
    🔍 jobs: HostLaunch 🔍
      🟩 Build              Pass: 100%/37  | Total:  1d 05h | Avg: 47m 15s | Max:  1h 20m | Hits:  92%/4168  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 25s | Avg: 20m 25s | Max: 20m 25s
      🟩 GraphCapture       Pass: 100%/1   | Total: 15m 43s | Avg: 15m 43s | Max: 15m 43s
      🔍 HostLaunch         Pass:  66%/3   | Total: 51m 52s | Avg: 17m 17s | Max: 25m 57s
      🟩 TestGPU            Pass: 100%/2   | Total: 42m 23s | Avg: 21m 11s | Max: 21m 32s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 16h 53m | Avg: 50m 39s | Max:  1h 20m | Hits:  94%/3126  
      🔍 20                 Pass:  95%/24  | Total: 14h 25m | Avg: 36m 04s | Max:  1h 16m | Hits:  85%/1042  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 56m 09s | Avg: 28m 04s | Max: 30m 12s
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 16m | Avg:  1h 16m | Max:  1h 16m
    
  • 🟩 thrust: Pass: 100%/43 | Total: 19h 02m | Avg: 26m 34s | Max: 1h 15m | Hits: 194%/9230

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 39m 54s | Avg: 19m 57s | Max: 29m 35s
    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total: 18h 23m | Avg: 26m 54s | Max:  1h 15m | Hits: 194%/9230  
      🟩 arm64              Pass: 100%/2   | Total: 39m 46s | Avg: 19m 53s | Max: 35m 07s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 15m | Avg: 27m 04s | Max: 53m 26s | Hits: 147%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
      🟩 12.8               Pass: 100%/36  | Total: 14h 23m | Avg: 23m 59s | Max:  1h 00m | Hits: 205%/7384  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 48s | Avg:  5m 24s | Max:  5m 28s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 15m | Avg: 27m 04s | Max: 53m 26s | Hits: 147%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
      🟩 nvcc12.8           Pass: 100%/34  | Total: 14h 13m | Avg: 25m 05s | Max:  1h 00m | Hits: 205%/7384  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 48s | Avg:  5m 24s | Max:  5m 28s
      🟩 nvcc               Pass: 100%/41  | Total: 18h 51m | Avg: 27m 36s | Max:  1h 15m | Hits: 194%/9230  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 20m 58s | Avg:  5m 14s | Max:  5m 25s
      🟩 Clang15            Pass: 100%/2   | Total: 11m 22s | Avg:  5m 41s | Max:  5m 42s
      🟩 Clang16            Pass: 100%/2   | Total: 12m 03s | Avg:  6m 01s | Max:  6m 13s
      🟩 Clang17            Pass: 100%/2   | Total: 12m 02s | Avg:  6m 01s | Max:  6m 02s
      🟩 Clang18            Pass: 100%/7   | Total: 45m 25s | Avg:  6m 29s | Max: 10m 23s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 08m | Avg: 34m 10s | Max: 34m 48s
      🟩 GCC8               Pass: 100%/1   | Total: 39m 01s | Avg: 39m 01s | Max: 39m 01s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 16m | Avg: 38m 00s | Max: 38m 16s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 15m | Avg: 37m 58s | Max: 38m 28s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 17m | Avg: 38m 31s | Max: 38m 41s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 24m | Avg: 42m 10s | Max: 42m 48s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 38m | Avg: 27m 22s | Max: 42m 04s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 46m | Avg: 53m 27s | Max: 53m 29s | Hits: 155%/3692  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 30m | Avg: 50m 16s | Max:  1h 00m | Hits: 220%/5538  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 41m | Avg:  5m 59s | Max: 10m 23s
      🟩 GCC                Pass: 100%/19  | Total: 10h 39m | Avg: 33m 40s | Max: 42m 48s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 17m | Avg: 51m 33s | Max:  1h 00m | Hits: 194%/9230  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/33  | Total: 15h 23m | Avg: 27m 58s | Max:  1h 15m | Hits: 152%/5538  
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 39m | Avg: 21m 58s | Max:  1h 00m | Hits: 256%/3692  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 17h 40m | Avg: 28m 39s | Max:  1h 15m | Hits: 151%/7384  
      🟩 TestCPU            Pass: 100%/3   | Total: 50m 28s | Avg: 16m 49s | Max: 34m 14s | Hits: 365%/1846  
      🟩 TestGPU            Pass: 100%/3   | Total: 31m 54s | Avg: 10m 38s | Max: 11m 12s
    🟩 sm
      🟩 90;90a;100         Pass: 100%/1   | Total: 42m 04s | Avg: 42m 04s | Max: 42m 04s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 10h 14m | Avg: 30m 42s | Max:  1h 08m | Hits: 152%/5538  
      🟩 20                 Pass: 100%/21  | Total:  8h 08m | Avg: 23m 16s | Max:  1h 15m | Hits: 256%/3692  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 7m 24s | Avg: 3m 42s | Max: 5m 09s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 15s | Avg:  2m 15s | Max:  2m 15s
      🟩 Test               Pass: 100%/1   | Total:  5m 09s | Avg:  5m 09s | Max:  5m 09s
    
  • 🟩 python: Pass: 100%/1 | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 90)

# Runner
65 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-gpu-h100-latest-1

Contributor

github-actions bot commented Feb 5, 2025

🟩 CI finished in 2h 21m: Pass: 100%/90 | Total: 2d 03h | Avg: 34m 14s | Max: 1h 20m | Hits: 162%/13398
  • 🟩 cub: Pass: 100%/44 | Total: 1d 07h | Avg: 43m 11s | Max: 1h 20m | Hits: 92%/4168

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 06h | Avg: 43m 35s | Max:  1h 20m | Hits:  92%/4168  
      🟩 arm64              Pass: 100%/2   | Total:  1h 09m | Avg: 34m 31s | Max:  1h 03m
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 31m | Avg: 42m 21s | Max:  1h 09m | Hits:  94%/1042  
      🟩 12.5               Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 16m
      🟩 12.8               Pass: 100%/37  | Total:  1d 01h | Avg: 41m 34s | Max:  1h 20m | Hits:  91%/3126  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 01m | Avg: 30m 59s | Max: 57m 15s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 31m | Avg: 42m 21s | Max:  1h 09m | Hits:  94%/1042  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 16m
      🟩 nvcc12.8           Pass: 100%/35  | Total:  1d 00h | Avg: 42m 11s | Max:  1h 20m | Hits:  91%/3126  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 01m | Avg: 30m 59s | Max: 57m 15s
      🟩 nvcc               Pass: 100%/42  | Total:  1d 06h | Avg: 43m 46s | Max:  1h 20m | Hits:  92%/4168  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 54m 18s | Avg: 13m 34s | Max: 36m 49s
      🟩 Clang15            Pass: 100%/2   | Total: 12m 41s | Avg:  6m 20s | Max:  6m 28s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 17m | Avg: 38m 31s | Max: 39m 36s
      🟩 Clang17            Pass: 100%/2   | Total: 46m 35s | Avg: 23m 17s | Max: 39m 54s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 34m | Avg: 22m 07s | Max: 57m 15s
      🟩 GCC7               Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 05m
      🟩 GCC8               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 GCC9               Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 08m
      🟩 GCC10              Pass: 100%/2   | Total:  1h 59m | Avg: 59m 35s | Max:  1h 00m
      🟩 GCC11              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m
      🟩 GCC12              Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 07m
      🟩 GCC13              Pass: 100%/10  | Total:  6h 49m | Avg: 40m 57s | Max:  1h 16m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 17m | Hits:  94%/2084  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 34m | Avg:  1h 17m | Max:  1h 20m | Hits:  90%/2084  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 16m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  5h 45m | Avg: 20m 19s | Max: 57m 15s
      🟩 GCC                Pass: 100%/21  | Total: 18h 23m | Avg: 52m 32s | Max:  1h 16m
      🟩 MSVC               Pass: 100%/4   | Total:  5h 01m | Avg:  1h 15m | Max:  1h 20m | Hits:  92%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 16m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 56m 09s | Avg: 28m 04s | Max: 30m 12s
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 02h | Avg: 47m 28s | Max:  1h 20m | Hits:  92%/4168  
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 49m | Avg: 28m 42s | Max:  1h 07m
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 05h | Avg: 47m 15s | Max:  1h 20m | Hits:  92%/4168  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 25s | Avg: 20m 25s | Max: 20m 25s
      🟩 GraphCapture       Pass: 100%/1   | Total: 15m 43s | Avg: 15m 43s | Max: 15m 43s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 12m | Avg: 24m 18s | Max: 25m 57s
      🟩 TestGPU            Pass: 100%/2   | Total: 42m 23s | Avg: 21m 11s | Max: 21m 32s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 56m 09s | Avg: 28m 04s | Max: 30m 12s
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 16m | Avg:  1h 16m | Max:  1h 16m
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 16h 53m | Avg: 50m 39s | Max:  1h 20m | Hits:  94%/3126  
      🟩 20                 Pass: 100%/24  | Total: 14h 46m | Avg: 36m 57s | Max:  1h 16m | Hits:  85%/1042  
    
  • 🟩 thrust: Pass: 100%/43 | Total: 19h 02m | Avg: 26m 34s | Max: 1h 15m | Hits: 194%/9230

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 39m 54s | Avg: 19m 57s | Max: 29m 35s
    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total: 18h 23m | Avg: 26m 54s | Max:  1h 15m | Hits: 194%/9230  
      🟩 arm64              Pass: 100%/2   | Total: 39m 46s | Avg: 19m 53s | Max: 35m 07s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 15m | Avg: 27m 04s | Max: 53m 26s | Hits: 147%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
      🟩 12.8               Pass: 100%/36  | Total: 14h 23m | Avg: 23m 59s | Max:  1h 00m | Hits: 205%/7384  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 48s | Avg:  5m 24s | Max:  5m 28s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 15m | Avg: 27m 04s | Max: 53m 26s | Hits: 147%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
      🟩 nvcc12.8           Pass: 100%/34  | Total: 14h 13m | Avg: 25m 05s | Max:  1h 00m | Hits: 205%/7384  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 48s | Avg:  5m 24s | Max:  5m 28s
      🟩 nvcc               Pass: 100%/41  | Total: 18h 51m | Avg: 27m 36s | Max:  1h 15m | Hits: 194%/9230  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 20m 58s | Avg:  5m 14s | Max:  5m 25s
      🟩 Clang15            Pass: 100%/2   | Total: 11m 22s | Avg:  5m 41s | Max:  5m 42s
      🟩 Clang16            Pass: 100%/2   | Total: 12m 03s | Avg:  6m 01s | Max:  6m 13s
      🟩 Clang17            Pass: 100%/2   | Total: 12m 02s | Avg:  6m 01s | Max:  6m 02s
      🟩 Clang18            Pass: 100%/7   | Total: 45m 25s | Avg:  6m 29s | Max: 10m 23s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 08m | Avg: 34m 10s | Max: 34m 48s
      🟩 GCC8               Pass: 100%/1   | Total: 39m 01s | Avg: 39m 01s | Max: 39m 01s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 16m | Avg: 38m 00s | Max: 38m 16s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 15m | Avg: 37m 58s | Max: 38m 28s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 17m | Avg: 38m 31s | Max: 38m 41s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 24m | Avg: 42m 10s | Max: 42m 48s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 38m | Avg: 27m 22s | Max: 42m 04s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 46m | Avg: 53m 27s | Max: 53m 29s | Hits: 155%/3692  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 30m | Avg: 50m 16s | Max:  1h 00m | Hits: 220%/5538  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 41m | Avg:  5m 59s | Max: 10m 23s
      🟩 GCC                Pass: 100%/19  | Total: 10h 39m | Avg: 33m 40s | Max: 42m 48s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 17m | Avg: 51m 33s | Max:  1h 00m | Hits: 194%/9230  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 15m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/33  | Total: 15h 23m | Avg: 27m 58s | Max:  1h 15m | Hits: 152%/5538  
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 39m | Avg: 21m 58s | Max:  1h 00m | Hits: 256%/3692  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 17h 40m | Avg: 28m 39s | Max:  1h 15m | Hits: 151%/7384  
      🟩 TestCPU            Pass: 100%/3   | Total: 50m 28s | Avg: 16m 49s | Max: 34m 14s | Hits: 365%/1846  
      🟩 TestGPU            Pass: 100%/3   | Total: 31m 54s | Avg: 10m 38s | Max: 11m 12s
    🟩 sm
      🟩 90;90a;100         Pass: 100%/1   | Total: 42m 04s | Avg: 42m 04s | Max: 42m 04s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 10h 14m | Avg: 30m 42s | Max:  1h 08m | Hits: 152%/5538  
      🟩 20                 Pass: 100%/21  | Total:  8h 08m | Avg: 23m 16s | Max:  1h 15m | Hits: 256%/3692  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 7m 24s | Avg: 3m 42s | Max: 5m 09s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  7m 24s | Avg:  3m 42s | Max:  5m 09s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 15s | Avg:  2m 15s | Max:  2m 15s
      🟩 Test               Pass: 100%/1   | Total:  5m 09s | Avg:  5m 09s | Max:  5m 09s
    
  • 🟩 python: Pass: 100%/1 | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 31m 01s | Avg: 31m 01s | Max: 31m 01s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 90)

# Runner
65 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-gpu-h100-latest-1

@bernhardmgruber bernhardmgruber merged commit 7919645 into NVIDIA:main Feb 5, 2025
102 of 105 checks passed
Contributor

github-actions bot commented Feb 5, 2025

Backport failed for branch/2.8.x, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin branch/2.8.x
git worktree add -d .worktree/backport-3546-to-branch/2.8.x origin/branch/2.8.x
cd .worktree/backport-3546-to-branch/2.8.x
git switch --create backport-3546-to-branch/2.8.x
git cherry-pick -x 791964587baf970b5a8657ef2a54aa11ad887e5f

@bernhardmgruber bernhardmgruber deleted the tune_rle branch February 5, 2025 18:09
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this pull request Feb 5, 2025
miscco pushed a commit that referenced this pull request Feb 5, 2025
Projects
Status: Done
4 participants