Safe warp shuffle operations #3907

Open, wants to merge 23 commits into base: main

fbusato (Contributor) commented Feb 22, 2025

Fixes #2976

Documentation preview: Warp Shuffle Docs.pdf

fbusato self-assigned this on Feb 22, 2025

copy-pr-bot bot commented Feb 22, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

fbusato marked this pull request as ready for review on February 25, 2025 00:02
fbusato requested review from a team as code owners on February 25, 2025 00:02

🟨 CI finished in 1h 35m: Pass: 77%/158 | Total: 2d 22h | Avg: 26m 54s | Max: 1h 20m | Hits: 68%/158505
  • 🟨 libcudacxx: Pass: 18%/43 | Total: 3h 55m | Avg: 5m 29s | Max: 25m 30s | Hits: 83%/13312

    🟨 ctk
      🟨 12.0               Pass:  20%/5   | Total: 36m 40s | Avg:  7m 20s | Max: 24m 58s | Hits:  34%/2546  
      🟩 12.5               Pass: 100%/2   | Total: 21m 13s | Avg: 10m 36s | Max: 12m 00s | Hits:  93%/5614  
      🟨 12.8               Pass:  13%/36  | Total:  2h 58m | Avg:  4m 56s | Max: 25m 30s | Hits:  97%/5152  
    🟨 cudacxx
      🟥 ClangCUDA18        Pass:   0%/2   | Total:  6m 11s | Avg:  3m 05s | Max:  3m 19s
      🟨 nvcc12.0           Pass:  20%/5   | Total: 36m 40s | Avg:  7m 20s | Max: 24m 58s | Hits:  34%/2546  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 21m 13s | Avg: 10m 36s | Max: 12m 00s | Hits:  93%/5614  
      🟨 nvcc12.8           Pass:  14%/34  | Total:  2h 51m | Avg:  5m 03s | Max: 25m 30s | Hits:  97%/5152  
    🟨 cxx
      🟥 Clang14            Pass:   0%/4   | Total: 12m 19s | Avg:  3m 04s | Max:  3m 11s
      🟥 Clang15            Pass:   0%/2   | Total:  6m 08s | Avg:  3m 04s | Max:  3m 09s
      🟥 Clang16            Pass:   0%/2   | Total:  6m 17s | Avg:  3m 08s | Max:  3m 11s
      🟥 Clang17            Pass:   0%/2   | Total:  6m 24s | Avg:  3m 12s | Max:  3m 18s
      🟥 Clang18            Pass:   0%/6   | Total: 14m 55s | Avg:  2m 29s | Max:  3m 19s
      🟥 GCC7               Pass:   0%/2   | Total:  5m 33s | Avg:  2m 46s | Max:  2m 49s
      🟥 GCC8               Pass:   0%/1   | Total:  2m 55s | Avg:  2m 55s | Max:  2m 55s
      🟥 GCC9               Pass:   0%/2   | Total:  6m 08s | Avg:  3m 04s | Max:  3m 19s
      🟥 GCC10              Pass:   0%/2   | Total:  5m 58s | Avg:  2m 59s | Max:  3m 04s
      🟥 GCC11              Pass:   0%/2   | Total:  5m 59s | Avg:  2m 59s | Max:  3m 01s
      🟥 GCC12              Pass:   0%/2   | Total:  6m 07s | Avg:  3m 03s | Max:  3m 06s
      🟨 GCC13              Pass:  30%/10  | Total: 49m 18s | Avg:  4m 55s | Max: 16m 09s | Hits:  90%/40    
      🟩 MSVC14.29          Pass: 100%/2   | Total: 48m 40s | Avg: 24m 20s | Max: 24m 58s | Hits:  65%/5102  
      🟨 MSVC14.42          Pass:  50%/2   | Total: 38m 04s | Avg: 19m 02s | Max: 25m 30s | Hits:  97%/2556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 21m 13s | Avg: 10m 36s | Max: 12m 00s | Hits:  93%/5614  
    🟨 cxx_family
      🟥 Clang              Pass:   0%/16  | Total: 46m 03s | Avg:  2m 52s | Max:  3m 19s
      🟨 GCC                Pass:  14%/21  | Total:  1h 21m | Avg:  3m 54s | Max: 16m 09s | Hits:  90%/40    
      🟨 MSVC               Pass:  75%/4   | Total:  1h 26m | Avg: 21m 41s | Max: 25m 30s | Hits:  76%/7658  
      🟩 NVHPC              Pass: 100%/2   | Total: 21m 13s | Avg: 10m 36s | Max: 12m 00s | Hits:  93%/5614  
    🟨 jobs
      🟨 Build              Pass:  13%/37  | Total:  3h 21m | Avg:  5m 26s | Max: 25m 30s | Hits:  83%/13272 
      🟩 NVRTC              Pass: 100%/2   | Total: 32m 02s | Avg: 16m 01s | Max: 16m 09s | Hits:  90%/40    
      🟥 Test               Pass:   0%/3  
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 21s | Avg:  2m 21s | Max:  2m 21s
    🟨 sm
      🟩 75                 Pass: 100%/2   | Total: 32m 02s | Avg: 16m 01s | Max: 16m 09s | Hits:  90%/40    
      🟥 90                 Pass:   0%/2   | Total:  2m 45s | Avg:  1m 22s | Max:  2m 45s
      🟥 90;90a;100         Pass:   0%/1   | Total:  3m 05s | Avg:  3m 05s | Max:  3m 05s
    🟨 cpu
      🟨 amd64              Pass:  19%/41  | Total:  3h 50m | Avg:  5m 37s | Max: 25m 30s | Hits:  83%/13312 
      🟥 arm64              Pass:   0%/2   | Total:  5m 17s | Avg:  2m 38s | Max:  2m 49s
    🟨 cudacxx_family
      🟥 ClangCUDA          Pass:   0%/2   | Total:  6m 11s | Avg:  3m 05s | Max:  3m 19s
      🟨 nvcc               Pass:  19%/41  | Total:  3h 49m | Avg:  5m 36s | Max: 25m 30s | Hits:  83%/13312 
    🟨 gpu
      🟥 h100               Pass:   0%/2   | Total:  2m 45s | Avg:  1m 22s | Max:  2m 45s
      🟨 rtx2080            Pass:  19%/41  | Total:  3h 53m | Avg:  5m 41s | Max: 25m 30s | Hits:  83%/13312 
    🟨 std
      🟨 17                 Pass:  23%/21  | Total:  2h 30m | Avg:  7m 10s | Max: 25m 30s | Hits:  80%/10464 
      🟨 20                 Pass:   9%/21  | Total:  1h 23m | Avg:  3m 57s | Max: 16m 09s | Hits:  96%/2848  
    
  • 🟩 cub: Pass: 100%/45 | Total: 1d 17h | Avg: 55m 27s | Max: 1h 20m | Hits: 44%/53485

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 15h | Avg: 55m 03s | Max:  1h 20m | Hits:  44%/51055 
      🟩 arm64              Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 04m | Hits:  33%/2430  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  5h 07m | Avg:  1h 01m | Max:  1h 10m | Hits:  30%/5908  
      🟩 12.5               Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 15m | Hits:  31%/2248  
      🟩 12.8               Pass: 100%/38  | Total:  1d 10h | Avg: 53m 45s | Max:  1h 20m | Hits:  46%/45329 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 08m | Hits:  34%/2100  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  5h 07m | Avg:  1h 01m | Max:  1h 10m | Hits:  30%/5908  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 15m | Hits:  31%/2248  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  1d 07h | Avg: 53m 07s | Max:  1h 20m | Hits:  47%/43229 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 08m | Hits:  34%/2100  
      🟩 nvcc               Pass: 100%/43  | Total:  1d 15h | Avg: 55m 00s | Max:  1h 20m | Hits:  44%/51385 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 57m | Avg: 59m 20s | Max:  1h 02m | Hits:  34%/4868  
      🟩 Clang15            Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m | Hits:  34%/2430  
      🟩 Clang16            Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 02m | Hits:  34%/2430  
      🟩 Clang17            Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 02m | Hits:  34%/2430  
      🟩 Clang18            Pass: 100%/7   | Total:  6h 02m | Avg: 51m 43s | Max:  1h 08m | Hits:  53%/8175  
      🟩 GCC7               Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 02m | Hits:  33%/2434  
      🟩 GCC8               Pass: 100%/1   | Total: 57m 58s | Avg: 57m 58s | Max: 57m 58s | Hits:  33%/1217  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 59m | Avg: 59m 44s | Max:  1h 00m | Hits:  33%/2434  
      🟩 GCC10              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m | Hits:  33%/2434  
      🟩 GCC11              Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 03m | Hits:  33%/2430  
      🟩 GCC12              Pass: 100%/2   | Total:  2h 12m | Avg:  1h 06m | Max:  1h 07m | Hits:  33%/2430  
      🟩 GCC13              Pass: 100%/11  | Total:  6h 52m | Avg: 37m 31s | Max:  1h 12m | Hits:  69%/13365 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 24m | Avg:  1h 12m | Max:  1h 13m | Hits:  12%/2080  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  2h 35m | Avg:  1h 17m | Max:  1h 20m | Hits:  12%/2080  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 15m | Hits:  31%/2248  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 16h 02m | Avg: 56m 37s | Max:  1h 08m | Hits:  42%/20333 
      🟩 GCC                Pass: 100%/22  | Total: 18h 07m | Avg: 49m 26s | Max:  1h 12m | Hits:  51%/26744 
      🟩 MSVC               Pass: 100%/4   | Total:  4h 59m | Avg:  1h 14m | Max:  1h 20m | Hits:  12%/4160  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 15m | Hits:  31%/2248  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total:  1h 12m | Avg: 24m 03s | Max: 26m 17s | Hits:  77%/3645  
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 12h | Avg:  1h 03m | Max:  1h 20m | Hits:  31%/40120 
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 07m | Avg: 30m 56s | Max:  1h 00m | Hits:  83%/9720  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 14h | Avg:  1h 02m | Max:  1h 20m | Hits:  31%/43765 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 23m 01s | Avg: 23m 01s | Max: 23m 01s | Hits:  99%/1215  
      🟩 GraphCapture       Pass: 100%/1   | Total: 18m 23s | Avg: 18m 23s | Max: 18m 23s | Hits:  99%/1215  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 08m | Avg: 22m 52s | Max: 23m 30s | Hits:  99%/3645  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 04m | Avg: 21m 36s | Max: 22m 24s | Hits:  99%/3645  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total:  1h 12m | Avg: 24m 03s | Max: 26m 17s | Hits:  77%/3645  
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 12m | Avg:  1h 12m | Max:  1h 12m | Hits:  33%/1215  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 21h 09m | Avg:  1h 03m | Max:  1h 14m | Hits:  30%/23535 
      🟩 20                 Pass: 100%/25  | Total: 20h 25m | Avg: 49m 01s | Max:  1h 20m | Hits:  54%/29950 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 22h 14m | Avg: 29m 38s | Max: 1h 06m | Hits: 77%/80136

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 36m 49s | Avg: 18m 24s | Max: 25m 37s | Hits:  88%/3564  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 21h 21m | Avg: 29m 48s | Max:  1h 06m | Hits:  77%/76573 
      🟩 arm64              Pass: 100%/2   | Total: 52m 39s | Avg: 26m 19s | Max: 27m 57s | Hits:  76%/3563  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 53m | Avg: 34m 46s | Max: 53m 15s | Hits:  72%/8901  
      🟩 12.5               Pass: 100%/2   | Total:  1h 40m | Avg: 50m 18s | Max: 53m 23s | Hits:  64%/3562  
      🟩 12.8               Pass: 100%/38  | Total: 17h 39m | Avg: 27m 53s | Max:  1h 06m | Hits:  78%/67673 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 49m 18s | Avg: 24m 39s | Max: 24m 43s | Hits:  77%/3562  
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 53m | Avg: 34m 46s | Max: 53m 15s | Hits:  72%/8901  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 40m | Avg: 50m 18s | Max: 53m 23s | Hits:  64%/3562  
      🟩 nvcc12.8           Pass: 100%/36  | Total: 16h 50m | Avg: 28m 03s | Max:  1h 06m | Hits:  78%/64111 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 49m 18s | Avg: 24m 39s | Max: 24m 43s | Hits:  77%/3562  
      🟩 nvcc               Pass: 100%/43  | Total: 21h 24m | Avg: 29m 52s | Max:  1h 06m | Hits:  77%/76574 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 59m | Avg: 29m 45s | Max: 30m 28s | Hits:  77%/7124  
      🟩 Clang15            Pass: 100%/2   | Total: 54m 53s | Avg: 27m 26s | Max: 28m 05s | Hits:  77%/3562  
      🟩 Clang16            Pass: 100%/2   | Total: 58m 26s | Avg: 29m 13s | Max: 30m 35s | Hits:  77%/3562  
      🟩 Clang17            Pass: 100%/2   | Total: 57m 13s | Avg: 28m 36s | Max: 30m 45s | Hits:  77%/3562  
      🟩 Clang18            Pass: 100%/7   | Total:  2h 25m | Avg: 20m 47s | Max: 27m 23s | Hits:  83%/12467 
      🟩 GCC7               Pass: 100%/2   | Total: 59m 21s | Avg: 29m 40s | Max: 30m 31s | Hits:  76%/3564  
      🟩 GCC8               Pass: 100%/1   | Total: 27m 45s | Avg: 27m 45s | Max: 27m 45s | Hits:  76%/1782  
      🟩 GCC9               Pass: 100%/2   | Total:  1h 02m | Avg: 31m 24s | Max: 31m 53s | Hits:  76%/3564  
      🟩 GCC10              Pass: 100%/2   | Total:  1h 00m | Avg: 30m 13s | Max: 30m 51s | Hits:  76%/3564  
      🟩 GCC11              Pass: 100%/2   | Total: 59m 22s | Avg: 29m 41s | Max: 31m 14s | Hits:  76%/3564  
      🟩 GCC12              Pass: 100%/2   | Total:  1h 01m | Avg: 30m 44s | Max: 32m 54s | Hits:  76%/3564  
      🟩 GCC13              Pass: 100%/10  | Total:  3h 23m | Avg: 20m 19s | Max: 34m 22s | Hits:  86%/17820 
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 59m | Avg: 59m 59s | Max:  1h 06m | Hits:  54%/3550  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  2h 24m | Avg: 48m 00s | Max: 55m 25s | Hits:  60%/5325  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 40m | Avg: 50m 18s | Max: 53m 23s | Hits:  64%/3562  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  7h 15m | Avg: 25m 35s | Max: 30m 45s | Hits:  79%/30277 
      🟩 GCC                Pass: 100%/21  | Total:  8h 54m | Avg: 25m 26s | Max: 34m 22s | Hits:  81%/37422 
      🟩 MSVC               Pass: 100%/5   | Total:  4h 23m | Avg: 52m 47s | Max:  1h 06m | Hits:  58%/8875  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 40m | Avg: 50m 18s | Max: 53m 23s | Hits:  64%/3562  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 27m 51s | Avg: 13m 55s | Max: 16m 34s | Hits:  88%/3564  
      🟩 rtx2080            Pass: 100%/33  | Total: 18h 06m | Avg: 32m 54s | Max:  1h 06m | Hits:  74%/58769 
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 40m | Avg: 22m 01s | Max: 55m 25s | Hits:  85%/17803 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 20h 40m | Avg: 32m 38s | Max:  1h 06m | Hits:  74%/67671 
      🟩 TestCPU            Pass: 100%/3   | Total: 50m 03s | Avg: 16m 41s | Max: 34m 55s | Hits:  90%/5338  
      🟩 TestGPU            Pass: 100%/4   | Total: 43m 50s | Avg: 10m 57s | Max: 11m 17s | Hits:  99%/7127  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 27m 51s | Avg: 13m 55s | Max: 16m 34s | Hits:  88%/3564  
      🟩 90;90a;100         Pass: 100%/1   | Total: 34m 22s | Avg: 34m 22s | Max: 34m 22s | Hits:  76%/1782  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 11h 18m | Avg: 33m 56s | Max:  1h 06m | Hits:  73%/35611 
      🟩 20                 Pass: 100%/23  | Total: 10h 18m | Avg: 26m 53s | Max: 55m 25s | Hits:  80%/40961 
    
  • 🟩 cudax: Pass: 100%/22 | Total: 2h 09m | Avg: 5m 54s | Max: 13m 04s | Hits: 96%/11264

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  1h 55m | Avg:  6m 23s | Max: 13m 04s | Hits:  96%/9036  
      🟩 arm64              Pass: 100%/4   | Total: 14m 57s | Avg:  3m 44s | Max:  3m 54s | Hits:  98%/2228  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 10m 21s | Avg: 10m 21s | Max: 10m 21s | Hits:  60%/262   
      🟩 12.5               Pass: 100%/2   | Total: 13m 29s | Avg:  6m 44s | Max:  6m 47s | Hits:  95%/710   
      🟩 12.8               Pass: 100%/19  | Total:  1h 46m | Avg:  5m 35s | Max: 13m 04s | Hits:  97%/10292 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 21s | Avg: 10m 21s | Max: 10m 21s | Hits:  60%/262   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 13m 29s | Avg:  6m 44s | Max:  6m 47s | Hits:  95%/710   
      🟩 nvcc12.8           Pass: 100%/19  | Total:  1h 46m | Avg:  5m 35s | Max: 13m 04s | Hits:  97%/10292 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  2h 09m | Avg:  5m 54s | Max: 13m 04s | Hits:  96%/11264 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  4m 18s | Avg:  4m 18s | Max:  4m 18s | Hits:  98%/559   
      🟩 Clang15            Pass: 100%/1   | Total:  4m 05s | Avg:  4m 05s | Max:  4m 05s | Hits:  98%/557   
      🟩 Clang16            Pass: 100%/1   | Total:  4m 11s | Avg:  4m 11s | Max:  4m 11s | Hits:  98%/557   
      🟩 Clang17            Pass: 100%/1   | Total:  4m 08s | Avg:  4m 08s | Max:  4m 08s | Hits:  98%/557   
      🟩 Clang18            Pass: 100%/4   | Total: 24m 01s | Avg:  6m 00s | Max: 12m 29s | Hits:  98%/2228  
      🟩 GCC10              Pass: 100%/1   | Total:  4m 13s | Avg:  4m 13s | Max:  4m 13s | Hits:  98%/559   
      🟩 GCC11              Pass: 100%/1   | Total:  4m 28s | Avg:  4m 28s | Max:  4m 28s | Hits:  98%/557   
      🟩 GCC12              Pass: 100%/2   | Total: 17m 04s | Avg:  8m 32s | Max: 13m 04s | Hits:  98%/1114  
      🟩 GCC13              Pass: 100%/6   | Total: 29m 47s | Avg:  4m 57s | Max: 11m 15s | Hits:  98%/3342  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 21s | Avg: 10m 21s | Max: 10m 21s | Hits:  60%/262   
      🟩 MSVC14.42          Pass: 100%/1   | Total:  9m 54s | Avg:  9m 54s | Max:  9m 54s | Hits:  60%/262   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 13m 29s | Avg:  6m 44s | Max:  6m 47s | Hits:  95%/710   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 40m 43s | Avg:  5m 05s | Max: 12m 29s | Hits:  98%/4458  
      🟩 GCC                Pass: 100%/10  | Total: 55m 32s | Avg:  5m 33s | Max: 13m 04s | Hits:  98%/5572  
      🟩 MSVC               Pass: 100%/2   | Total: 20m 15s | Avg: 10m 07s | Max: 10m 21s | Hits:  60%/524   
      🟩 NVHPC              Pass: 100%/2   | Total: 13m 29s | Avg:  6m 44s | Max:  6m 47s | Hits:  95%/710   
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 14m 46s | Avg:  7m 23s | Max: 11m 15s | Hits:  98%/1114  
      🟩 rtx2080            Pass: 100%/20  | Total:  1h 55m | Avg:  5m 45s | Max: 13m 04s | Hits:  96%/10150 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  1h 33m | Avg:  4m 54s | Max: 10m 21s | Hits:  96%/9593  
      🟩 Test               Pass: 100%/3   | Total: 36m 48s | Avg: 12m 16s | Max: 13m 04s | Hits:  99%/1671  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 18m 28s | Avg:  6m 09s | Max: 11m 15s | Hits:  98%/1671  
      🟩 90a                Pass: 100%/1   | Total:  3m 36s | Avg:  3m 36s | Max:  3m 36s | Hits:  98%/557   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 17m 57s | Avg:  4m 29s | Max:  6m 47s | Hits:  97%/2026  
      🟩 20                 Pass: 100%/18  | Total:  1h 52m | Avg:  6m 13s | Max: 13m 04s | Hits:  96%/9238  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 38s | Avg: 7m 49s | Max: 13m 06s | Hits: 97%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 38s | Avg:  7m 49s | Max: 13m 06s | Hits:  97%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 38s | Avg:  7m 49s | Max: 13m 06s | Hits:  97%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 38s | Avg:  7m 49s | Max: 13m 06s | Hits:  97%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 38s | Avg:  7m 49s | Max: 13m 06s | Hits:  97%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 38s | Avg:  7m 49s | Max: 13m 06s | Hits:  97%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 38s | Avg:  7m 49s | Max: 13m 06s | Hits:  97%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 38s | Avg:  7m 49s | Max: 13m 06s | Hits:  97%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 32s | Avg:  2m 32s | Max:  2m 32s | Hits:  96%/154   
      🟩 Test               Pass: 100%/1   | Total: 13m 06s | Avg: 13m 06s | Max: 13m 06s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 39m 32s | Avg: 39m 32s | Max: 39m 32s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 39m 32s | Avg: 39m 32s | Max: 39m 32s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 39m 32s | Avg: 39m 32s | Max: 39m 32s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 39m 32s | Avg: 39m 32s | Max: 39m 32s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 39m 32s | Avg: 39m 32s | Max: 39m 32s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 39m 32s | Avg: 39m 32s | Max: 39m 32s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 39m 32s | Avg: 39m 32s | Max: 39m 32s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 39m 32s | Avg: 39m 32s | Max: 39m 32s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 39m 32s | Avg: 39m 32s | Max: 39m 32s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 158)

Count Runner
111 linux-amd64-cpu16
15 windows-amd64-cpu16
10 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
5 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1


🟨 CI finished in 1h 08m: Pass: 98%/158 | Total: 1d 01h | Avg: 9m 34s | Max: 38m 42s | Hits: 94%/242910
  • 🟨 libcudacxx: Pass: 95%/43 | Total: 7h 06m | Avg: 9m 54s | Max: 27m 18s | Hits: 92%/97717

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  95%/41  | Total:  6h 57m | Avg: 10m 10s | Max: 27m 18s | Hits:  92%/92048 
      🟩 arm64              Pass: 100%/2   | Total:  8m 55s | Avg:  4m 27s | Max:  4m 52s | Hits:  96%/5669  
    🔍 ctk: 12.8 🔍
      🟩 12.0               Pass: 100%/5   | Total: 39m 00s | Avg:  7m 48s | Max: 21m 51s | Hits:  98%/13711 
      🟩 12.5               Pass: 100%/2   | Total: 18m 40s | Avg:  9m 20s | Max:  9m 40s | Hits:  97%/5614  
      🔍 12.8               Pass:  94%/36  | Total:  6h 08m | Avg: 10m 13s | Max: 27m 18s | Hits:  90%/78392 
    🚨 cudacxx: ClangCUDA18 🚨
      🔥 ClangCUDA18        Pass:   0%/2   | Total: 41m 32s | Avg: 20m 46s | Max: 21m 21s
      🟩 nvcc12.0           Pass: 100%/5   | Total: 39m 00s | Avg:  7m 48s | Max: 21m 51s | Hits:  98%/13711 
      🟩 nvcc12.5           Pass: 100%/2   | Total: 18m 40s | Avg:  9m 20s | Max:  9m 40s | Hits:  97%/5614  
      🟩 nvcc12.8           Pass: 100%/34  | Total:  5h 26m | Avg:  9m 36s | Max: 27m 18s | Hits:  90%/78392 
    🚨 cudacxx_family: ClangCUDA 🚨
      🔥 ClangCUDA          Pass:   0%/2   | Total: 41m 32s | Avg: 20m 46s | Max: 21m 21s
      🟩 nvcc               Pass: 100%/41  | Total:  6h 24m | Avg:  9m 22s | Max: 27m 18s | Hits:  92%/97717 
    🔍 cxx: Clang18 🔍
      🟩 Clang14            Pass: 100%/4   | Total: 19m 06s | Avg:  4m 46s | Max:  5m 24s | Hits:  97%/11230 
      🟩 Clang15            Pass: 100%/2   | Total: 32m 20s | Avg: 16m 10s | Max: 27m 18s | Hits:  65%/5626  
      🟩 Clang16            Pass: 100%/2   | Total: 28m 40s | Avg: 14m 20s | Max: 24m 09s | Hits:  66%/5626  
      🟩 Clang17            Pass: 100%/2   | Total:  9m 56s | Avg:  4m 58s | Max:  5m 17s | Hits:  97%/5626  
      🔍 Clang18            Pass:  66%/6   | Total:  1h 05m | Avg: 10m 51s | Max: 21m 21s | Hits:  96%/8460  
      🟩 GCC7               Pass: 100%/2   | Total:  9m 04s | Avg:  4m 32s | Max:  4m 42s | Hits:  97%/5564  
      🟩 GCC8               Pass: 100%/1   | Total:  4m 04s | Avg:  4m 04s | Max:  4m 04s | Hits:  97%/2792  
      🟩 GCC9               Pass: 100%/2   | Total:  8m 17s | Avg:  4m 08s | Max:  4m 22s | Hits:  97%/5576  
      🟩 GCC10              Pass: 100%/2   | Total:  8m 50s | Avg:  4m 25s | Max:  4m 47s | Hits:  98%/5632  
      🟩 GCC11              Pass: 100%/2   | Total:  9m 42s | Avg:  4m 51s | Max:  4m 51s | Hits:  97%/5628  
      🟩 GCC12              Pass: 100%/2   | Total: 28m 49s | Avg: 14m 24s | Max: 24m 12s | Hits:  66%/5628  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 31m | Avg:  9m 08s | Max: 16m 18s | Hits:  97%/14351 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 45m 23s | Avg: 22m 41s | Max: 23m 32s | Hits:  98%/5102  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 46m 38s | Avg: 23m 19s | Max: 24m 46s | Hits:  97%/5262  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 18m 40s | Avg:  9m 20s | Max:  9m 40s | Hits:  97%/5614  
    🔍 cxx_family: Clang 🔍
      🔍 Clang              Pass:  87%/16  | Total:  2h 35m | Avg:  9m 41s | Max: 27m 18s | Hits:  87%/36568 
      🟩 GCC                Pass: 100%/21  | Total:  2h 40m | Avg:  7m 37s | Max: 24m 12s | Hits:  93%/45171 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 32m | Avg: 23m 00s | Max: 24m 46s | Hits:  97%/10364 
      🟩 NVHPC              Pass: 100%/2   | Total: 18m 40s | Avg:  9m 20s | Max:  9m 40s | Hits:  97%/5614  
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/2   | Total: 19m 22s | Avg:  9m 41s | Max: 11m 58s | Hits:  98%/2924  
      🔍 rtx2080            Pass:  95%/41  | Total:  6h 46m | Avg:  9m 55s | Max: 27m 18s | Hits:  92%/94793 
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  94%/37  | Total:  6h 00m | Avg:  9m 45s | Max: 27m 18s | Hits:  92%/97677 
      🟩 NVRTC              Pass: 100%/2   | Total: 32m 00s | Avg: 16m 00s | Max: 16m 18s | Hits:  90%/40    
      🟩 Test               Pass: 100%/3   | Total: 30m 33s | Avg: 10m 11s | Max: 11m 58s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 30s | Avg:  2m 30s | Max:  2m 30s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 32m 00s | Avg: 16m 00s | Max: 16m 18s | Hits:  90%/40    
      🟩 90                 Pass: 100%/2   | Total: 19m 22s | Avg:  9m 41s | Max: 11m 58s | Hits:  98%/2924  
      🟩 90;90a;100         Pass: 100%/1   | Total: 14m 57s | Avg: 14m 57s | Max: 14m 57s | Hits:  95%/2924  
    🟨 std
      🟨 17                 Pass:  95%/21  | Total:  2h 59m | Avg:  8m 33s | Max: 23m 32s | Hits:  98%/52312 
      🟨 20                 Pass:  95%/21  | Total:  4h 03m | Avg: 11m 36s | Max: 27m 18s | Hits:  85%/45405 
    
  • 🟩 cub: Pass: 100%/45 | Total: 8h 35m | Avg: 11m 27s | Max: 31m 32s | Hits: 93%/53485

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  8h 24m | Avg: 11m 43s | Max: 31m 32s | Hits:  92%/51055 
      🟩 arm64              Pass: 100%/2   | Total: 11m 20s | Avg:  5m 40s | Max:  6m 00s | Hits:  99%/2430  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 50m 14s | Avg: 10m 02s | Max: 27m 06s | Hits:  85%/5908  
      🟩 12.5               Pass: 100%/2   | Total: 22m 29s | Avg: 11m 14s | Max: 11m 32s | Hits:  98%/2248  
      🟩 12.8               Pass: 100%/38  | Total:  7h 22m | Avg: 11m 39s | Max: 31m 32s | Hits:  94%/45329 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 25s | Avg:  5m 12s | Max:  5m 18s | Hits: 100%/2100  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 50m 14s | Avg: 10m 02s | Max: 27m 06s | Hits:  85%/5908  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 22m 29s | Avg: 11m 14s | Max: 11m 32s | Hits:  98%/2248  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  7h 12m | Avg: 12m 00s | Max: 31m 32s | Hits:  93%/43229 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 25s | Avg:  5m 12s | Max:  5m 18s | Hits: 100%/2100  
      🟩 nvcc               Pass: 100%/43  | Total:  8h 25m | Avg: 11m 44s | Max: 31m 32s | Hits:  92%/51385 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 24m 27s | Avg:  6m 06s | Max:  6m 22s | Hits: 100%/4868  
      🟩 Clang15            Pass: 100%/2   | Total: 13m 13s | Avg:  6m 36s | Max:  6m 41s | Hits: 100%/2430  
      🟩 Clang16            Pass: 100%/2   | Total: 12m 05s | Avg:  6m 02s | Max:  6m 06s | Hits: 100%/2430  
      🟩 Clang17            Pass: 100%/2   | Total: 12m 49s | Avg:  6m 24s | Max:  6m 34s | Hits: 100%/2430  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 15m | Avg: 10m 46s | Max: 24m 32s | Hits: 100%/8175  
      🟩 GCC7               Pass: 100%/2   | Total: 12m 22s | Avg:  6m 11s | Max:  6m 46s | Hits:  99%/2434  
      🟩 GCC8               Pass: 100%/1   | Total:  6m 28s | Avg:  6m 28s | Max:  6m 28s | Hits:  99%/1217  
      🟩 GCC9               Pass: 100%/2   | Total: 12m 31s | Avg:  6m 15s | Max:  6m 47s | Hits:  99%/2434  
      🟩 GCC10              Pass: 100%/2   | Total: 13m 00s | Avg:  6m 30s | Max:  6m 45s | Hits:  99%/2434  
      🟩 GCC11              Pass: 100%/2   | Total: 13m 50s | Avg:  6m 55s | Max:  7m 09s | Hits:  99%/2430  
      🟩 GCC12              Pass: 100%/2   | Total: 13m 44s | Avg:  6m 52s | Max:  7m 06s | Hits:  99%/2430  
      🟩 GCC13              Pass: 100%/11  | Total:  2h 47m | Avg: 15m 12s | Max: 25m 01s | Hits:  99%/13365 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 56m 07s | Avg: 28m 03s | Max: 29m 01s | Hits:  15%/2080  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 59m 41s | Avg: 29m 50s | Max: 31m 32s | Hits:  15%/2080  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 22m 29s | Avg: 11m 14s | Max: 11m 32s | Hits:  98%/2248  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  2h 17m | Avg:  8m 06s | Max: 24m 32s | Hits: 100%/20333 
      🟩 GCC                Pass: 100%/22  | Total:  3h 59m | Avg: 10m 52s | Max: 25m 01s | Hits:  99%/26744 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 55m | Avg: 28m 57s | Max: 31m 32s | Hits:  15%/4160  
      🟩 NVHPC              Pass: 100%/2   | Total: 22m 29s | Avg: 11m 14s | Max: 11m 32s | Hits:  98%/2248  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total: 49m 06s | Avg: 16m 22s | Max: 23m 16s | Hits:  99%/3645  
      🟩 rtx2080            Pass: 100%/34  | Total:  5h 15m | Avg:  9m 16s | Max: 31m 32s | Hits:  91%/40120 
      🟩 rtxa6000           Pass: 100%/8   | Total:  2h 30m | Avg: 18m 50s | Max: 25m 01s | Hits:  99%/9720  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 33m | Avg:  9m 00s | Max: 31m 32s | Hits:  91%/43765 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 23m 09s | Avg: 23m 09s | Max: 23m 09s | Hits:  99%/1215  
      🟩 GraphCapture       Pass: 100%/1   | Total: 19m 09s | Avg: 19m 09s | Max: 19m 09s | Hits:  99%/1215  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 12m | Avg: 24m 16s | Max: 25m 01s | Hits:  99%/3645  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 07m | Avg: 22m 20s | Max: 23m 13s | Hits:  99%/3645  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 49m 06s | Avg: 16m 22s | Max: 23m 16s | Hits:  99%/3645  
      🟩 90;90a;100         Pass: 100%/1   | Total:  7m 14s | Avg:  7m 14s | Max:  7m 14s | Hits:  99%/1215  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 17m | Avg:  9m 51s | Max: 29m 01s | Hits:  88%/23535 
      🟩 20                 Pass: 100%/25  | Total:  5h 18m | Avg: 12m 44s | Max: 31m 32s | Hits:  96%/29950 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 6h 35m | Avg: 8m 46s | Max: 29m 33s | Hits: 96%/80136

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 17m 45s | Avg:  8m 52s | Max: 11m 27s | Hits:  99%/3564  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  6h 25m | Avg:  8m 57s | Max: 29m 33s | Hits:  96%/76573 
      🟩 arm64              Pass: 100%/2   | Total:  9m 33s | Avg:  4m 46s | Max:  5m 01s | Hits:  99%/3563  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 44m 15s | Avg:  8m 51s | Max: 22m 48s | Hits:  94%/8901  
      🟩 12.5               Pass: 100%/2   | Total: 27m 04s | Avg: 13m 32s | Max: 13m 41s | Hits:  99%/3562  
      🟩 12.8               Pass: 100%/38  | Total:  5h 23m | Avg:  8m 31s | Max: 29m 33s | Hits:  96%/67673 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 08s | Avg:  5m 04s | Max:  5m 09s | Hits: 100%/3562  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 44m 15s | Avg:  8m 51s | Max: 22m 48s | Hits:  94%/8901  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 27m 04s | Avg: 13m 32s | Max: 13m 41s | Hits:  99%/3562  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  5h 13m | Avg:  8m 42s | Max: 29m 33s | Hits:  96%/64111 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 08s | Avg:  5m 04s | Max:  5m 09s | Hits: 100%/3562  
      🟩 nvcc               Pass: 100%/43  | Total:  6h 24m | Avg:  8m 57s | Max: 29m 33s | Hits:  96%/76574 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 21m 39s | Avg:  5m 24s | Max:  5m 53s | Hits: 100%/7124  
      🟩 Clang15            Pass: 100%/2   | Total: 11m 25s | Avg:  5m 42s | Max:  5m 46s | Hits: 100%/3562  
      🟩 Clang16            Pass: 100%/2   | Total: 11m 40s | Avg:  5m 50s | Max:  6m 03s | Hits: 100%/3562  
      🟩 Clang17            Pass: 100%/2   | Total: 11m 01s | Avg:  5m 30s | Max:  5m 40s | Hits: 100%/3562  
      🟩 Clang18            Pass: 100%/7   | Total: 43m 27s | Avg:  6m 12s | Max: 10m 12s | Hits: 100%/12467 
      🟩 GCC7               Pass: 100%/2   | Total: 10m 42s | Avg:  5m 21s | Max:  5m 22s | Hits:  99%/3564  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 28s | Avg:  5m 28s | Max:  5m 28s | Hits:  99%/1782  
      🟩 GCC9               Pass: 100%/2   | Total: 11m 07s | Avg:  5m 33s | Max:  5m 39s | Hits:  99%/3564  
      🟩 GCC10              Pass: 100%/2   | Total: 12m 06s | Avg:  6m 03s | Max:  6m 28s | Hits:  99%/3564  
      🟩 GCC11              Pass: 100%/2   | Total: 12m 06s | Avg:  6m 03s | Max:  6m 21s | Hits:  99%/3564  
      🟩 GCC12              Pass: 100%/2   | Total: 12m 34s | Avg:  6m 17s | Max:  6m 26s | Hits:  99%/3564  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 16m | Avg:  7m 40s | Max: 11m 50s | Hits:  99%/17820 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 46m 03s | Avg: 23m 01s | Max: 23m 15s | Hits:  70%/3550  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 21m | Avg: 27m 18s | Max: 29m 33s | Hits:  70%/5325  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 27m 04s | Avg: 13m 32s | Max: 13m 41s | Hits:  99%/3562  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 39m | Avg:  5m 50s | Max: 10m 12s | Hits: 100%/30277 
      🟩 GCC                Pass: 100%/21  | Total:  2h 20m | Avg:  6m 42s | Max: 11m 50s | Hits:  99%/37422 
      🟩 MSVC               Pass: 100%/5   | Total:  2h 07m | Avg: 25m 35s | Max: 29m 33s | Hits:  70%/8875  
      🟩 NVHPC              Pass: 100%/2   | Total: 27m 04s | Avg: 13m 32s | Max: 13m 41s | Hits:  99%/3562  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 16m 53s | Avg:  8m 26s | Max: 11m 50s | Hits:  99%/3564  
      🟩 rtx2080            Pass: 100%/33  | Total:  4h 14m | Avg:  7m 42s | Max: 23m 43s | Hits:  97%/58769 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 03m | Avg: 12m 22s | Max: 29m 33s | Hits:  94%/17803 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  5h 05m | Avg:  8m 02s | Max: 28m 40s | Hits:  96%/67671 
      🟩 TestCPU            Pass: 100%/3   | Total: 44m 51s | Avg: 14m 57s | Max: 29m 33s | Hits:  90%/5338  
      🟩 TestGPU            Pass: 100%/4   | Total: 44m 49s | Avg: 11m 12s | Max: 11m 50s | Hits:  99%/7127  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 16m 53s | Avg:  8m 26s | Max: 11m 50s | Hits:  99%/3564  
      🟩 90;90a;100         Pass: 100%/1   | Total:  6m 05s | Avg:  6m 05s | Max:  6m 05s | Hits:  99%/1782  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  2h 55m | Avg:  8m 45s | Max: 23m 43s | Hits:  95%/35611 
      🟩 20                 Pass: 100%/23  | Total:  3h 22m | Avg:  8m 47s | Max: 29m 33s | Hits:  97%/40961 
    
  • 🟩 cudax: Pass: 100%/22 | Total: 2h 00m | Avg: 5m 29s | Max: 13m 46s | Hits: 97%/11264

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  1h 49m | Avg:  6m 04s | Max: 13m 46s | Hits:  97%/9036  
      🟩 arm64              Pass: 100%/4   | Total: 11m 26s | Avg:  2m 51s | Max:  3m 00s | Hits:  99%/2228  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total:  9m 50s | Avg:  9m 50s | Max:  9m 50s | Hits:  61%/262   
      🟩 12.5               Pass: 100%/2   | Total: 12m 43s | Avg:  6m 21s | Max:  6m 22s | Hits:  96%/710   
      🟩 12.8               Pass: 100%/19  | Total:  1h 38m | Avg:  5m 10s | Max: 13m 46s | Hits:  98%/10292 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total:  9m 50s | Avg:  9m 50s | Max:  9m 50s | Hits:  61%/262   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 12m 43s | Avg:  6m 21s | Max:  6m 22s | Hits:  96%/710   
      🟩 nvcc12.8           Pass: 100%/19  | Total:  1h 38m | Avg:  5m 10s | Max: 13m 46s | Hits:  98%/10292 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  2h 00m | Avg:  5m 29s | Max: 13m 46s | Hits:  97%/11264 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  3m 34s | Avg:  3m 34s | Max:  3m 34s | Hits: 100%/559   
      🟩 Clang15            Pass: 100%/1   | Total:  3m 39s | Avg:  3m 39s | Max:  3m 39s | Hits: 100%/557   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 33s | Avg:  3m 33s | Max:  3m 33s | Hits: 100%/557   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s | Hits: 100%/557   
      🟩 Clang18            Pass: 100%/4   | Total: 21m 07s | Avg:  5m 16s | Max: 12m 04s | Hits: 100%/2228  
      🟩 GCC10              Pass: 100%/1   | Total:  3m 46s | Avg:  3m 46s | Max:  3m 46s | Hits:  99%/559   
      🟩 GCC11              Pass: 100%/1   | Total:  3m 42s | Avg:  3m 42s | Max:  3m 42s | Hits:  99%/557   
      🟩 GCC12              Pass: 100%/2   | Total: 16m 32s | Avg:  8m 16s | Max: 12m 56s | Hits:  99%/1114  
      🟩 GCC13              Pass: 100%/6   | Total: 29m 10s | Avg:  4m 51s | Max: 13m 46s | Hits:  99%/3342  
      🟩 MSVC14.39          Pass: 100%/1   | Total:  9m 50s | Avg:  9m 50s | Max:  9m 50s | Hits:  61%/262   
      🟩 MSVC14.42          Pass: 100%/1   | Total:  9m 24s | Avg:  9m 24s | Max:  9m 24s | Hits:  61%/262   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 12m 43s | Avg:  6m 21s | Max:  6m 22s | Hits:  96%/710   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 35m 45s | Avg:  4m 28s | Max: 12m 04s | Hits: 100%/4458  
      🟩 GCC                Pass: 100%/10  | Total: 53m 10s | Avg:  5m 19s | Max: 13m 46s | Hits:  99%/5572  
      🟩 MSVC               Pass: 100%/2   | Total: 19m 14s | Avg:  9m 37s | Max:  9m 50s | Hits:  61%/524   
      🟩 NVHPC              Pass: 100%/2   | Total: 12m 43s | Avg:  6m 21s | Max:  6m 22s | Hits:  96%/710   
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 16m 49s | Avg:  8m 24s | Max: 13m 46s | Hits:  99%/1114  
      🟩 rtx2080            Pass: 100%/20  | Total:  1h 44m | Avg:  5m 12s | Max: 12m 56s | Hits:  97%/10150 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  1h 22m | Avg:  4m 19s | Max:  9m 50s | Hits:  97%/9593  
      🟩 Test               Pass: 100%/3   | Total: 38m 46s | Avg: 12m 55s | Max: 13m 46s | Hits:  99%/1671  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 20m 17s | Avg:  6m 45s | Max: 13m 46s | Hits:  99%/1671  
      🟩 90a                Pass: 100%/1   | Total:  3m 13s | Avg:  3m 13s | Max:  3m 13s | Hits:  99%/557   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 15m 41s | Avg:  3m 55s | Max:  6m 21s | Hits:  99%/2026  
      🟩 20                 Pass: 100%/18  | Total:  1h 45m | Avg:  5m 50s | Max: 13m 46s | Hits:  97%/9238  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 25s | Avg: 7m 42s | Max: 13m 03s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 25s | Avg:  7m 42s | Max: 13m 03s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 25s | Avg:  7m 42s | Max: 13m 03s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 25s | Avg:  7m 42s | Max: 13m 03s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 25s | Avg:  7m 42s | Max: 13m 03s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 25s | Avg:  7m 42s | Max: 13m 03s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 25s | Avg:  7m 42s | Max: 13m 03s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 25s | Avg:  7m 42s | Max: 13m 03s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 22s | Avg:  2m 22s | Max:  2m 22s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 13m 03s | Avg: 13m 03s | Max: 13m 03s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 38m 42s | Avg: 38m 42s | Max: 38m 42s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 38m 42s | Avg: 38m 42s | Max: 38m 42s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 38m 42s | Avg: 38m 42s | Max: 38m 42s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 38m 42s | Avg: 38m 42s | Max: 38m 42s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 38m 42s | Avg: 38m 42s | Max: 38m 42s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 38m 42s | Avg: 38m 42s | Max: 38m 42s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 38m 42s | Avg: 38m 42s | Max: 38m 42s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 38m 42s | Avg: 38m 42s | Max: 38m 42s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 38m 42s | Avg: 38m 42s | Max: 38m 42s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 158)

# Runner
111 linux-amd64-cpu16
15 windows-amd64-cpu16
10 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
5 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1

Contributor

🟩 CI finished in 1h 01m: Pass: 100%/158 | Total: 23h 39m | Avg: 8m 58s | Max: 40m 45s | Hits: 94%/248540
  • 🟩 cub: Pass: 100%/45 | Total: 8h 22m | Avg: 11m 09s | Max: 30m 23s | Hits: 93%/53485

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  8h 11m | Avg: 11m 25s | Max: 30m 23s | Hits:  92%/51055 
      🟩 arm64              Pass: 100%/2   | Total: 10m 59s | Avg:  5m 29s | Max:  5m 43s | Hits:  99%/2430  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 48m 44s | Avg:  9m 44s | Max: 26m 27s | Hits:  85%/5908  
      🟩 12.5               Pass: 100%/2   | Total: 20m 02s | Avg: 10m 01s | Max: 10m 03s | Hits:  98%/2248  
      🟩 12.8               Pass: 100%/38  | Total:  7h 13m | Avg: 11m 24s | Max: 30m 23s | Hits:  94%/45329 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 32s | Avg:  4m 46s | Max:  4m 47s | Hits: 100%/2100  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 48m 44s | Avg:  9m 44s | Max: 26m 27s | Hits:  85%/5908  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 20m 02s | Avg: 10m 01s | Max: 10m 03s | Hits:  98%/2248  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  7h 03m | Avg: 11m 46s | Max: 30m 23s | Hits:  93%/43229 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 32s | Avg:  4m 46s | Max:  4m 47s | Hits: 100%/2100  
      🟩 nvcc               Pass: 100%/43  | Total:  8h 12m | Avg: 11m 27s | Max: 30m 23s | Hits:  92%/51385 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 22m 56s | Avg:  5m 44s | Max:  6m 11s | Hits: 100%/4868  
      🟩 Clang15            Pass: 100%/2   | Total: 12m 43s | Avg:  6m 21s | Max:  6m 34s | Hits: 100%/2430  
      🟩 Clang16            Pass: 100%/2   | Total: 11m 59s | Avg:  5m 59s | Max:  6m 01s | Hits: 100%/2430  
      🟩 Clang17            Pass: 100%/2   | Total: 12m 41s | Avg:  6m 20s | Max:  6m 22s | Hits: 100%/2430  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 13m | Avg: 10m 30s | Max: 24m 21s | Hits: 100%/8175  
      🟩 GCC7               Pass: 100%/2   | Total: 11m 37s | Avg:  5m 48s | Max:  5m 57s | Hits:  99%/2434  
      🟩 GCC8               Pass: 100%/1   | Total:  6m 24s | Avg:  6m 24s | Max:  6m 24s | Hits:  99%/1217  
      🟩 GCC9               Pass: 100%/2   | Total: 12m 26s | Avg:  6m 13s | Max:  6m 46s | Hits:  99%/2434  
      🟩 GCC10              Pass: 100%/2   | Total: 12m 45s | Avg:  6m 22s | Max:  6m 30s | Hits:  99%/2434  
      🟩 GCC11              Pass: 100%/2   | Total: 12m 38s | Avg:  6m 19s | Max:  6m 25s | Hits:  99%/2430  
      🟩 GCC12              Pass: 100%/2   | Total: 13m 22s | Avg:  6m 41s | Max:  6m 42s | Hits:  99%/2430  
      🟩 GCC13              Pass: 100%/11  | Total:  2h 42m | Avg: 14m 44s | Max: 25m 16s | Hits:  99%/13365 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 56m 22s | Avg: 28m 11s | Max: 29m 55s | Hits:  15%/2080  
      🟩 MSVC14.42          Pass: 100%/2   | Total:  1h 00m | Avg: 30m 19s | Max: 30m 23s | Hits:  15%/2080  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 20m 02s | Avg: 10m 01s | Max: 10m 03s | Hits:  98%/2248  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  2h 13m | Avg:  7m 52s | Max: 24m 21s | Hits: 100%/20333 
      🟩 GCC                Pass: 100%/22  | Total:  3h 51m | Avg: 10m 30s | Max: 25m 16s | Hits:  99%/26744 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 57m | Avg: 29m 15s | Max: 30m 23s | Hits:  15%/4160  
      🟩 NVHPC              Pass: 100%/2   | Total: 20m 02s | Avg: 10m 01s | Max: 10m 03s | Hits:  98%/2248  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total: 49m 38s | Avg: 16m 32s | Max: 23m 19s | Hits:  99%/3645  
      🟩 rtx2080            Pass: 100%/34  | Total:  5h 06m | Avg:  9m 00s | Max: 30m 23s | Hits:  91%/40120 
      🟩 rtxa6000           Pass: 100%/8   | Total:  2h 26m | Avg: 18m 15s | Max: 25m 16s | Hits:  99%/9720  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 24m | Avg:  8m 45s | Max: 30m 23s | Hits:  91%/43765 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 28s | Avg: 20m 28s | Max: 20m 28s | Hits:  99%/1215  
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 05s | Avg: 16m 05s | Max: 16m 05s | Hits:  99%/1215  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 12m | Avg: 24m 18s | Max: 25m 16s | Hits:  99%/3645  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 08m | Avg: 22m 51s | Max: 24m 43s | Hits:  99%/3645  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 49m 38s | Avg: 16m 32s | Max: 23m 19s | Hits:  99%/3645  
      🟩 90;90a;100         Pass: 100%/1   | Total:  6m 37s | Avg:  6m 37s | Max:  6m 37s | Hits:  99%/1215  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 14m | Avg:  9m 43s | Max: 30m 23s | Hits:  88%/23535 
      🟩 20                 Pass: 100%/25  | Total:  5h 07m | Avg: 12m 18s | Max: 30m 16s | Hits:  96%/29950 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 6h 28m | Avg: 8m 38s | Max: 31m 38s | Hits: 96%/80136

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 17m 04s | Avg:  8m 32s | Max: 11m 04s | Hits:  99%/3564  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  6h 19m | Avg:  8m 49s | Max: 31m 38s | Hits:  96%/76573 
      🟩 arm64              Pass: 100%/2   | Total:  9m 30s | Avg:  4m 45s | Max:  5m 02s | Hits:  99%/3563  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 43m 18s | Avg:  8m 39s | Max: 23m 50s | Hits:  94%/8901  
      🟩 12.5               Pass: 100%/2   | Total: 26m 23s | Avg: 13m 11s | Max: 13m 20s | Hits:  99%/3562  
      🟩 12.8               Pass: 100%/38  | Total:  5h 19m | Avg:  8m 24s | Max: 31m 38s | Hits:  96%/67673 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 05s | Avg:  5m 02s | Max:  5m 07s | Hits: 100%/3562  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 43m 18s | Avg:  8m 39s | Max: 23m 50s | Hits:  94%/8901  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 26m 23s | Avg: 13m 11s | Max: 13m 20s | Hits:  99%/3562  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  5h 09m | Avg:  8m 35s | Max: 31m 38s | Hits:  96%/64111 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 05s | Avg:  5m 02s | Max:  5m 07s | Hits: 100%/3562  
      🟩 nvcc               Pass: 100%/43  | Total:  6h 18m | Avg:  8m 48s | Max: 31m 38s | Hits:  96%/76574 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 20m 08s | Avg:  5m 02s | Max:  5m 15s | Hits: 100%/7124  
      🟩 Clang15            Pass: 100%/2   | Total: 11m 36s | Avg:  5m 48s | Max:  5m 53s | Hits: 100%/3562  
      🟩 Clang16            Pass: 100%/2   | Total: 10m 23s | Avg:  5m 11s | Max:  5m 14s | Hits: 100%/3562  
      🟩 Clang17            Pass: 100%/2   | Total: 11m 32s | Avg:  5m 46s | Max:  5m 48s | Hits:  99%/3562  
      🟩 Clang18            Pass: 100%/7   | Total: 42m 56s | Avg:  6m 08s | Max: 10m 07s | Hits: 100%/12467 
      🟩 GCC7               Pass: 100%/2   | Total: 10m 00s | Avg:  5m 00s | Max:  5m 13s | Hits:  99%/3564  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 17s | Avg:  5m 17s | Max:  5m 17s | Hits:  99%/1782  
      🟩 GCC9               Pass: 100%/2   | Total: 10m 52s | Avg:  5m 26s | Max:  5m 57s | Hits:  99%/3564  
      🟩 GCC10              Pass: 100%/2   | Total: 10m 44s | Avg:  5m 22s | Max:  5m 26s | Hits:  99%/3564  
      🟩 GCC11              Pass: 100%/2   | Total: 11m 28s | Avg:  5m 44s | Max:  6m 03s | Hits:  99%/3564  
      🟩 GCC12              Pass: 100%/2   | Total: 11m 41s | Avg:  5m 50s | Max:  6m 04s | Hits:  99%/3564  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 14m | Avg:  7m 27s | Max: 11m 28s | Hits:  99%/17820 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 48m 49s | Avg: 24m 24s | Max: 24m 59s | Hits:  70%/3550  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 22m | Avg: 27m 31s | Max: 31m 38s | Hits:  70%/5325  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 26m 23s | Avg: 13m 11s | Max: 13m 20s | Hits:  99%/3562  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 36m | Avg:  5m 40s | Max: 10m 07s | Hits:  99%/30277 
      🟩 GCC                Pass: 100%/21  | Total:  2h 14m | Avg:  6m 24s | Max: 11m 28s | Hits:  99%/37422 
      🟩 MSVC               Pass: 100%/5   | Total:  2h 11m | Avg: 26m 16s | Max: 31m 38s | Hits:  70%/8875  
      🟩 NVHPC              Pass: 100%/2   | Total: 26m 23s | Avg: 13m 11s | Max: 13m 20s | Hits:  99%/3562  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 15m 50s | Avg:  7m 55s | Max: 11m 28s | Hits:  99%/3564  
      🟩 rtx2080            Pass: 100%/33  | Total:  4h 11m | Avg:  7m 37s | Max: 25m 52s | Hits:  97%/58769 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 01m | Avg: 12m 08s | Max: 31m 38s | Hits:  94%/17803 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  4h 58m | Avg:  7m 50s | Max: 25m 52s | Hits:  96%/67671 
      🟩 TestCPU            Pass: 100%/3   | Total: 46m 48s | Avg: 15m 36s | Max: 31m 38s | Hits:  90%/5338  
      🟩 TestGPU            Pass: 100%/4   | Total: 43m 59s | Avg: 10m 59s | Max: 11m 28s | Hits:  99%/7127  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 15m 50s | Avg:  7m 55s | Max: 11m 28s | Hits:  99%/3564  
      🟩 90;90a;100         Pass: 100%/1   | Total:  5m 43s | Avg:  5m 43s | Max:  5m 43s | Hits:  99%/1782  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  2h 54m | Avg:  8m 43s | Max: 25m 52s | Hits:  95%/35611 
      🟩 20                 Pass: 100%/23  | Total:  3h 17m | Avg:  8m 35s | Max: 31m 38s | Hits:  97%/40961 
    
  • 🟩 libcudacxx: Pass: 100%/43 | Total: 5h 55m | Avg: 8m 15s | Max: 24m 24s | Hits: 94%/103347

    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total:  5h 47m | Avg:  8m 28s | Max: 24m 24s | Hits:  94%/97678 
      🟩 arm64              Pass: 100%/2   | Total:  7m 59s | Avg:  3m 59s | Max:  4m 15s | Hits:  98%/5669  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 36m 32s | Avg:  7m 18s | Max: 21m 08s | Hits:  98%/13711 
      🟩 12.5               Pass: 100%/2   | Total: 17m 41s | Avg:  8m 50s | Max:  9m 20s | Hits:  97%/5614  
      🟩 12.8               Pass: 100%/36  | Total:  5h 01m | Avg:  8m 21s | Max: 24m 24s | Hits:  93%/84022 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 42m 13s | Avg: 21m 06s | Max: 21m 45s | Hits:  27%/5630  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 36m 32s | Avg:  7m 18s | Max: 21m 08s | Hits:  98%/13711 
      🟩 nvcc12.5           Pass: 100%/2   | Total: 17m 41s | Avg:  8m 50s | Max:  9m 20s | Hits:  97%/5614  
      🟩 nvcc12.8           Pass: 100%/34  | Total:  4h 18m | Avg:  7m 36s | Max: 24m 24s | Hits:  98%/78392 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 42m 13s | Avg: 21m 06s | Max: 21m 45s | Hits:  27%/5630  
      🟩 nvcc               Pass: 100%/41  | Total:  5h 13m | Avg:  7m 38s | Max: 24m 24s | Hits:  98%/97717 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 17m 32s | Avg:  4m 23s | Max:  4m 55s | Hits:  98%/11230 
      🟩 Clang15            Pass: 100%/2   | Total:  9m 46s | Avg:  4m 53s | Max:  5m 15s | Hits:  98%/5626  
      🟩 Clang16            Pass: 100%/2   | Total:  9m 33s | Avg:  4m 46s | Max:  4m 56s | Hits:  98%/5626  
      🟩 Clang17            Pass: 100%/2   | Total:  9m 50s | Avg:  4m 55s | Max:  5m 15s | Hits:  98%/5626  
      🟩 Clang18            Pass: 100%/6   | Total:  1h 03m | Avg: 10m 38s | Max: 21m 45s | Hits:  70%/14090 
      🟩 GCC7               Pass: 100%/2   | Total:  7m 49s | Avg:  3m 54s | Max:  4m 06s | Hits:  97%/5564  
      🟩 GCC8               Pass: 100%/1   | Total:  3m 46s | Avg:  3m 46s | Max:  3m 46s | Hits:  99%/2792  
      🟩 GCC9               Pass: 100%/2   | Total:  8m 18s | Avg:  4m 09s | Max:  4m 28s | Hits:  97%/5576  
      🟩 GCC10              Pass: 100%/2   | Total:  8m 14s | Avg:  4m 07s | Max:  4m 16s | Hits:  98%/5632  
      🟩 GCC11              Pass: 100%/2   | Total:  8m 00s | Avg:  4m 00s | Max:  4m 05s | Hits:  98%/5628  
      🟩 GCC12              Pass: 100%/2   | Total:  8m 37s | Avg:  4m 18s | Max:  4m 31s | Hits:  98%/5628  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 30m | Avg:  9m 04s | Max: 18m 24s | Hits:  96%/14351 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 45m 27s | Avg: 22m 43s | Max: 24m 19s | Hits:  98%/5102  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 46m 06s | Avg: 23m 03s | Max: 24m 24s | Hits:  98%/5262  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 17m 41s | Avg:  8m 50s | Max:  9m 20s | Hits:  97%/5614  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/16  | Total:  1h 50m | Avg:  6m 54s | Max: 21m 45s | Hits:  88%/42198 
      🟩 GCC                Pass: 100%/21  | Total:  2h 15m | Avg:  6m 27s | Max: 18m 24s | Hits:  97%/45171 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 31m | Avg: 22m 53s | Max: 24m 24s | Hits:  98%/10364 
      🟩 NVHPC              Pass: 100%/2   | Total: 17m 41s | Avg:  8m 50s | Max:  9m 20s | Hits:  97%/5614  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 16m 14s | Avg:  8m 07s | Max: 11m 55s | Hits:  99%/2924  
      🟩 rtx2080            Pass: 100%/41  | Total:  5h 39m | Avg:  8m 16s | Max: 24m 24s | Hits:  94%/100423
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  4h 49m | Avg:  7m 48s | Max: 24m 24s | Hits:  94%/103307
      🟩 NVRTC              Pass: 100%/2   | Total: 34m 46s | Avg: 17m 23s | Max: 18m 24s | Hits:  90%/40    
      🟩 Test               Pass: 100%/3   | Total: 29m 05s | Avg:  9m 41s | Max: 11m 55s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 15s | Avg:  2m 15s | Max:  2m 15s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 34m 46s | Avg: 17m 23s | Max: 18m 24s | Hits:  90%/40    
      🟩 90                 Pass: 100%/2   | Total: 16m 14s | Avg:  8m 07s | Max: 11m 55s | Hits:  99%/2924  
      🟩 90;90a;100         Pass: 100%/1   | Total: 16m 18s | Avg: 16m 18s | Max: 16m 18s | Hits:  89%/2924  
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  2h 59m | Avg:  8m 31s | Max: 24m 19s | Hits:  94%/55106 
      🟩 20                 Pass: 100%/21  | Total:  2h 53m | Avg:  8m 16s | Max: 24m 24s | Hits:  93%/48241 
    
  • 🟩 cudax: Pass: 100%/22 | Total: 1h 56m | Avg: 5m 18s | Max: 14m 06s | Hits: 97%/11264

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  1h 45m | Avg:  5m 52s | Max: 14m 06s | Hits:  97%/9036  
      🟩 arm64              Pass: 100%/4   | Total: 11m 10s | Avg:  2m 47s | Max:  2m 54s | Hits:  99%/2228  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 10m 29s | Avg: 10m 29s | Max: 10m 29s | Hits:  61%/262   
      🟩 12.5               Pass: 100%/2   | Total: 11m 07s | Avg:  5m 33s | Max:  5m 41s | Hits:  96%/710   
      🟩 12.8               Pass: 100%/19  | Total:  1h 35m | Avg:  5m 00s | Max: 14m 06s | Hits:  98%/10292 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 29s | Avg: 10m 29s | Max: 10m 29s | Hits:  61%/262   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 11m 07s | Avg:  5m 33s | Max:  5m 41s | Hits:  96%/710   
      🟩 nvcc12.8           Pass: 100%/19  | Total:  1h 35m | Avg:  5m 00s | Max: 14m 06s | Hits:  98%/10292 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  1h 56m | Avg:  5m 18s | Max: 14m 06s | Hits:  97%/11264 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  3m 16s | Avg:  3m 16s | Max:  3m 16s | Hits: 100%/559   
      🟩 Clang15            Pass: 100%/1   | Total:  3m 25s | Avg:  3m 25s | Max:  3m 25s | Hits: 100%/557   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 31s | Avg:  3m 31s | Max:  3m 31s | Hits: 100%/557   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 18s | Avg:  3m 18s | Max:  3m 18s | Hits: 100%/557   
      🟩 Clang18            Pass: 100%/4   | Total: 20m 56s | Avg:  5m 14s | Max: 12m 05s | Hits:  99%/2228  
      🟩 GCC10              Pass: 100%/1   | Total:  3m 17s | Avg:  3m 17s | Max:  3m 17s | Hits:  99%/559   
      🟩 GCC11              Pass: 100%/1   | Total:  3m 13s | Avg:  3m 13s | Max:  3m 13s | Hits:  99%/557   
      🟩 GCC12              Pass: 100%/2   | Total: 16m 27s | Avg:  8m 13s | Max: 13m 09s | Hits:  99%/1114  
      🟩 GCC13              Pass: 100%/6   | Total: 28m 34s | Avg:  4m 45s | Max: 14m 06s | Hits:  99%/3342  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 29s | Avg: 10m 29s | Max: 10m 29s | Hits:  61%/262   
      🟩 MSVC14.42          Pass: 100%/1   | Total:  9m 18s | Avg:  9m 18s | Max:  9m 18s | Hits:  61%/262   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 11m 07s | Avg:  5m 33s | Max:  5m 41s | Hits:  96%/710   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 34m 26s | Avg:  4m 18s | Max: 12m 05s | Hits:  99%/4458  
      🟩 GCC                Pass: 100%/10  | Total: 51m 31s | Avg:  5m 09s | Max: 14m 06s | Hits:  99%/5572  
      🟩 MSVC               Pass: 100%/2   | Total: 19m 47s | Avg:  9m 53s | Max: 10m 29s | Hits:  61%/524   
      🟩 NVHPC              Pass: 100%/2   | Total: 11m 07s | Avg:  5m 33s | Max:  5m 41s | Hits:  96%/710   
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 16m 58s | Avg:  8m 29s | Max: 14m 06s | Hits:  99%/1114  
      🟩 rtx2080            Pass: 100%/20  | Total:  1h 39m | Avg:  4m 59s | Max: 13m 09s | Hits:  97%/10150 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  1h 17m | Avg:  4m 04s | Max: 10m 29s | Hits:  97%/9593  
      🟩 Test               Pass: 100%/3   | Total: 39m 20s | Avg: 13m 06s | Max: 14m 06s | Hits:  99%/1671  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 19m 51s | Avg:  6m 37s | Max: 14m 06s | Hits:  99%/1671  
      🟩 90a                Pass: 100%/1   | Total:  3m 04s | Avg:  3m 04s | Max:  3m 04s | Hits:  99%/557   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 14m 03s | Avg:  3m 30s | Max:  5m 41s | Hits:  99%/2026  
      🟩 20                 Pass: 100%/18  | Total:  1h 42m | Avg:  5m 42s | Max: 14m 06s | Hits:  97%/9238  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 15m 15s | Avg: 7m 37s | Max: 12m 56s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 15m 15s | Avg:  7m 37s | Max: 12m 56s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 15m 15s | Avg:  7m 37s | Max: 12m 56s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 15m 15s | Avg:  7m 37s | Max: 12m 56s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 15m 15s | Avg:  7m 37s | Max: 12m 56s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 15m 15s | Avg:  7m 37s | Max: 12m 56s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 15m 15s | Avg:  7m 37s | Max: 12m 56s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 15m 15s | Avg:  7m 37s | Max: 12m 56s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 19s | Avg:  2m 19s | Max:  2m 19s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 12m 56s | Avg: 12m 56s | Max: 12m 56s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 40m 45s | Avg: 40m 45s | Max: 40m 45s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 40m 45s | Avg: 40m 45s | Max: 40m 45s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 40m 45s | Avg: 40m 45s | Max: 40m 45s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 40m 45s | Avg: 40m 45s | Max: 40m 45s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 40m 45s | Avg: 40m 45s | Max: 40m 45s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 40m 45s | Avg: 40m 45s | Max: 40m 45s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 40m 45s | Avg: 40m 45s | Max: 40m 45s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 40m 45s | Avg: 40m 45s | Max: 40m 45s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 40m 45s | Avg: 40m 45s | Max: 40m 45s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 158)

# Runner
111 linux-amd64-cpu16
15 windows-amd64-cpu16
10 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
5 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1

@fbusato fbusato requested a review from miscco February 26, 2025 19:30
Contributor

🟩 CI finished in 1h 10m: Pass: 100%/158 | Total: 23h 32m | Avg: 8m 56s | Max: 40m 14s | Hits: 93%/248540
  • 🟩 cub: Pass: 100%/45 | Total: 8h 13m | Avg: 10m 57s | Max: 30m 52s | Hits: 93%/53485

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  8h 02m | Avg: 11m 13s | Max: 30m 52s | Hits:  92%/51055 
      🟩 arm64              Pass: 100%/2   | Total: 10m 58s | Avg:  5m 29s | Max:  5m 48s | Hits:  99%/2430  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 49m 51s | Avg:  9m 58s | Max: 27m 33s | Hits:  85%/5908  
      🟩 12.5               Pass: 100%/2   | Total: 20m 57s | Avg: 10m 28s | Max: 10m 30s | Hits:  98%/2248  
      🟩 12.8               Pass: 100%/38  | Total:  7h 02m | Avg: 11m 07s | Max: 30m 52s | Hits:  94%/45329 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 19s | Avg:  4m 39s | Max:  4m 45s | Hits: 100%/2100  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 49m 51s | Avg:  9m 58s | Max: 27m 33s | Hits:  85%/5908  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 20m 57s | Avg: 10m 28s | Max: 10m 30s | Hits:  98%/2248  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  6h 53m | Avg: 11m 28s | Max: 30m 52s | Hits:  93%/43229 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 19s | Avg:  4m 39s | Max:  4m 45s | Hits: 100%/2100  
      🟩 nvcc               Pass: 100%/43  | Total:  8h 04m | Avg: 11m 15s | Max: 30m 52s | Hits:  92%/51385 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 22m 27s | Avg:  5m 36s | Max:  5m 57s | Hits: 100%/4868  
      🟩 Clang15            Pass: 100%/2   | Total: 12m 58s | Avg:  6m 29s | Max:  6m 32s | Hits: 100%/2430  
      🟩 Clang16            Pass: 100%/2   | Total: 11m 56s | Avg:  5m 58s | Max:  6m 01s | Hits: 100%/2430  
      🟩 Clang17            Pass: 100%/2   | Total: 12m 12s | Avg:  6m 06s | Max:  6m 07s | Hits: 100%/2430  
      🟩 Clang18            Pass: 100%/7   | Total:  1h 10m | Avg: 10m 07s | Max: 22m 51s | Hits: 100%/8175  
      🟩 GCC7               Pass: 100%/2   | Total: 11m 21s | Avg:  5m 40s | Max:  5m 50s | Hits:  99%/2434  
      🟩 GCC8               Pass: 100%/1   | Total:  6m 30s | Avg:  6m 30s | Max:  6m 30s | Hits:  99%/1217  
      🟩 GCC9               Pass: 100%/2   | Total: 12m 33s | Avg:  6m 16s | Max:  6m 26s | Hits:  99%/2434  
      🟩 GCC10              Pass: 100%/2   | Total: 12m 28s | Avg:  6m 14s | Max:  6m 15s | Hits:  99%/2434  
      🟩 GCC11              Pass: 100%/2   | Total: 12m 27s | Avg:  6m 13s | Max:  6m 16s | Hits:  99%/2430  
      🟩 GCC12              Pass: 100%/2   | Total: 13m 08s | Avg:  6m 34s | Max:  6m 44s | Hits:  99%/2430  
      🟩 GCC13              Pass: 100%/11  | Total:  2h 36m | Avg: 14m 14s | Max: 23m 20s | Hits:  99%/13365 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 57m 33s | Avg: 28m 46s | Max: 30m 00s | Hits:  15%/2080  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 59m 22s | Avg: 29m 41s | Max: 30m 52s | Hits:  15%/2080  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 20m 57s | Avg: 10m 28s | Max: 10m 30s | Hits:  98%/2248  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  2h 10m | Avg:  7m 40s | Max: 22m 51s | Hits: 100%/20333 
      🟩 GCC                Pass: 100%/22  | Total:  3h 45m | Avg: 10m 13s | Max: 23m 20s | Hits:  99%/26744 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 56m | Avg: 29m 13s | Max: 30m 52s | Hits:  15%/4160  
      🟩 NVHPC              Pass: 100%/2   | Total: 20m 57s | Avg: 10m 28s | Max: 10m 30s | Hits:  98%/2248  
    🟩 gpu
      🟩 h100               Pass: 100%/3   | Total: 49m 29s | Avg: 16m 29s | Max: 23m 20s | Hits:  99%/3645  
      🟩 rtx2080            Pass: 100%/34  | Total:  5h 05m | Avg:  8m 59s | Max: 30m 52s | Hits:  91%/40120 
      🟩 rtxa6000           Pass: 100%/8   | Total:  2h 18m | Avg: 17m 17s | Max: 23m 06s | Hits:  99%/9720  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 23m | Avg:  8m 44s | Max: 30m 52s | Hits:  91%/43765 
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 46s | Avg: 20m 46s | Max: 20m 46s | Hits:  99%/1215  
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 06s | Avg: 16m 06s | Max: 16m 06s | Hits:  99%/1215  
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 09m | Avg: 23m 05s | Max: 23m 20s | Hits:  99%/3645  
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 03m | Avg: 21m 17s | Max: 21m 24s | Hits:  99%/3645  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 49m 29s | Avg: 16m 29s | Max: 23m 20s | Hits:  99%/3645  
      🟩 90;90a;100         Pass: 100%/1   | Total:  6m 54s | Avg:  6m 54s | Max:  6m 54s | Hits:  99%/1215  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 13m | Avg:  9m 39s | Max: 30m 00s | Hits:  88%/23535 
      🟩 20                 Pass: 100%/25  | Total:  5h 00m | Avg: 12m 00s | Max: 30m 52s | Hits:  96%/29950 
    
  • 🟩 thrust: Pass: 100%/45 | Total: 6h 31m | Avg: 8m 42s | Max: 33m 20s | Hits: 96%/80136

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 17m 10s | Avg:  8m 35s | Max: 10m 59s | Hits:  99%/3564  
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  6h 22m | Avg:  8m 53s | Max: 33m 20s | Hits:  96%/76573 
      🟩 arm64              Pass: 100%/2   | Total:  9m 28s | Avg:  4m 44s | Max:  5m 07s | Hits:  99%/3563  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 42m 37s | Avg:  8m 31s | Max: 23m 09s | Hits:  94%/8901  
      🟩 12.5               Pass: 100%/2   | Total: 26m 42s | Avg: 13m 21s | Max: 13m 50s | Hits:  99%/3562  
      🟩 12.8               Pass: 100%/38  | Total:  5h 22m | Avg:  8m 28s | Max: 33m 20s | Hits:  96%/67673 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 53s | Avg:  4m 56s | Max:  4m 58s | Hits: 100%/3562  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 42m 37s | Avg:  8m 31s | Max: 23m 09s | Hits:  94%/8901  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 26m 42s | Avg: 13m 21s | Max: 13m 50s | Hits:  99%/3562  
      🟩 nvcc12.8           Pass: 100%/36  | Total:  5h 12m | Avg:  8m 40s | Max: 33m 20s | Hits:  96%/64111 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 53s | Avg:  4m 56s | Max:  4m 58s | Hits: 100%/3562  
      🟩 nvcc               Pass: 100%/43  | Total:  6h 21m | Avg:  8m 52s | Max: 33m 20s | Hits:  96%/76574 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 19m 59s | Avg:  4m 59s | Max:  5m 20s | Hits: 100%/7124  
      🟩 Clang15            Pass: 100%/2   | Total: 11m 34s | Avg:  5m 47s | Max:  5m 47s | Hits: 100%/3562  
      🟩 Clang16            Pass: 100%/2   | Total: 11m 07s | Avg:  5m 33s | Max:  5m 46s | Hits: 100%/3562  
      🟩 Clang17            Pass: 100%/2   | Total: 11m 00s | Avg:  5m 30s | Max:  5m 45s | Hits: 100%/3562  
      🟩 Clang18            Pass: 100%/7   | Total: 42m 25s | Avg:  6m 03s | Max: 10m 05s | Hits: 100%/12467 
      🟩 GCC7               Pass: 100%/2   | Total: 10m 29s | Avg:  5m 14s | Max:  5m 37s | Hits:  99%/3564  
      🟩 GCC8               Pass: 100%/1   | Total:  5m 14s | Avg:  5m 14s | Max:  5m 14s | Hits:  99%/1782  
      🟩 GCC9               Pass: 100%/2   | Total: 10m 32s | Avg:  5m 16s | Max:  5m 20s | Hits:  99%/3564  
      🟩 GCC10              Pass: 100%/2   | Total: 11m 36s | Avg:  5m 48s | Max:  5m 54s | Hits:  99%/3564  
      🟩 GCC11              Pass: 100%/2   | Total: 11m 04s | Avg:  5m 32s | Max:  5m 41s | Hits:  99%/3564  
      🟩 GCC12              Pass: 100%/2   | Total: 12m 20s | Avg:  6m 10s | Max:  6m 16s | Hits:  99%/3564  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 15m | Avg:  7m 31s | Max: 11m 18s | Hits:  99%/17820 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 46m 37s | Avg: 23m 18s | Max: 23m 28s | Hits:  70%/3550  
      🟩 MSVC14.42          Pass: 100%/3   | Total:  1h 25m | Avg: 28m 35s | Max: 33m 20s | Hits:  70%/5325  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 26m 42s | Avg: 13m 21s | Max: 13m 50s | Hits:  99%/3562  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 36m | Avg:  5m 39s | Max: 10m 05s | Hits: 100%/30277 
      🟩 GCC                Pass: 100%/21  | Total:  2h 16m | Avg:  6m 29s | Max: 11m 18s | Hits:  99%/37422 
      🟩 MSVC               Pass: 100%/5   | Total:  2h 12m | Avg: 26m 28s | Max: 33m 20s | Hits:  70%/8875  
      🟩 NVHPC              Pass: 100%/2   | Total: 26m 42s | Avg: 13m 21s | Max: 13m 50s | Hits:  99%/3562  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 15m 41s | Avg:  7m 50s | Max: 10m 41s | Hits:  99%/3564  
      🟩 rtx2080            Pass: 100%/33  | Total:  4h 11m | Avg:  7m 38s | Max: 26m 38s | Hits:  97%/58769 
      🟩 rtx4090            Pass: 100%/10  | Total:  2h 04m | Avg: 12m 24s | Max: 33m 20s | Hits:  94%/17803 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  5h 00m | Avg:  7m 53s | Max: 26m 38s | Hits:  96%/67671 
      🟩 TestCPU            Pass: 100%/3   | Total: 48m 29s | Avg: 16m 09s | Max: 33m 20s | Hits:  90%/5338  
      🟩 TestGPU            Pass: 100%/4   | Total: 43m 03s | Avg: 10m 45s | Max: 11m 18s | Hits:  99%/7127  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 15m 41s | Avg:  7m 50s | Max: 10m 41s | Hits:  99%/3564  
      🟩 90;90a;100         Pass: 100%/1   | Total:  6m 34s | Avg:  6m 34s | Max:  6m 34s | Hits:  99%/1782  
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  2h 52m | Avg:  8m 37s | Max: 26m 38s | Hits:  95%/35611 
      🟩 20                 Pass: 100%/23  | Total:  3h 22m | Avg:  8m 47s | Max: 33m 20s | Hits:  97%/40961 
    
  • 🟩 libcudacxx: Pass: 100%/43 | Total: 5h 59m | Avg: 8m 21s | Max: 26m 19s | Hits: 91%/103347

    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total:  5h 50m | Avg:  8m 33s | Max: 26m 19s | Hits:  91%/97678 
      🟩 arm64              Pass: 100%/2   | Total:  8m 31s | Avg:  4m 15s | Max:  4m 34s | Hits:  96%/5669  
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 45m 41s | Avg:  9m 08s | Max: 21m 45s | Hits:  89%/13711 
      🟩 12.5               Pass: 100%/2   | Total: 17m 09s | Avg:  8m 34s | Max:  8m 44s | Hits:  97%/5614  
      🟩 12.8               Pass: 100%/36  | Total:  4h 56m | Avg:  8m 14s | Max: 26m 19s | Hits:  91%/84022 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 41m 34s | Avg: 20m 47s | Max: 22m 07s | Hits:  27%/5630  
      🟩 nvcc12.0           Pass: 100%/5   | Total: 45m 41s | Avg:  9m 08s | Max: 21m 45s | Hits:  89%/13711 
      🟩 nvcc12.5           Pass: 100%/2   | Total: 17m 09s | Avg:  8m 34s | Max:  8m 44s | Hits:  97%/5614  
      🟩 nvcc12.8           Pass: 100%/34  | Total:  4h 14m | Avg:  7m 29s | Max: 26m 19s | Hits:  96%/78392 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 41m 34s | Avg: 20m 47s | Max: 22m 07s | Hits:  27%/5630  
      🟩 nvcc               Pass: 100%/41  | Total:  5h 17m | Avg:  7m 45s | Max: 26m 19s | Hits:  95%/97717 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 25m 59s | Avg:  6m 29s | Max: 11m 46s | Hits:  87%/11230 
      🟩 Clang15            Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  4m 53s | Hits:  99%/5626  
      🟩 Clang16            Pass: 100%/2   | Total:  9m 02s | Avg:  4m 31s | Max:  4m 49s | Hits:  98%/5626  
      🟩 Clang17            Pass: 100%/2   | Total:  9m 34s | Avg:  4m 47s | Max:  4m 57s | Hits:  97%/5626  
      🟩 Clang18            Pass: 100%/6   | Total:  1h 04m | Avg: 10m 40s | Max: 22m 07s | Hits:  69%/14090 
      🟩 GCC7               Pass: 100%/2   | Total:  7m 25s | Avg:  3m 42s | Max:  3m 50s | Hits:  99%/5564  
      🟩 GCC8               Pass: 100%/1   | Total:  4m 25s | Avg:  4m 25s | Max:  4m 25s | Hits:  97%/2792  
      🟩 GCC9               Pass: 100%/2   | Total:  7m 55s | Avg:  3m 57s | Max:  4m 06s | Hits:  97%/5576  
      🟩 GCC10              Pass: 100%/2   | Total:  8m 11s | Avg:  4m 05s | Max:  4m 10s | Hits:  98%/5632  
      🟩 GCC11              Pass: 100%/2   | Total:  8m 31s | Avg:  4m 15s | Max:  4m 27s | Hits:  97%/5628  
      🟩 GCC12              Pass: 100%/2   | Total:  8m 49s | Avg:  4m 24s | Max:  4m 25s | Hits:  97%/5628  
      🟩 GCC13              Pass: 100%/10  | Total:  1h 22m | Avg:  8m 12s | Max: 16m 30s | Hits:  97%/14351 
      🟩 MSVC14.29          Pass: 100%/2   | Total: 48m 04s | Avg: 24m 02s | Max: 26m 19s | Hits:  70%/5102  
      🟩 MSVC14.42          Pass: 100%/2   | Total: 48m 31s | Avg: 24m 15s | Max: 25m 04s | Hits:  98%/5262  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 17m 09s | Avg:  8m 34s | Max:  8m 44s | Hits:  97%/5614  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/16  | Total:  1h 58m | Avg:  7m 23s | Max: 22m 07s | Hits:  85%/42198 
      🟩 GCC                Pass: 100%/21  | Total:  2h 07m | Avg:  6m 03s | Max: 16m 30s | Hits:  97%/45171 
      🟩 MSVC               Pass: 100%/4   | Total:  1h 36m | Avg: 24m 08s | Max: 26m 19s | Hits:  84%/10364 
      🟩 NVHPC              Pass: 100%/2   | Total: 17m 09s | Avg:  8m 34s | Max:  8m 44s | Hits:  97%/5614  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 16m 19s | Avg:  8m 09s | Max: 11m 58s | Hits:  99%/2924  
      🟩 rtx2080            Pass: 100%/41  | Total:  5h 43m | Avg:  8m 22s | Max: 26m 19s | Hits:  91%/100423
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  4h 50m | Avg:  7m 51s | Max: 26m 19s | Hits:  91%/103307
      🟩 NVRTC              Pass: 100%/2   | Total: 31m 59s | Avg: 15m 59s | Max: 16m 30s | Hits:  90%/40    
      🟩 Test               Pass: 100%/3   | Total: 34m 41s | Avg: 11m 33s | Max: 13m 32s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 12s | Avg:  2m 12s | Max:  2m 12s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 31m 59s | Avg: 15m 59s | Max: 16m 30s | Hits:  90%/40    
      🟩 90                 Pass: 100%/2   | Total: 16m 19s | Avg:  8m 09s | Max: 11m 58s | Hits:  99%/2924  
      🟩 90;90a;100         Pass: 100%/1   | Total:  4m 33s | Avg:  4m 33s | Max:  4m 33s | Hits:  99%/2924  
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  3h 07m | Avg:  8m 54s | Max: 26m 19s | Hits:  90%/55106 
      🟩 20                 Pass: 100%/21  | Total:  2h 50m | Avg:  8m 06s | Max: 25m 04s | Hits:  93%/48241 
    
  • 🟩 cudax: Pass: 100%/22 | Total: 1h 53m | Avg: 5m 10s | Max: 13m 11s | Hits: 97%/11264

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  1h 42m | Avg:  5m 42s | Max: 13m 11s | Hits:  97%/9036  
      🟩 arm64              Pass: 100%/4   | Total: 11m 09s | Avg:  2m 47s | Max:  2m 56s | Hits:  99%/2228  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 10m 20s | Avg: 10m 20s | Max: 10m 20s | Hits:  61%/262   
      🟩 12.5               Pass: 100%/2   | Total: 10m 19s | Avg:  5m 09s | Max:  5m 12s | Hits:  96%/710   
      🟩 12.8               Pass: 100%/19  | Total:  1h 33m | Avg:  4m 54s | Max: 13m 11s | Hits:  98%/10292 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 20s | Avg: 10m 20s | Max: 10m 20s | Hits:  61%/262   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 10m 19s | Avg:  5m 09s | Max:  5m 12s | Hits:  96%/710   
      🟩 nvcc12.8           Pass: 100%/19  | Total:  1h 33m | Avg:  4m 54s | Max: 13m 11s | Hits:  98%/10292 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  1h 53m | Avg:  5m 10s | Max: 13m 11s | Hits:  97%/11264 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  3m 14s | Avg:  3m 14s | Max:  3m 14s | Hits: 100%/559   
      🟩 Clang15            Pass: 100%/1   | Total:  3m 27s | Avg:  3m 27s | Max:  3m 27s | Hits: 100%/557   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 17s | Avg:  3m 17s | Max:  3m 17s | Hits: 100%/557   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 21s | Avg:  3m 21s | Max:  3m 21s | Hits: 100%/557   
      🟩 Clang18            Pass: 100%/4   | Total: 20m 50s | Avg:  5m 12s | Max: 11m 59s | Hits: 100%/2228  
      🟩 GCC10              Pass: 100%/1   | Total:  3m 05s | Avg:  3m 05s | Max:  3m 05s | Hits:  99%/559   
      🟩 GCC11              Pass: 100%/1   | Total:  3m 09s | Avg:  3m 09s | Max:  3m 09s | Hits:  99%/557   
      🟩 GCC12              Pass: 100%/2   | Total: 16m 41s | Avg:  8m 20s | Max: 13m 11s | Hits:  99%/1114  
      🟩 GCC13              Pass: 100%/6   | Total: 25m 49s | Avg:  4m 18s | Max: 11m 15s | Hits:  99%/3342  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 20s | Avg: 10m 20s | Max: 10m 20s | Hits:  61%/262   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 16s | Avg: 10m 16s | Max: 10m 16s | Hits:  61%/262   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 10m 19s | Avg:  5m 09s | Max:  5m 12s | Hits:  96%/710   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 34m 09s | Avg:  4m 16s | Max: 11m 59s | Hits: 100%/4458  
      🟩 GCC                Pass: 100%/10  | Total: 48m 44s | Avg:  4m 52s | Max: 13m 11s | Hits:  99%/5572  
      🟩 MSVC               Pass: 100%/2   | Total: 20m 36s | Avg: 10m 18s | Max: 10m 20s | Hits:  61%/524   
      🟩 NVHPC              Pass: 100%/2   | Total: 10m 19s | Avg:  5m 09s | Max:  5m 12s | Hits:  96%/710   
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 14m 14s | Avg:  7m 07s | Max: 11m 15s | Hits:  99%/1114  
      🟩 rtx2080            Pass: 100%/20  | Total:  1h 39m | Avg:  4m 58s | Max: 13m 11s | Hits:  97%/10150 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  1h 17m | Avg:  4m 04s | Max: 10m 20s | Hits:  97%/9593  
      🟩 Test               Pass: 100%/3   | Total: 36m 25s | Avg: 12m 08s | Max: 13m 11s | Hits:  99%/1671  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 17m 10s | Avg:  5m 43s | Max: 11m 15s | Hits:  99%/1671  
      🟩 90a                Pass: 100%/1   | Total:  2m 58s | Avg:  2m 58s | Max:  2m 58s | Hits:  99%/557   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 13m 36s | Avg:  3m 24s | Max:  5m 07s | Hits:  99%/2026  
      🟩 20                 Pass: 100%/18  | Total:  1h 40m | Avg:  5m 34s | Max: 13m 11s | Hits:  97%/9238  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 14m 26s | Avg: 7m 13s | Max: 12m 13s | Hits: 98%/308

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 14m 26s | Avg:  7m 13s | Max: 12m 13s | Hits:  98%/308   
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total: 14m 26s | Avg:  7m 13s | Max: 12m 13s | Hits:  98%/308   
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total: 14m 26s | Avg:  7m 13s | Max: 12m 13s | Hits:  98%/308   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 14m 26s | Avg:  7m 13s | Max: 12m 13s | Hits:  98%/308   
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 14m 26s | Avg:  7m 13s | Max: 12m 13s | Hits:  98%/308   
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 14m 26s | Avg:  7m 13s | Max: 12m 13s | Hits:  98%/308   
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total: 14m 26s | Avg:  7m 13s | Max: 12m 13s | Hits:  98%/308   
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 13s | Avg:  2m 13s | Max:  2m 13s | Hits:  98%/154   
      🟩 Test               Pass: 100%/1   | Total: 12m 13s | Avg: 12m 13s | Max: 12m 13s | Hits:  98%/154   
    
  • 🟩 python: Pass: 100%/1 | Total: 40m 14s | Avg: 40m 14s | Max: 40m 14s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 40m 14s | Avg: 40m 14s | Max: 40m 14s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 40m 14s | Avg: 40m 14s | Max: 40m 14s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 40m 14s | Avg: 40m 14s | Max: 40m 14s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 40m 14s | Avg: 40m 14s | Max: 40m 14s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 40m 14s | Avg: 40m 14s | Max: 40m 14s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 40m 14s | Avg: 40m 14s | Max: 40m 14s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 40m 14s | Avg: 40m 14s | Max: 40m 14s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 40m 14s | Avg: 40m 14s | Max: 40m 14s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 158)

# Runner
111 linux-amd64-cpu16
15 windows-amd64-cpu16
10 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
5 linux-amd64-gpu-h100-latest-1
3 linux-amd64-gpu-rtx4090-latest-1

@fbusato fbusato added the 3.0 Targeted for 3.0 release label Feb 27, 2025
const _Tp& __data, int __src_lane, uint32_t __lane_mask = 0xFFFFFFFF, _CUDA_VSTD::integral_constant<int, _Width> = {})
{
constexpr auto __warp_size = 32u;
constexpr bool __is_void_ptr = _CUDA_VSTD::is_same_v<_Up, void*> || _CUDA_VSTD::is_same_v<_Up, const void*>;
Collaborator

You already removed the qualifiers, so the second check looks redundant:

Suggested change
- constexpr bool __is_void_ptr = _CUDA_VSTD::is_same_v<_Up, void*> || _CUDA_VSTD::is_same_v<_Up, const void*>;
+ constexpr bool __is_void_ptr = _CUDA_VSTD::is_same_v<_Up, void*>;

Maybe this can be inlined now

Contributor Author

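No, we need both. Removing the qualifiers strips only the top-level `const` on the pointer itself, so `_Up` can still be a mutable pointer to const void (`const void*`); what gets stripped is the trailing `const` of `const void* const`.
See https://godbolt.org/z/chezM9KaT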
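
A minimal sketch of the point above, written with plain `std::` traits rather than the PR's `_CUDA_VSTD` aliases. The helper names `matches_both` and `matches_single` are hypothetical and exist only for illustration; the assumption is that `_Up` is obtained by removing cv-qualifiers from the input type, as the review comment implies.

```cpp
#include <type_traits>

// Hypothetical helper mirroring the two-sided check discussed in the review.
// Assumption: the qualifiers are removed with remove_cv before the comparison.
template <typename T>
constexpr bool matches_both()
{
  using U = std::remove_cv_t<T>;
  return std::is_same_v<U, void*> || std::is_same_v<U, const void*>;
}

// Hypothetical helper mirroring the suggested single-sided check.
template <typename T>
constexpr bool matches_single()
{
  using U = std::remove_cv_t<T>;
  return std::is_same_v<U, void*>;
}

// remove_cv only strips the top-level const (the one on the pointer itself).
static_assert(std::is_same_v<std::remove_cv_t<const void* const>, const void*>,
              "pointee const survives remove_cv");
static_assert(std::is_same_v<std::remove_cv_t<void* const>, void*>,
              "top-level const is stripped");

// The single-sided check misses pointers to const void.
static_assert(matches_both<const void*>(), "two-sided check accepts const void*");
static_assert(!matches_single<const void*>(), "single-sided check rejects const void*");
```

Under these assumptions, dropping the `const void*` comparison would let pointers to const void slip past the check, which is the case the reply describes.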

@fbusato fbusato requested a review from miscco February 27, 2025 17:39
Labels
3.0 Targeted for 3.0 release
Projects
Status: In Review
Development

Successfully merging this pull request may close these issues.

[FEA]: Provide generic and safe C++ interfaces for warp shuffle
3 participants