Backport to 2.8: PTX: fix cp.async.bulk.tensor and mbarrier.arrive (#3628) #3630

Merged

@bernhardmgruber commented Jan 31, 2025

🟩 CI finished in 1h 31m: Pass: 100%/169 | Total: 2d 21h | Avg: 24m 37s | Max: 1h 13m | Hits: 456%/20880
  • 🟩 libcudacxx: Pass: 100%/48 | Total: 8h 26m | Avg: 10m 33s | Max: 35m 29s | Hits: 678%/10028

    🟩 cpu
      🟩 amd64              Pass: 100%/46  | Total:  8h 18m | Avg: 10m 50s | Max: 35m 29s | Hits: 678%/10028 
      🟩 arm64              Pass: 100%/2   | Total:  7m 55s | Avg:  3m 57s | Max:  4m 13s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total: 53m 21s | Avg:  7m 37s | Max: 25m 01s | Hits: 682%/2324  
      🟩 12.5               Pass: 100%/2   | Total:  1h 05m | Avg: 32m 55s | Max: 35m 29s
      🟩 12.6               Pass: 100%/39  | Total:  6h 27m | Avg:  9m 56s | Max: 29m 22s | Hits: 676%/7704  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 07m | Avg: 16m 57s | Max: 20m 33s
      🟩 nvcc11.1           Pass: 100%/7   | Total: 53m 21s | Avg:  7m 37s | Max: 25m 01s | Hits: 682%/2324  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 05m | Avg: 32m 55s | Max: 35m 29s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  5h 19m | Avg:  9m 08s | Max: 29m 22s | Hits: 676%/7704  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 07m | Avg: 16m 57s | Max: 20m 33s
      🟩 nvcc               Pass: 100%/44  | Total:  7h 19m | Avg:  9m 58s | Max: 35m 29s | Hits: 678%/10028 
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 16m 19s | Avg:  4m 04s | Max:  5m 21s
      🟩 Clang10            Pass: 100%/1   | Total:  5m 29s | Avg:  5m 29s | Max:  5m 29s
      🟩 Clang11            Pass: 100%/1   | Total:  4m 37s | Avg:  4m 37s | Max:  4m 37s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 33s | Avg:  4m 33s | Max:  4m 33s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 27s | Avg:  4m 27s | Max:  4m 27s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 41s | Avg:  4m 41s | Max:  4m 41s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 43s | Avg:  4m 43s | Max:  4m 43s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 34s | Avg:  4m 34s | Max:  4m 34s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 48s | Avg:  4m 48s | Max:  4m 48s
      🟩 Clang18            Pass: 100%/8   | Total:  1h 30m | Avg: 11m 20s | Max: 20m 33s
      🟩 GCC6               Pass: 100%/2   | Total: 15m 53s | Avg:  7m 56s | Max: 12m 59s
      🟩 GCC7               Pass: 100%/2   | Total:  7m 30s | Avg:  3m 45s | Max:  4m 08s
      🟩 GCC8               Pass: 100%/1   | Total:  4m 04s | Avg:  4m 04s | Max:  4m 04s
      🟩 GCC9               Pass: 100%/3   | Total: 27m 15s | Avg:  9m 05s | Max: 21m 28s
      🟩 GCC10              Pass: 100%/1   | Total: 20m 02s | Avg: 20m 02s | Max: 20m 02s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s
      🟩 GCC12              Pass: 100%/1   | Total:  4m 22s | Avg:  4m 22s | Max:  4m 22s
      🟩 GCC13              Pass: 100%/10  | Total:  1h 37m | Avg:  9m 45s | Max: 19m 08s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  6m 12s | Avg:  6m 12s | Max:  6m 12s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 25m 01s | Avg: 25m 01s | Max: 25m 01s | Hits: 682%/2324  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 27m 13s | Avg: 27m 13s | Max: 27m 13s | Hits: 677%/2519  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 57m 05s | Avg: 28m 32s | Max: 29m 22s | Hits: 676%/5185  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 05m | Avg: 32m 55s | Max: 35m 29s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/20  | Total:  2h 24m | Avg:  7m 14s | Max: 20m 33s
      🟩 GCC                Pass: 100%/21  | Total:  3h 00m | Avg:  8m 35s | Max: 21m 28s
      🟩 Intel              Pass: 100%/1   | Total:  6m 12s | Avg:  6m 12s | Max:  6m 12s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 49m | Avg: 27m 19s | Max: 29m 22s | Hits: 678%/10028 
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 05m | Avg: 32m 55s | Max: 35m 29s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/8   | Total:  1h 34m | Avg: 11m 48s | Max: 19m 08s
      🟩 v100               Pass: 100%/40  | Total:  6h 52m | Avg: 10m 18s | Max: 35m 29s | Hits: 678%/10028 
    🟩 jobs
      🟩 Build              Pass: 100%/41  | Total:  6h 59m | Avg: 10m 13s | Max: 35m 29s | Hits: 678%/10028 
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 07m | Avg: 16m 45s | Max: 19m 08s
      🟩 Test               Pass: 100%/2   | Total: 18m 16s | Avg:  9m 08s | Max:  9m 20s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 06s | Avg:  2m 06s | Max:  2m 06s
    🟩 sm
      🟩 75                 Pass: 100%/4   | Total:  1h 07m | Avg: 16m 45s | Max: 19m 08s
      🟩 90                 Pass: 100%/1   | Total: 14m 56s | Avg: 14m 56s | Max: 14m 56s
      🟩 90a                Pass: 100%/2   | Total: 20m 31s | Avg: 10m 15s | Max: 13m 11s
    🟩 std
      🟩 11                 Pass: 100%/6   | Total: 32m 18s | Avg:  5m 23s | Max: 16m 07s
      🟩 14                 Pass: 100%/5   | Total:  1h 02m | Avg: 12m 25s | Max: 25m 01s | Hits: 682%/2324  
      🟩 17                 Pass: 100%/13  | Total:  2h 56m | Avg: 13m 33s | Max: 30m 22s | Hits: 677%/5038  
      🟩 20                 Pass: 100%/23  | Total:  3h 54m | Avg: 10m 10s | Max: 35m 29s | Hits: 676%/2666  
    
  • 🟩 cub: Pass: 100%/47 | Total: 1d 11h | Avg: 45m 23s | Max: 1h 13m | Hits: 278%/3132

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  1d 09h | Avg: 44m 52s | Max:  1h 13m | Hits: 278%/3132  
      🟩 arm64              Pass: 100%/2   | Total:  1h 53m | Avg: 56m 46s | Max: 56m 49s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  1h 04m | Avg:  9m 14s | Max: 38m 11s | Hits: 599%/783   
      🟩 12.5               Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 09m
      🟩 12.6               Pass: 100%/38  | Total:  1d 08h | Avg: 50m 55s | Max:  1h 13m | Hits: 171%/2349  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 00m
      🟩 nvcc11.1           Pass: 100%/7   | Total:  1h 04m | Avg:  9m 14s | Max: 38m 11s | Hits: 599%/783   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 09m
      🟩 nvcc12.6           Pass: 100%/36  | Total:  1d 06h | Avg: 50m 24s | Max:  1h 13m | Hits: 171%/2349  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 00m
      🟩 nvcc               Pass: 100%/45  | Total:  1d 09h | Avg: 44m 43s | Max:  1h 13m | Hits: 278%/3132  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  2h 10m | Avg: 32m 36s | Max:  1h 01m
      🟩 Clang10            Pass: 100%/1   | Total: 57m 46s | Avg: 57m 46s | Max: 57m 46s
      🟩 Clang11            Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m
      🟩 Clang12            Pass: 100%/1   | Total: 56m 14s | Avg: 56m 14s | Max: 56m 14s
      🟩 Clang13            Pass: 100%/1   | Total: 59m 54s | Avg: 59m 54s | Max: 59m 54s
      🟩 Clang14            Pass: 100%/1   | Total: 54m 06s | Avg: 54m 06s | Max: 54m 06s
      🟩 Clang15            Pass: 100%/1   | Total: 56m 24s | Avg: 56m 24s | Max: 56m 24s
      🟩 Clang16            Pass: 100%/1   | Total: 55m 42s | Avg: 55m 42s | Max: 55m 42s
      🟩 Clang17            Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m
      🟩 Clang18            Pass: 100%/7   | Total:  5h 32m | Avg: 47m 28s | Max:  1h 00m
      🟩 GCC6               Pass: 100%/2   | Total:  8m 45s | Avg:  4m 22s | Max:  4m 27s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 52m | Avg: 56m 10s | Max: 57m 13s
      🟩 GCC8               Pass: 100%/1   | Total: 56m 05s | Avg: 56m 05s | Max: 56m 05s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 09m | Avg: 23m 19s | Max:  1h 01m
      🟩 GCC10              Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
      🟩 GCC11              Pass: 100%/1   | Total: 59m 01s | Avg: 59m 01s | Max: 59m 01s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 52m | Avg: 37m 39s | Max:  1h 04m
      🟩 GCC13              Pass: 100%/8   | Total:  4h 39m | Avg: 34m 59s | Max:  1h 05m
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m
      🟩 MSVC14.16          Pass: 100%/1   | Total: 38m 11s | Avg: 38m 11s | Max: 38m 11s | Hits: 599%/783   
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 13m | Avg:  1h 13m | Max:  1h 13m | Hits: 172%/783   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 08m | Hits: 171%/1566  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 09m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 15h 27m | Avg: 48m 49s | Max:  1h 02m
      🟩 GCC                Pass: 100%/21  | Total: 12h 39m | Avg: 36m 10s | Max:  1h 05m
      🟩 Intel              Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 05m | Avg:  1h 01m | Max:  1h 13m | Hits: 278%/3132  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 09m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 48m 45s | Avg: 24m 22s | Max: 25m 02s
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 53m | Avg: 29m 14s | Max:  1h 05m
      🟩 v100               Pass: 100%/37  | Total:  1d 06h | Avg: 50m 00s | Max:  1h 13m | Hits: 278%/3132  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 09h | Avg: 49m 53s | Max:  1h 13m | Hits: 278%/3132  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 16m 18s | Avg: 16m 18s | Max: 16m 18s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 46s | Avg: 14m 46s | Max: 14m 46s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 07m | Avg: 22m 35s | Max: 23m 43s
      🟩 TestGPU            Pass: 100%/2   | Total: 38m 29s | Avg: 19m 14s | Max: 19m 28s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 48m 45s | Avg: 24m 22s | Max: 25m 02s
      🟩 90a                Pass: 100%/1   | Total: 27m 42s | Avg: 27m 42s | Max: 27m 42s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  2h 10m | Avg: 26m 00s | Max:  1h 00m
      🟩 14                 Pass: 100%/4   | Total:  2h 39m | Avg: 39m 45s | Max:  1h 01m | Hits: 599%/783   
      🟩 17                 Pass: 100%/12  | Total: 10h 36m | Avg: 53m 03s | Max:  1h 13m | Hits: 172%/1566  
      🟩 20                 Pass: 100%/26  | Total: 20h 07m | Avg: 46m 26s | Max:  1h 08m | Hits: 171%/783   
    
  • 🟩 thrust: Pass: 100%/45 | Total: 22h 22m | Avg: 29m 50s | Max: 1h 02m | Hits: 226%/7408

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 39m 49s | Avg: 19m 54s | Max: 28m 47s
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 21h 14m | Avg: 29m 38s | Max:  1h 02m | Hits: 226%/7408  
      🟩 arm64              Pass: 100%/2   | Total:  1h 08m | Avg: 34m 03s | Max: 37m 51s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total: 50m 28s | Avg:  7m 12s | Max: 24m 17s | Hits: 368%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  1h 49m | Avg: 54m 59s | Max: 56m 46s
      🟩 12.6               Pass: 100%/36  | Total: 19h 42m | Avg: 32m 50s | Max:  1h 02m | Hits: 178%/5556  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 00m | Avg: 30m 14s | Max: 31m 37s
      🟩 nvcc11.1           Pass: 100%/7   | Total: 50m 28s | Avg:  7m 12s | Max: 24m 17s | Hits: 368%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 49m | Avg: 54m 59s | Max: 56m 46s
      🟩 nvcc12.6           Pass: 100%/34  | Total: 18h 41m | Avg: 32m 59s | Max:  1h 02m | Hits: 178%/5556  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 00m | Avg: 30m 14s | Max: 31m 37s
      🟩 nvcc               Pass: 100%/43  | Total: 21h 22m | Avg: 29m 48s | Max:  1h 02m | Hits: 226%/7408  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 09m | Avg: 17m 15s | Max: 30m 58s
      🟩 Clang10            Pass: 100%/1   | Total: 37m 54s | Avg: 37m 54s | Max: 37m 54s
      🟩 Clang11            Pass: 100%/1   | Total: 32m 40s | Avg: 32m 40s | Max: 32m 40s
      🟩 Clang12            Pass: 100%/1   | Total: 35m 26s | Avg: 35m 26s | Max: 35m 26s
      🟩 Clang13            Pass: 100%/1   | Total: 35m 38s | Avg: 35m 38s | Max: 35m 38s
      🟩 Clang14            Pass: 100%/1   | Total: 35m 34s | Avg: 35m 34s | Max: 35m 34s
      🟩 Clang15            Pass: 100%/1   | Total: 36m 17s | Avg: 36m 17s | Max: 36m 17s
      🟩 Clang16            Pass: 100%/1   | Total: 35m 28s | Avg: 35m 28s | Max: 35m 28s
      🟩 Clang17            Pass: 100%/1   | Total: 34m 07s | Avg: 34m 07s | Max: 34m 07s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 53m | Avg: 24m 50s | Max: 34m 20s
      🟩 GCC6               Pass: 100%/2   | Total:  8m 35s | Avg:  4m 17s | Max:  4m 39s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 04m | Avg: 32m 29s | Max: 37m 32s
      🟩 GCC8               Pass: 100%/1   | Total: 37m 51s | Avg: 37m 51s | Max: 37m 51s
      🟩 GCC9               Pass: 100%/3   | Total: 48m 03s | Avg: 16m 01s | Max: 39m 11s
      🟩 GCC10              Pass: 100%/1   | Total: 37m 58s | Avg: 37m 58s | Max: 37m 58s
      🟩 GCC11              Pass: 100%/1   | Total: 34m 09s | Avg: 34m 09s | Max: 34m 09s
      🟩 GCC12              Pass: 100%/1   | Total: 35m 18s | Avg: 35m 18s | Max: 35m 18s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 10m | Avg: 23m 49s | Max: 39m 01s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 45m 12s | Avg: 45m 12s | Max: 45m 12s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 24m 17s | Avg: 24m 17s | Max: 24m 17s | Hits: 368%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m | Hits: 178%/1852  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 56m | Avg: 58m 21s | Max:  1h 00m | Hits: 178%/3704  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 49m | Avg: 54m 59s | Max: 56m 46s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  8h 46m | Avg: 27m 41s | Max: 37m 54s
      🟩 GCC                Pass: 100%/19  | Total:  7h 37m | Avg: 24m 04s | Max: 39m 11s
      🟩 Intel              Pass: 100%/1   | Total: 45m 12s | Avg: 45m 12s | Max: 45m 12s
      🟩 MSVC               Pass: 100%/4   | Total:  3h 23m | Avg: 50m 57s | Max:  1h 02m | Hits: 226%/7408  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 49m | Avg: 54m 59s | Max: 56m 46s
    🟩 gpu
      🟩 rtx4090            Pass: 100%/8   | Total:  2h 26m | Avg: 18m 15s | Max: 35m 27s
      🟩 v100               Pass: 100%/37  | Total: 19h 56m | Avg: 32m 20s | Max:  1h 02m | Hits: 226%/7408  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total: 21h 35m | Avg: 32m 22s | Max:  1h 02m | Hits: 226%/7408  
      🟩 TestCPU            Pass: 100%/2   | Total: 14m 57s | Avg:  7m 28s | Max:  7m 35s
      🟩 TestGPU            Pass: 100%/3   | Total: 32m 30s | Avg: 10m 50s | Max: 11m 05s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 20m 02s | Avg: 20m 02s | Max: 20m 02s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  1h 09m | Avg: 13m 49s | Max: 29m 21s
      🟩 14                 Pass: 100%/4   | Total:  1h 37m | Avg: 24m 21s | Max: 37m 32s | Hits: 368%/1852  
      🟩 17                 Pass: 100%/12  | Total:  7h 20m | Avg: 36m 40s | Max:  1h 02m | Hits: 178%/3704  
      🟩 20                 Pass: 100%/22  | Total: 11h 36m | Avg: 31m 38s | Max:  1h 00m | Hits: 178%/1852  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 26m | Avg: 5m 37s | Max: 16m 32s | Hits: 574%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  2h 11m | Avg:  5m 57s | Max: 16m 32s | Hits: 574%/312   
      🟩 arm64              Pass: 100%/4   | Total: 15m 17s | Avg:  3m 49s | Max:  4m 04s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 20m 29s | Avg:  6m 49s | Max: 12m 19s | Hits: 574%/156   
      🟩 12.5               Pass: 100%/2   | Total: 12m 58s | Avg:  6m 29s | Max:  6m 39s
      🟩 12.6               Pass: 100%/21  | Total:  1h 52m | Avg:  5m 22s | Max: 16m 32s | Hits: 575%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 20m 29s | Avg:  6m 49s | Max: 12m 19s | Hits: 574%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 12m 58s | Avg:  6m 29s | Max:  6m 39s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 52m | Avg:  5m 22s | Max: 16m 32s | Hits: 575%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 26m | Avg:  5m 37s | Max: 16m 32s | Hits: 574%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  4m 00s | Avg:  4m 00s | Max:  4m 00s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 43s | Avg:  4m 43s | Max:  4m 43s
      🟩 Clang11            Pass: 100%/1   | Total:  4m 04s | Avg:  4m 04s | Max:  4m 04s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 04s | Avg:  4m 04s | Max:  4m 04s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 06s | Avg:  4m 06s | Max:  4m 06s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 05s | Avg:  4m 05s | Max:  4m 05s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 08s | Avg:  4m 08s | Max:  4m 08s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 13s | Avg:  4m 13s | Max:  4m 13s
      🟩 Clang17            Pass: 100%/1   | Total:  3m 52s | Avg:  3m 52s | Max:  3m 52s
      🟩 Clang18            Pass: 100%/4   | Total: 23m 34s | Avg:  5m 53s | Max: 12m 26s
      🟩 GCC9               Pass: 100%/1   | Total:  4m 10s | Avg:  4m 10s | Max:  4m 10s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 03s | Avg:  4m 03s | Max:  4m 03s
      🟩 GCC11              Pass: 100%/1   | Total:  4m 18s | Avg:  4m 18s | Max:  4m 18s
      🟩 GCC12              Pass: 100%/2   | Total: 20m 37s | Avg: 10m 18s | Max: 16m 32s
      🟩 GCC13              Pass: 100%/4   | Total: 14m 58s | Avg:  3m 44s | Max:  4m 04s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 12m 19s | Avg: 12m 19s | Max: 12m 19s | Hits: 574%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 12m 13s | Avg: 12m 13s | Max: 12m 13s | Hits: 575%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 12m 58s | Avg:  6m 29s | Max:  6m 39s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total:  1h 00m | Avg:  4m 40s | Max: 12m 26s
      🟩 GCC                Pass: 100%/9   | Total: 48m 06s | Avg:  5m 20s | Max: 16m 32s
      🟩 MSVC               Pass: 100%/2   | Total: 24m 32s | Avg: 12m 16s | Max: 12m 19s | Hits: 574%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 12m 58s | Avg:  6m 29s | Max:  6m 39s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 36m 59s | Avg:  9m 14s | Max: 16m 32s
      🟩 v100               Pass: 100%/22  | Total:  1h 49m | Avg:  4m 58s | Max: 12m 19s | Hits: 574%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 57m | Avg:  4m 53s | Max: 12m 19s | Hits: 574%/312   
      🟩 Test               Pass: 100%/2   | Total: 28m 58s | Avg: 14m 29s | Max: 16m 32s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 20s | Avg:  3m 20s | Max:  3m 20s
      🟩 90a                Pass: 100%/1   | Total:  3m 33s | Avg:  3m 33s | Max:  3m 33s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 25m 49s | Avg:  4m 18s | Max:  6m 39s
      🟩 20                 Pass: 100%/20  | Total:  2h 00m | Avg:  6m 01s | Max: 16m 32s | Hits: 574%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 7m 33s | Avg: 3m 46s | Max: 5m 21s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  7m 33s | Avg:  3m 46s | Max:  5m 21s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  7m 33s | Avg:  3m 46s | Max:  5m 21s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  7m 33s | Avg:  3m 46s | Max:  5m 21s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  7m 33s | Avg:  3m 46s | Max:  5m 21s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  7m 33s | Avg:  3m 46s | Max:  5m 21s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  7m 33s | Avg:  3m 46s | Max:  5m 21s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  7m 33s | Avg:  3m 46s | Max:  5m 21s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 12s | Avg:  2m 12s | Max:  2m 12s
      🟩 Test               Pass: 100%/1   | Total:  5m 21s | Avg:  5m 21s | Max:  5m 21s
    
  • 🟩 python: Pass: 100%/1 | Total: 24m 28s | Avg: 24m 28s | Max: 24m 28s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 24m 28s | Avg: 24m 28s | Max: 24m 28s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 24m 28s | Avg: 24m 28s | Max: 24m 28s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 24m 28s | Avg: 24m 28s | Max: 24m 28s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 24m 28s | Avg: 24m 28s | Max: 24m 28s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 24m 28s | Avg: 24m 28s | Max: 24m 28s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 24m 28s | Avg: 24m 28s | Max: 24m 28s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 24m 28s | Avg: 24m 28s | Max: 24m 28s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 24m 28s | Avg: 24m 28s | Max: 24m 28s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 169)

# Runner
125 linux-amd64-cpu16
14 windows-amd64-cpu16
10 linux-amd64-gpu-rtx2080-latest-1
10 linux-arm64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
1 linux-amd64-gpu-h100-latest-1

Fix PTX ISA version for cp.async.bulk.tensor
Fix parameter order for mbarrier.arrive
@bernhardmgruber bernhardmgruber enabled auto-merge (squash) January 31, 2025 19:10
🟩 CI finished in 2h 02m: Pass: 100%/169 | Total: 2d 19h | Avg: 23m 58s | Max: 1h 13m | Hits: 458%/20880
  • 🟩 libcudacxx: Pass: 100%/48 | Total: 7h 43m | Avg: 9m 39s | Max: 33m 46s | Hits: 682%/10028

    🟩 cpu
      🟩 amd64              Pass: 100%/46  | Total:  7h 36m | Avg:  9m 54s | Max: 33m 46s | Hits: 682%/10028 
      🟩 arm64              Pass: 100%/2   | Total:  7m 38s | Avg:  3m 49s | Max:  4m 09s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  1h 01m | Avg:  8m 48s | Max: 25m 44s | Hits: 682%/2324  
      🟩 12.5               Pass: 100%/2   | Total: 42m 14s | Avg: 21m 07s | Max: 33m 46s
      🟩 12.6               Pass: 100%/39  | Total:  5h 59m | Avg:  9m 13s | Max: 32m 16s | Hits: 682%/7704  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 10m | Avg: 17m 40s | Max: 23m 12s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  1h 01m | Avg:  8m 48s | Max: 25m 44s | Hits: 682%/2324  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 42m 14s | Avg: 21m 07s | Max: 33m 46s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  4h 49m | Avg:  8m 15s | Max: 32m 16s | Hits: 682%/7704  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 10m | Avg: 17m 40s | Max: 23m 12s
      🟩 nvcc               Pass: 100%/44  | Total:  6h 32m | Avg:  8m 55s | Max: 33m 46s | Hits: 682%/10028 
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total: 15m 47s | Avg:  3m 56s | Max:  5m 05s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 44s | Avg:  4m 44s | Max:  4m 44s
      🟩 Clang11            Pass: 100%/1   | Total:  4m 32s | Avg:  4m 32s | Max:  4m 32s
      🟩 Clang12            Pass: 100%/1   | Total: 13m 30s | Avg: 13m 30s | Max: 13m 30s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 14s | Avg:  4m 14s | Max:  4m 14s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 41s | Avg:  4m 41s | Max:  4m 41s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 24s | Avg:  4m 24s | Max:  4m 24s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 34s | Avg:  4m 34s | Max:  4m 34s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 49s | Avg:  4m 49s | Max:  4m 49s
      🟩 Clang18            Pass: 100%/8   | Total:  1h 33m | Avg: 11m 44s | Max: 23m 12s
      🟩 GCC6               Pass: 100%/2   | Total: 23m 37s | Avg: 11m 48s | Max: 20m 33s
      🟩 GCC7               Pass: 100%/2   | Total:  6m 57s | Avg:  3m 28s | Max:  3m 39s
      🟩 GCC8               Pass: 100%/1   | Total:  3m 55s | Avg:  3m 55s | Max:  3m 55s
      🟩 GCC9               Pass: 100%/3   | Total: 10m 17s | Avg:  3m 25s | Max:  4m 16s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 14s | Avg:  4m 14s | Max:  4m 14s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 58s | Avg:  3m 58s | Max:  3m 58s
      🟩 GCC12              Pass: 100%/1   | Total:  4m 11s | Avg:  4m 11s | Max:  4m 11s
      🟩 GCC13              Pass: 100%/10  | Total:  1h 32m | Avg:  9m 15s | Max: 18m 24s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  6m 24s | Avg:  6m 24s | Max:  6m 24s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 25m 44s | Avg: 25m 44s | Max: 25m 44s | Hits: 682%/2324  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 26m 44s | Avg: 26m 44s | Max: 26m 44s | Hits: 682%/2519  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 57m 38s | Avg: 28m 49s | Max: 32m 16s | Hits: 682%/5185  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 42m 14s | Avg: 21m 07s | Max: 33m 46s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/20  | Total:  2h 35m | Avg:  7m 45s | Max: 23m 12s
      🟩 GCC                Pass: 100%/21  | Total:  2h 29m | Avg:  7m 07s | Max: 20m 33s
      🟩 Intel              Pass: 100%/1   | Total:  6m 24s | Avg:  6m 24s | Max:  6m 24s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 50m | Avg: 27m 31s | Max: 32m 16s | Hits: 682%/10028 
      🟩 NVHPC              Pass: 100%/2   | Total: 42m 14s | Avg: 21m 07s | Max: 33m 46s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/8   | Total:  1h 33m | Avg: 11m 41s | Max: 18m 24s
      🟩 v100               Pass: 100%/40  | Total:  6h 10m | Avg:  9m 15s | Max: 33m 46s | Hits: 682%/10028 
    🟩 jobs
      🟩 Build              Pass: 100%/41  | Total:  6h 17m | Avg:  9m 11s | Max: 33m 46s | Hits: 682%/10028 
      🟩 NVRTC              Pass: 100%/4   | Total:  1h 05m | Avg: 16m 29s | Max: 18m 24s
      🟩 Test               Pass: 100%/2   | Total: 18m 38s | Avg:  9m 19s | Max:  9m 26s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 53s | Avg:  1m 53s | Max:  1m 53s
    🟩 sm
      🟩 75                 Pass: 100%/4   | Total:  1h 05m | Avg: 16m 29s | Max: 18m 24s
      🟩 90                 Pass: 100%/1   | Total: 13m 34s | Avg: 13m 34s | Max: 13m 34s
      🟩 90a                Pass: 100%/2   | Total: 16m 17s | Avg:  8m 08s | Max: 12m 25s
    🟩 std
      🟩 11                 Pass: 100%/6   | Total: 52m 24s | Avg:  8m 44s | Max: 20m 33s
      🟩 14                 Pass: 100%/5   | Total: 51m 16s | Avg: 10m 15s | Max: 25m 44s | Hits: 682%/2324  
      🟩 17                 Pass: 100%/13  | Total:  2h 40m | Avg: 12m 20s | Max: 33m 46s | Hits: 682%/5038  
      🟩 20                 Pass: 100%/23  | Total:  3h 17m | Avg:  8m 35s | Max: 32m 16s | Hits: 681%/2666  
    
  • 🟩 cub: Pass: 100%/47 | Total: 1d 10h | Avg: 44m 15s | Max: 1h 13m | Hits: 278%/3132

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total:  1d 08h | Avg: 43m 38s | Max:  1h 13m | Hits: 278%/3132  
      🟩 arm64              Pass: 100%/2   | Total:  1h 55m | Avg: 57m 57s | Max: 59m 14s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  1h 07m | Avg:  9m 39s | Max: 40m 53s | Hits: 599%/783   
      🟩 12.5               Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 13m
      🟩 12.6               Pass: 100%/38  | Total:  1d 07h | Avg: 49m 16s | Max:  1h 08m | Hits: 171%/2349  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 59m | Avg: 59m 58s | Max:  1h 02m
      🟩 nvcc11.1           Pass: 100%/7   | Total:  1h 07m | Avg:  9m 39s | Max: 40m 53s | Hits: 599%/783   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 13m
      🟩 nvcc12.6           Pass: 100%/36  | Total:  1d 05h | Avg: 48m 40s | Max:  1h 08m | Hits: 171%/2349  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 59m | Avg: 59m 58s | Max:  1h 02m
      🟩 nvcc               Pass: 100%/45  | Total:  1d 08h | Avg: 43m 33s | Max:  1h 13m | Hits: 278%/3132  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 58m | Avg: 29m 41s | Max: 56m 09s
      🟩 Clang10            Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
      🟩 Clang11            Pass: 100%/1   | Total: 53m 10s | Avg: 53m 10s | Max: 53m 10s
      🟩 Clang12            Pass: 100%/1   | Total: 53m 06s | Avg: 53m 06s | Max: 53m 06s
      🟩 Clang13            Pass: 100%/1   | Total: 56m 31s | Avg: 56m 31s | Max: 56m 31s
      🟩 Clang14            Pass: 100%/1   | Total: 59m 22s | Avg: 59m 22s | Max: 59m 22s
      🟩 Clang15            Pass: 100%/1   | Total: 55m 03s | Avg: 55m 03s | Max: 55m 03s
      🟩 Clang16            Pass: 100%/1   | Total: 54m 03s | Avg: 54m 03s | Max: 54m 03s
      🟩 Clang17            Pass: 100%/1   | Total: 54m 13s | Avg: 54m 13s | Max: 54m 13s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 38m | Avg: 48m 24s | Max:  1h 02m
      🟩 GCC6               Pass: 100%/2   | Total:  8m 35s | Avg:  4m 17s | Max:  4m 32s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 44m | Avg: 52m 11s | Max: 52m 13s
      🟩 GCC8               Pass: 100%/1   | Total: 55m 37s | Avg: 55m 37s | Max: 55m 37s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 09m | Avg: 23m 13s | Max:  1h 00m
      🟩 GCC10              Pass: 100%/1   | Total: 54m 58s | Avg: 54m 58s | Max: 54m 58s
      🟩 GCC11              Pass: 100%/1   | Total: 56m 45s | Avg: 56m 45s | Max: 56m 45s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 47m | Avg: 35m 49s | Max:  1h 00m
      🟩 GCC13              Pass: 100%/8   | Total:  4h 32m | Avg: 34m 03s | Max: 58m 23s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m
      🟩 MSVC14.16          Pass: 100%/1   | Total: 40m 53s | Avg: 40m 53s | Max: 40m 53s | Hits: 599%/783   
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m | Hits: 172%/783   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 08m | Hits: 171%/1566  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 13m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 15h 03m | Avg: 47m 32s | Max:  1h 02m
      🟩 GCC                Pass: 100%/21  | Total: 12h 09m | Avg: 34m 45s | Max:  1h 00m
      🟩 Intel              Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 04m | Avg:  1h 01m | Max:  1h 08m | Hits: 278%/3132  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 20m | Avg:  1h 10m | Max:  1h 13m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 46m 43s | Avg: 23m 21s | Max: 24m 01s
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 50m | Avg: 28m 46s | Max: 56m 54s
      🟩 v100               Pass: 100%/37  | Total:  1d 06h | Avg: 48m 43s | Max:  1h 13m | Hits: 278%/3132  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total:  1d 08h | Avg: 48m 29s | Max:  1h 13m | Hits: 278%/3132  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 17m 45s | Avg: 17m 45s | Max: 17m 45s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 23s | Avg: 14m 23s | Max: 14m 23s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 08m | Avg: 22m 40s | Max: 22m 44s
      🟩 TestGPU            Pass: 100%/2   | Total: 39m 45s | Avg: 19m 52s | Max: 20m 56s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 46m 43s | Avg: 23m 21s | Max: 24m 01s
      🟩 90a                Pass: 100%/1   | Total: 27m 39s | Avg: 27m 39s | Max: 27m 39s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  1h 58m | Avg: 23m 43s | Max: 53m 46s
      🟩 14                 Pass: 100%/4   | Total:  2h 33m | Avg: 38m 25s | Max: 56m 09s | Hits: 599%/783   
      🟩 17                 Pass: 100%/12  | Total: 10h 29m | Avg: 52m 28s | Max:  1h 07m | Hits: 172%/1566  
      🟩 20                 Pass: 100%/26  | Total: 19h 37m | Avg: 45m 18s | Max:  1h 13m | Hits: 170%/783   
    
  • 🟩 thrust: Pass: 100%/45 | Total: 22h 19m | Avg: 29m 46s | Max: 1h 11m | Hits: 226%/7408

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 40m 58s | Avg: 20m 29s | Max: 30m 30s
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 21h 16m | Avg: 29m 41s | Max:  1h 11m | Hits: 226%/7408  
      🟩 arm64              Pass: 100%/2   | Total:  1h 03m | Avg: 31m 36s | Max: 33m 43s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total: 50m 55s | Avg:  7m 16s | Max: 24m 11s | Hits: 368%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  1h 49m | Avg: 54m 49s | Max: 56m 42s
      🟩 12.6               Pass: 100%/36  | Total: 19h 39m | Avg: 32m 45s | Max:  1h 11m | Hits: 178%/5556  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 58m 42s | Avg: 29m 21s | Max: 29m 28s
      🟩 nvcc11.1           Pass: 100%/7   | Total: 50m 55s | Avg:  7m 16s | Max: 24m 11s | Hits: 368%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 49m | Avg: 54m 49s | Max: 56m 42s
      🟩 nvcc12.6           Pass: 100%/34  | Total: 18h 40m | Avg: 32m 57s | Max:  1h 11m | Hits: 178%/5556  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 58m 42s | Avg: 29m 21s | Max: 29m 28s
      🟩 nvcc               Pass: 100%/43  | Total: 21h 21m | Avg: 29m 47s | Max:  1h 11m | Hits: 226%/7408  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 11m | Avg: 17m 58s | Max: 33m 00s
      🟩 Clang10            Pass: 100%/1   | Total: 35m 13s | Avg: 35m 13s | Max: 35m 13s
      🟩 Clang11            Pass: 100%/1   | Total: 30m 38s | Avg: 30m 38s | Max: 30m 38s
      🟩 Clang12            Pass: 100%/1   | Total: 34m 20s | Avg: 34m 20s | Max: 34m 20s
      🟩 Clang13            Pass: 100%/1   | Total: 34m 06s | Avg: 34m 06s | Max: 34m 06s
      🟩 Clang14            Pass: 100%/1   | Total: 34m 33s | Avg: 34m 33s | Max: 34m 33s
      🟩 Clang15            Pass: 100%/1   | Total: 35m 29s | Avg: 35m 29s | Max: 35m 29s
      🟩 Clang16            Pass: 100%/1   | Total: 36m 33s | Avg: 36m 33s | Max: 36m 33s
      🟩 Clang17            Pass: 100%/1   | Total: 33m 17s | Avg: 33m 17s | Max: 33m 17s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 52m | Avg: 24m 36s | Max: 34m 33s
      🟩 GCC6               Pass: 100%/2   | Total:  8m 25s | Avg:  4m 12s | Max:  4m 27s
      🟩 GCC7               Pass: 100%/2   | Total: 59m 37s | Avg: 29m 48s | Max: 32m 21s
      🟩 GCC8               Pass: 100%/1   | Total: 35m 02s | Avg: 35m 02s | Max: 35m 02s
      🟩 GCC9               Pass: 100%/3   | Total: 45m 50s | Avg: 15m 16s | Max: 36m 54s
      🟩 GCC10              Pass: 100%/1   | Total: 37m 44s | Avg: 37m 44s | Max: 37m 44s
      🟩 GCC11              Pass: 100%/1   | Total: 36m 23s | Avg: 36m 23s | Max: 36m 23s
      🟩 GCC12              Pass: 100%/1   | Total: 39m 30s | Avg: 39m 30s | Max: 39m 30s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 11m | Avg: 23m 53s | Max: 39m 16s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 43m 33s | Avg: 43m 33s | Max: 43m 33s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 24m 11s | Avg: 24m 11s | Max: 24m 11s | Hits: 368%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m | Hits: 178%/1852  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 11m | Hits: 178%/3704  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 49m | Avg: 54m 49s | Max: 56m 42s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  8h 38m | Avg: 27m 16s | Max: 36m 33s
      🟩 GCC                Pass: 100%/19  | Total:  7h 33m | Avg: 23m 52s | Max: 39m 30s
      🟩 Intel              Pass: 100%/1   | Total: 43m 33s | Avg: 43m 33s | Max: 43m 33s
      🟩 MSVC               Pass: 100%/4   | Total:  3h 34m | Avg: 53m 39s | Max:  1h 11m | Hits: 226%/7408  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 49m | Avg: 54m 49s | Max: 56m 42s
    🟩 gpu
      🟩 rtx4090            Pass: 100%/8   | Total:  2h 28m | Avg: 18m 34s | Max: 37m 01s
      🟩 v100               Pass: 100%/37  | Total: 19h 51m | Avg: 32m 11s | Max:  1h 11m | Hits: 226%/7408  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total: 21h 33m | Avg: 32m 19s | Max:  1h 11m | Hits: 226%/7408  
      🟩 TestCPU            Pass: 100%/2   | Total: 15m 30s | Avg:  7m 45s | Max:  8m 13s
      🟩 TestGPU            Pass: 100%/3   | Total: 31m 01s | Avg: 10m 20s | Max: 10m 50s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 21m 03s | Avg: 21m 03s | Max: 21m 03s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  1h 09m | Avg: 13m 50s | Max: 29m 31s
      🟩 14                 Pass: 100%/4   | Total:  1h 33m | Avg: 23m 29s | Max: 33m 00s | Hits: 368%/1852  
      🟩 17                 Pass: 100%/12  | Total:  7h 17m | Avg: 36m 28s | Max:  1h 00m | Hits: 178%/3704  
      🟩 20                 Pass: 100%/22  | Total: 11h 37m | Avg: 31m 43s | Max:  1h 11m | Hits: 178%/1852  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 11m | Avg: 5m 02s | Max: 11m 54s | Hits: 574%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  1h 56m | Avg:  5m 18s | Max: 11m 54s | Hits: 574%/312   
      🟩 arm64              Pass: 100%/4   | Total: 14m 21s | Avg:  3m 35s | Max:  3m 39s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 16m 45s | Avg:  5m 35s | Max:  9m 18s | Hits: 574%/156   
      🟩 12.5               Pass: 100%/2   | Total: 11m 23s | Avg:  5m 41s | Max:  5m 48s
      🟩 12.6               Pass: 100%/21  | Total:  1h 42m | Avg:  4m 54s | Max: 11m 54s | Hits: 574%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 16m 45s | Avg:  5m 35s | Max:  9m 18s | Hits: 574%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 11m 23s | Avg:  5m 41s | Max:  5m 48s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 42m | Avg:  4m 54s | Max: 11m 54s | Hits: 574%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 11m | Avg:  5m 02s | Max: 11m 54s | Hits: 574%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 50s | Avg:  3m 50s | Max:  3m 50s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 39s | Avg:  4m 39s | Max:  4m 39s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 45s | Avg:  3m 45s | Max:  3m 45s
      🟩 Clang12            Pass: 100%/1   | Total:  3m 38s | Avg:  3m 38s | Max:  3m 38s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 14s | Avg:  4m 14s | Max:  4m 14s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 48s | Avg:  3m 48s | Max:  3m 48s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 07s | Avg:  4m 07s | Max:  4m 07s
      🟩 Clang16            Pass: 100%/1   | Total:  3m 50s | Avg:  3m 50s | Max:  3m 50s
      🟩 Clang17            Pass: 100%/1   | Total:  3m 42s | Avg:  3m 42s | Max:  3m 42s
      🟩 Clang18            Pass: 100%/4   | Total: 22m 22s | Avg:  5m 35s | Max: 11m 12s
      🟩 GCC9               Pass: 100%/1   | Total:  3m 37s | Avg:  3m 37s | Max:  3m 37s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 59s | Avg:  3m 59s | Max:  3m 59s
      🟩 GCC11              Pass: 100%/1   | Total:  4m 10s | Avg:  4m 10s | Max:  4m 10s
      🟩 GCC12              Pass: 100%/2   | Total: 16m 07s | Avg:  8m 03s | Max: 11m 54s
      🟩 GCC13              Pass: 100%/4   | Total: 13m 22s | Avg:  3m 20s | Max:  3m 39s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  9m 18s | Avg:  9m 18s | Max:  9m 18s | Hits: 574%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 16s | Avg: 11m 16s | Max: 11m 16s | Hits: 574%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 11m 23s | Avg:  5m 41s | Max:  5m 48s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total: 57m 55s | Avg:  4m 27s | Max: 11m 12s
      🟩 GCC                Pass: 100%/9   | Total: 41m 15s | Avg:  4m 35s | Max: 11m 54s
      🟩 MSVC               Pass: 100%/2   | Total: 20m 34s | Avg: 10m 17s | Max: 11m 16s | Hits: 574%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 11m 23s | Avg:  5m 41s | Max:  5m 48s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 31m 18s | Avg:  7m 49s | Max: 11m 54s
      🟩 v100               Pass: 100%/22  | Total:  1h 39m | Avg:  4m 32s | Max: 11m 16s | Hits: 574%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 48m | Avg:  4m 30s | Max: 11m 16s | Hits: 574%/312   
      🟩 Test               Pass: 100%/2   | Total: 23m 06s | Avg: 11m 33s | Max: 11m 54s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 05s | Avg:  3m 05s | Max:  3m 05s
      🟩 90a                Pass: 100%/1   | Total:  3m 07s | Avg:  3m 07s | Max:  3m 07s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 23m 30s | Avg:  3m 55s | Max:  5m 48s
      🟩 20                 Pass: 100%/20  | Total:  1h 47m | Avg:  5m 22s | Max: 11m 54s | Hits: 574%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 6m 47s | Avg: 3m 23s | Max: 4m 40s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  6m 47s | Avg:  3m 23s | Max:  4m 40s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  6m 47s | Avg:  3m 23s | Max:  4m 40s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  6m 47s | Avg:  3m 23s | Max:  4m 40s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  6m 47s | Avg:  3m 23s | Max:  4m 40s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  6m 47s | Avg:  3m 23s | Max:  4m 40s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  6m 47s | Avg:  3m 23s | Max:  4m 40s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  6m 47s | Avg:  3m 23s | Max:  4m 40s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 07s | Avg:  2m 07s | Max:  2m 07s
      🟩 Test               Pass: 100%/1   | Total:  4m 40s | Avg:  4m 40s | Max:  4m 40s
    
  • 🟩 python: Pass: 100%/1 | Total: 29m 47s | Avg: 29m 47s | Max: 29m 47s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 29m 47s | Avg: 29m 47s | Max: 29m 47s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 29m 47s | Avg: 29m 47s | Max: 29m 47s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 29m 47s | Avg: 29m 47s | Max: 29m 47s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 29m 47s | Avg: 29m 47s | Max: 29m 47s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 29m 47s | Avg: 29m 47s | Max: 29m 47s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 29m 47s | Avg: 29m 47s | Max: 29m 47s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 29m 47s | Avg: 29m 47s | Max: 29m 47s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 29m 47s | Avg: 29m 47s | Max: 29m 47s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 169)

# Runner
125 linux-amd64-cpu16
14 windows-amd64-cpu16
10 linux-amd64-gpu-rtx2080-latest-1
10 linux-arm64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
1 linux-amd64-gpu-h100-latest-1

github-actions bot commented Feb 2, 2025

🟩 CI finished in 1h 14m: Pass: 100%/169 | Total: 1d 17h | Avg: 14m 50s | Max: 1h 13m | Hits: 458%/20880
  • 🟩 libcudacxx: Pass: 100%/48 | Total: 9h 25m | Avg: 11m 47s | Max: 31m 26s | Hits: 671%/10028

    🟩 cpu
      🟩 amd64              Pass: 100%/46  | Total:  9h 18m | Avg: 12m 08s | Max: 31m 26s | Hits: 671%/10028 
      🟩 arm64              Pass: 100%/2   | Total:  7m 26s | Avg:  3m 43s | Max:  4m 02s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  2h 10m | Avg: 18m 35s | Max: 25m 28s | Hits: 650%/2324  
      🟩 12.5               Pass: 100%/2   | Total: 59m 36s | Avg: 29m 48s | Max: 31m 26s
      🟩 12.6               Pass: 100%/39  | Total:  6h 15m | Avg:  9m 38s | Max: 27m 51s | Hits: 677%/7704  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 09m | Avg: 17m 15s | Max: 22m 28s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  2h 10m | Avg: 18m 35s | Max: 25m 28s | Hits: 650%/2324  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 59m 36s | Avg: 29m 48s | Max: 31m 26s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  5h 06m | Avg:  8m 46s | Max: 27m 51s | Hits: 677%/7704  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 09m | Avg: 17m 15s | Max: 22m 28s
      🟩 nvcc               Pass: 100%/44  | Total:  8h 16m | Avg: 11m 17s | Max: 31m 26s | Hits: 671%/10028 
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 01m | Avg: 15m 25s | Max: 24m 17s
      🟩 Clang10            Pass: 100%/1   | Total: 21m 19s | Avg: 21m 19s | Max: 21m 19s
      🟩 Clang11            Pass: 100%/1   | Total: 20m 00s | Avg: 20m 00s | Max: 20m 00s
      🟩 Clang12            Pass: 100%/1   | Total:  4m 19s | Avg:  4m 19s | Max:  4m 19s
      🟩 Clang13            Pass: 100%/1   | Total:  4m 33s | Avg:  4m 33s | Max:  4m 33s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 15s | Avg:  4m 15s | Max:  4m 15s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 38s | Avg:  4m 38s | Max:  4m 38s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 54s | Avg:  4m 54s | Max:  4m 54s
      🟩 Clang17            Pass: 100%/1   | Total:  4m 53s | Avg:  4m 53s | Max:  4m 53s
      🟩 Clang18            Pass: 100%/8   | Total:  1h 30m | Avg: 11m 17s | Max: 22m 28s
      🟩 GCC6               Pass: 100%/2   | Total: 25m 54s | Avg: 12m 57s | Max: 22m 50s
      🟩 GCC7               Pass: 100%/2   | Total:  6m 53s | Avg:  3m 26s | Max:  3m 42s
      🟩 GCC8               Pass: 100%/1   | Total:  3m 59s | Avg:  3m 59s | Max:  3m 59s
      🟩 GCC9               Pass: 100%/3   | Total: 42m 13s | Avg: 14m 04s | Max: 23m 03s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 06s | Avg:  4m 06s | Max:  4m 06s
      🟩 GCC11              Pass: 100%/1   | Total:  4m 08s | Avg:  4m 08s | Max:  4m 08s
      🟩 GCC12              Pass: 100%/1   | Total:  4m 15s | Avg:  4m 15s | Max:  4m 15s
      🟩 GCC13              Pass: 100%/10  | Total:  1h 26m | Avg:  8m 37s | Max: 15m 52s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total:  5m 48s | Avg:  5m 48s | Max:  5m 48s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 25m 28s | Avg: 25m 28s | Max: 25m 28s | Hits: 650%/2324  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 23m 47s | Avg: 23m 47s | Max: 23m 47s | Hits: 678%/2519  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 52m 24s | Avg: 26m 12s | Max: 27m 51s | Hits: 677%/5185  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 59m 36s | Avg: 29m 48s | Max: 31m 26s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/20  | Total:  3h 40m | Avg: 11m 02s | Max: 24m 17s
      🟩 GCC                Pass: 100%/21  | Total:  2h 57m | Avg:  8m 27s | Max: 23m 03s
      🟩 Intel              Pass: 100%/1   | Total:  5m 48s | Avg:  5m 48s | Max:  5m 48s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 41m | Avg: 25m 24s | Max: 27m 51s | Hits: 671%/10028 
      🟩 NVHPC              Pass: 100%/2   | Total: 59m 36s | Avg: 29m 48s | Max: 31m 26s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/8   | Total:  1h 25m | Avg: 10m 44s | Max: 15m 52s
      🟩 v100               Pass: 100%/40  | Total:  7h 59m | Avg: 11m 59s | Max: 31m 26s | Hits: 671%/10028 
    🟩 jobs
      🟩 Build              Pass: 100%/41  | Total:  8h 06m | Avg: 11m 52s | Max: 31m 26s | Hits: 671%/10028 
      🟩 NVRTC              Pass: 100%/4   | Total: 59m 57s | Avg: 14m 59s | Max: 15m 52s
      🟩 Test               Pass: 100%/2   | Total: 17m 16s | Avg:  8m 38s | Max:  8m 53s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 56s | Avg:  1m 56s | Max:  1m 56s
    🟩 sm
      🟩 75                 Pass: 100%/4   | Total: 59m 57s | Avg: 14m 59s | Max: 15m 52s
      🟩 90                 Pass: 100%/1   | Total: 12m 30s | Avg: 12m 30s | Max: 12m 30s
      🟩 90a                Pass: 100%/2   | Total: 16m 22s | Avg:  8m 11s | Max: 12m 33s
    🟩 std
      🟩 11                 Pass: 100%/6   | Total:  1h 43m | Avg: 17m 16s | Max: 24m 17s
      🟩 14                 Pass: 100%/5   | Total: 52m 23s | Avg: 10m 28s | Max: 25m 28s | Hits: 650%/2324  
      🟩 17                 Pass: 100%/13  | Total:  3h 08m | Avg: 14m 31s | Max: 28m 10s | Hits: 678%/5038  
      🟩 20                 Pass: 100%/23  | Total:  3h 39m | Avg:  9m 31s | Max: 31m 26s | Hits: 676%/2666  
    
  • 🟩 cub: Pass: 100%/47 | Total: 17h 45m | Avg: 22m 40s | Max: 1h 13m | Hits: 282%/3132

    🟩 cpu
      🟩 amd64              Pass: 100%/45  | Total: 17h 34m | Avg: 23m 26s | Max:  1h 13m | Hits: 282%/3132  
      🟩 arm64              Pass: 100%/2   | Total: 10m 46s | Avg:  5m 23s | Max:  5m 36s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  1h 05m | Avg:  9m 25s | Max: 39m 25s | Hits: 599%/783   
      🟩 12.5               Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 13m
      🟩 12.6               Pass: 100%/38  | Total: 14h 14m | Avg: 22m 29s | Max:  1h 08m | Hits: 176%/2349  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 27s | Avg:  4m 43s | Max:  4m 52s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  1h 05m | Avg:  9m 25s | Max: 39m 25s | Hits: 599%/783   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 13m
      🟩 nvcc12.6           Pass: 100%/36  | Total: 14h 05m | Avg: 23m 28s | Max:  1h 08m | Hits: 176%/2349  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 27s | Avg:  4m 43s | Max:  4m 52s
      🟩 nvcc               Pass: 100%/45  | Total: 17h 36m | Avg: 23m 28s | Max:  1h 13m | Hits: 282%/3132  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  2h 00m | Avg: 30m 14s | Max: 56m 27s
      🟩 Clang10            Pass: 100%/1   | Total:  1h 00m | Avg:  1h 00m | Max:  1h 00m
      🟩 Clang11            Pass: 100%/1   | Total: 54m 40s | Avg: 54m 40s | Max: 54m 40s
      🟩 Clang12            Pass: 100%/1   | Total: 57m 49s | Avg: 57m 49s | Max: 57m 49s
      🟩 Clang13            Pass: 100%/1   | Total: 58m 38s | Avg: 58m 38s | Max: 58m 38s
      🟩 Clang14            Pass: 100%/1   | Total:  6m 11s | Avg:  6m 11s | Max:  6m 11s
      🟩 Clang15            Pass: 100%/1   | Total:  6m 06s | Avg:  6m 06s | Max:  6m 06s
      🟩 Clang16            Pass: 100%/1   | Total:  5m 50s | Avg:  5m 50s | Max:  5m 50s
      🟩 Clang17            Pass: 100%/1   | Total:  5m 41s | Avg:  5m 41s | Max:  5m 41s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 06m | Avg:  9m 31s | Max: 21m 00s
      🟩 GCC6               Pass: 100%/2   | Total:  8m 30s | Avg:  4m 15s | Max:  4m 23s
      🟩 GCC7               Pass: 100%/2   | Total: 10m 47s | Avg:  5m 23s | Max:  5m 36s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 35s | Avg:  5m 35s | Max:  5m 35s
      🟩 GCC9               Pass: 100%/3   | Total: 15m 16s | Avg:  5m 05s | Max:  6m 10s
      🟩 GCC10              Pass: 100%/1   | Total:  6m 21s | Avg:  6m 21s | Max:  6m 21s
      🟩 GCC11              Pass: 100%/1   | Total:  6m 00s | Avg:  6m 00s | Max:  6m 00s
      🟩 GCC12              Pass: 100%/3   | Total: 33m 33s | Avg: 11m 11s | Max: 22m 32s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 33m | Avg: 11m 38s | Max: 21m 37s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 56m 42s | Avg: 56m 42s | Max: 56m 42s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 39m 25s | Avg: 39m 25s | Max: 39m 25s | Hits: 599%/783   
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 06m | Avg:  1h 06m | Max:  1h 06m | Hits: 177%/783   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 08m | Hits: 176%/1566  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 13m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  7h 23m | Avg: 23m 20s | Max:  1h 00m
      🟩 GCC                Pass: 100%/21  | Total:  2h 59m | Avg:  8m 31s | Max: 22m 32s
      🟩 Intel              Pass: 100%/1   | Total: 56m 42s | Avg: 56m 42s | Max: 56m 42s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 01m | Avg:  1h 00m | Max:  1h 08m | Hits: 282%/3132  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 13m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 26m 59s | Avg: 13m 29s | Max: 22m 32s
      🟩 rtxa6000           Pass: 100%/8   | Total:  2h 02m | Avg: 15m 17s | Max: 21m 37s
      🟩 v100               Pass: 100%/37  | Total: 15h 16m | Avg: 24m 45s | Max:  1h 13m | Hits: 282%/3132  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total: 15h 33m | Avg: 23m 19s | Max:  1h 13m | Hits: 282%/3132  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 15m 20s | Avg: 15m 20s | Max: 15m 20s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 14s | Avg: 14m 14s | Max: 14m 14s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 05m | Avg: 21m 43s | Max: 22m 32s
      🟩 TestGPU            Pass: 100%/2   | Total: 37m 46s | Avg: 18m 53s | Max: 19m 01s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 26m 59s | Avg: 13m 29s | Max: 22m 32s
      🟩 90a                Pass: 100%/1   | Total:  4m 43s | Avg:  4m 43s | Max:  4m 43s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  1h 14m | Avg: 14m 52s | Max: 56m 27s
      🟩 14                 Pass: 100%/4   | Total:  1h 44m | Avg: 26m 10s | Max: 55m 32s | Hits: 599%/783   
      🟩 17                 Pass: 100%/12  | Total:  6h 02m | Avg: 30m 11s | Max:  1h 11m | Hits: 177%/1566  
      🟩 20                 Pass: 100%/26  | Total:  8h 44m | Avg: 20m 09s | Max:  1h 13m | Hits: 175%/783   
    
  • 🟩 thrust: Pass: 100%/45 | Total: 11h 54m | Avg: 15m 52s | Max: 57m 32s | Hits: 238%/7408

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 16m 20s | Avg:  8m 10s | Max: 10m 31s
    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total: 11h 44m | Avg: 16m 23s | Max: 57m 32s | Hits: 238%/7408  
      🟩 arm64              Pass: 100%/2   | Total:  9m 25s | Avg:  4m 42s | Max:  4m 54s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total: 51m 53s | Avg:  7m 24s | Max: 26m 04s | Hits: 368%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  1h 54m | Avg: 57m 10s | Max: 57m 32s
      🟩 12.6               Pass: 100%/36  | Total:  9h 07m | Avg: 15m 12s | Max: 57m 06s | Hits: 194%/5556  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 11s | Avg:  5m 05s | Max:  5m 14s
      🟩 nvcc11.1           Pass: 100%/7   | Total: 51m 53s | Avg:  7m 24s | Max: 26m 04s | Hits: 368%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 54m | Avg: 57m 10s | Max: 57m 32s
      🟩 nvcc12.6           Pass: 100%/34  | Total:  8h 57m | Avg: 15m 48s | Max: 57m 06s | Hits: 194%/5556  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 11s | Avg:  5m 05s | Max:  5m 14s
      🟩 nvcc               Pass: 100%/43  | Total: 11h 43m | Avg: 16m 22s | Max: 57m 32s | Hits: 238%/7408  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 05m | Avg: 16m 27s | Max: 32m 01s
      🟩 Clang10            Pass: 100%/1   | Total: 37m 08s | Avg: 37m 08s | Max: 37m 08s
      🟩 Clang11            Pass: 100%/1   | Total: 34m 00s | Avg: 34m 00s | Max: 34m 00s
      🟩 Clang12            Pass: 100%/1   | Total: 32m 39s | Avg: 32m 39s | Max: 32m 39s
      🟩 Clang13            Pass: 100%/1   | Total: 31m 53s | Avg: 31m 53s | Max: 31m 53s
      🟩 Clang14            Pass: 100%/1   | Total:  5m 01s | Avg:  5m 01s | Max:  5m 01s
      🟩 Clang15            Pass: 100%/1   | Total:  5m 45s | Avg:  5m 45s | Max:  5m 45s
      🟩 Clang16            Pass: 100%/1   | Total:  5m 09s | Avg:  5m 09s | Max:  5m 09s
      🟩 Clang17            Pass: 100%/1   | Total:  5m 47s | Avg:  5m 47s | Max:  5m 47s
      🟩 Clang18            Pass: 100%/7   | Total: 42m 45s | Avg:  6m 06s | Max: 10m 06s
      🟩 GCC6               Pass: 100%/2   | Total:  8m 14s | Avg:  4m 07s | Max:  4m 33s
      🟩 GCC7               Pass: 100%/2   | Total:  9m 46s | Avg:  4m 53s | Max:  5m 16s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 23s | Avg:  5m 23s | Max:  5m 23s
      🟩 GCC9               Pass: 100%/3   | Total: 14m 51s | Avg:  4m 57s | Max:  5m 46s
      🟩 GCC10              Pass: 100%/1   | Total:  5m 22s | Avg:  5m 22s | Max:  5m 22s
      🟩 GCC11              Pass: 100%/1   | Total:  5m 13s | Avg:  5m 13s | Max:  5m 13s
      🟩 GCC12              Pass: 100%/1   | Total:  5m 40s | Avg:  5m 40s | Max:  5m 40s
      🟩 GCC13              Pass: 100%/8   | Total: 56m 06s | Avg:  7m 00s | Max: 10m 53s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 41m 24s | Avg: 41m 24s | Max: 41m 24s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 26m 04s | Avg: 26m 04s | Max: 26m 04s | Hits: 368%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 45m 42s | Avg: 45m 42s | Max: 45m 42s | Hits: 194%/1852  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 49m | Avg: 54m 58s | Max: 57m 06s | Hits: 194%/3704  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 54m | Avg: 57m 10s | Max: 57m 32s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  4h 25m | Avg: 13m 59s | Max: 37m 08s
      🟩 GCC                Pass: 100%/19  | Total:  1h 50m | Avg:  5m 49s | Max: 10m 53s
      🟩 Intel              Pass: 100%/1   | Total: 41m 24s | Avg: 41m 24s | Max: 41m 24s
      🟩 MSVC               Pass: 100%/4   | Total:  3h 01m | Avg: 45m 25s | Max: 57m 06s | Hits: 238%/7408  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 54m | Avg: 57m 10s | Max: 57m 32s
    🟩 gpu
      🟩 rtx4090            Pass: 100%/8   | Total:  1h 03m | Avg:  7m 57s | Max: 10m 53s
      🟩 v100               Pass: 100%/37  | Total: 10h 50m | Avg: 17m 34s | Max: 57m 32s | Hits: 238%/7408  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total: 11h 07m | Avg: 16m 41s | Max: 57m 32s | Hits: 238%/7408  
      🟩 TestCPU            Pass: 100%/2   | Total: 14m 53s | Avg:  7m 26s | Max:  7m 43s
      🟩 TestGPU            Pass: 100%/3   | Total: 31m 30s | Avg: 10m 30s | Max: 10m 53s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  4m 19s | Avg:  4m 19s | Max:  4m 19s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total: 41m 38s | Avg:  8m 19s | Max: 25m 20s
      🟩 14                 Pass: 100%/4   | Total:  1h 07m | Avg: 16m 58s | Max: 32m 01s | Hits: 368%/1852  
      🟩 17                 Pass: 100%/12  | Total:  4h 31m | Avg: 22m 38s | Max: 57m 32s | Hits: 194%/3704  
      🟩 20                 Pass: 100%/22  | Total:  5h 16m | Avg: 14m 22s | Max: 57m 06s | Hits: 194%/1852  
    
  • 🟩 cudax: Pass: 100%/26 | Total: 2h 09m | Avg: 4m 59s | Max: 12m 45s | Hits: 575%/312

    🟩 cpu
      🟩 amd64              Pass: 100%/22  | Total:  1h 57m | Avg:  5m 20s | Max: 12m 45s | Hits: 575%/312   
      🟩 arm64              Pass: 100%/4   | Total: 12m 14s | Avg:  3m 03s | Max:  3m 11s
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 17m 31s | Avg:  5m 50s | Max:  9m 20s | Hits: 574%/156   
      🟩 12.5               Pass: 100%/2   | Total: 12m 37s | Avg:  6m 18s | Max:  6m 29s
      🟩 12.6               Pass: 100%/21  | Total:  1h 39m | Avg:  4m 45s | Max: 12m 45s | Hits: 576%/156   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 17m 31s | Avg:  5m 50s | Max:  9m 20s | Hits: 574%/156   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 12m 37s | Avg:  6m 18s | Max:  6m 29s
      🟩 nvcc12.6           Pass: 100%/21  | Total:  1h 39m | Avg:  4m 45s | Max: 12m 45s | Hits: 576%/156   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/26  | Total:  2h 09m | Avg:  4m 59s | Max: 12m 45s | Hits: 575%/312   
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  4m 27s | Avg:  4m 27s | Max:  4m 27s
      🟩 Clang10            Pass: 100%/1   | Total:  4m 28s | Avg:  4m 28s | Max:  4m 28s
      🟩 Clang11            Pass: 100%/1   | Total:  3m 47s | Avg:  3m 47s | Max:  3m 47s
      🟩 Clang12            Pass: 100%/1   | Total:  3m 46s | Avg:  3m 46s | Max:  3m 46s
      🟩 Clang13            Pass: 100%/1   | Total:  3m 44s | Avg:  3m 44s | Max:  3m 44s
      🟩 Clang14            Pass: 100%/1   | Total:  3m 43s | Avg:  3m 43s | Max:  3m 43s
      🟩 Clang15            Pass: 100%/1   | Total:  3m 53s | Avg:  3m 53s | Max:  3m 53s
      🟩 Clang16            Pass: 100%/1   | Total:  3m 47s | Avg:  3m 47s | Max:  3m 47s
      🟩 Clang17            Pass: 100%/1   | Total:  3m 42s | Avg:  3m 42s | Max:  3m 42s
      🟩 Clang18            Pass: 100%/4   | Total: 22m 05s | Avg:  5m 31s | Max: 12m 11s
      🟩 GCC9               Pass: 100%/1   | Total:  3m 44s | Avg:  3m 44s | Max:  3m 44s
      🟩 GCC10              Pass: 100%/1   | Total:  3m 26s | Avg:  3m 26s | Max:  3m 26s
      🟩 GCC11              Pass: 100%/1   | Total:  3m 21s | Avg:  3m 21s | Max:  3m 21s
      🟩 GCC12              Pass: 100%/2   | Total: 16m 24s | Avg:  8m 12s | Max: 12m 45s
      🟩 GCC13              Pass: 100%/4   | Total: 12m 15s | Avg:  3m 03s | Max:  3m 10s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  9m 20s | Avg:  9m 20s | Max:  9m 20s | Hits: 574%/156   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 26s | Avg: 11m 26s | Max: 11m 26s | Hits: 576%/156   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 12m 37s | Avg:  6m 18s | Max:  6m 29s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/13  | Total: 57m 22s | Avg:  4m 24s | Max: 12m 11s
      🟩 GCC                Pass: 100%/9   | Total: 39m 10s | Avg:  4m 21s | Max: 12m 45s
      🟩 MSVC               Pass: 100%/2   | Total: 20m 46s | Avg: 10m 23s | Max: 11m 26s | Hits: 575%/312   
      🟩 NVHPC              Pass: 100%/2   | Total: 12m 37s | Avg:  6m 18s | Max:  6m 29s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 32m 15s | Avg:  8m 03s | Max: 12m 45s
      🟩 v100               Pass: 100%/22  | Total:  1h 37m | Avg:  4m 26s | Max: 11m 26s | Hits: 575%/312   
    🟩 jobs
      🟩 Build              Pass: 100%/24  | Total:  1h 44m | Avg:  4m 22s | Max: 11m 26s | Hits: 575%/312   
      🟩 Test               Pass: 100%/2   | Total: 24m 56s | Avg: 12m 28s | Max: 12m 45s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 05s | Avg:  3m 05s | Max:  3m 05s
      🟩 90a                Pass: 100%/1   | Total:  3m 10s | Avg:  3m 10s | Max:  3m 10s
    🟩 std
      🟩 17                 Pass: 100%/6   | Total: 23m 20s | Avg:  3m 53s | Max:  6m 08s
      🟩 20                 Pass: 100%/20  | Total:  1h 46m | Avg:  5m 19s | Max: 12m 45s | Hits: 575%/312   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 6m 42s | Avg: 3m 21s | Max: 4m 39s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  6m 42s | Avg:  3m 21s | Max:  4m 39s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  6m 42s | Avg:  3m 21s | Max:  4m 39s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  6m 42s | Avg:  3m 21s | Max:  4m 39s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  6m 42s | Avg:  3m 21s | Max:  4m 39s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  6m 42s | Avg:  3m 21s | Max:  4m 39s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  6m 42s | Avg:  3m 21s | Max:  4m 39s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  6m 42s | Avg:  3m 21s | Max:  4m 39s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 03s | Avg:  2m 03s | Max:  2m 03s
      🟩 Test               Pass: 100%/1   | Total:  4m 39s | Avg:  4m 39s | Max:  4m 39s
    
  • 🟩 python: Pass: 100%/1 | Total: 25m 31s | Avg: 25m 31s | Max: 25m 31s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 25m 31s | Avg: 25m 31s | Max: 25m 31s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 25m 31s | Avg: 25m 31s | Max: 25m 31s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 25m 31s | Avg: 25m 31s | Max: 25m 31s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 25m 31s | Avg: 25m 31s | Max: 25m 31s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 25m 31s | Avg: 25m 31s | Max: 25m 31s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 25m 31s | Avg: 25m 31s | Max: 25m 31s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 25m 31s | Avg: 25m 31s | Max: 25m 31s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 25m 31s | Avg: 25m 31s | Max: 25m 31s
    

👃 Inspect Changes

Modifications in project?

    Project
    CCCL Infrastructure
+/- libcu++
    CUB
    Thrust
    CUDA Experimental
    python
    CCCL C Parallel Library
    Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 169)

  #  Runner
125  linux-amd64-cpu16
 14  windows-amd64-cpu16
 10  linux-amd64-gpu-rtx2080-latest-1
 10  linux-arm64-cpu16
  6  linux-amd64-gpu-rtxa6000-latest-1
  3  linux-amd64-gpu-rtx4090-latest-1
  1  linux-amd64-gpu-h100-latest-1

bernhardmgruber merged commit 68cc4ec into NVIDIA:branch/2.8.x on Feb 2, 2025
185 checks passed
bernhardmgruber deleted the backport_ptx_fixes branch on February 3, 2025, 07:31