PTX: Update existing instructions #3584

Merged

bernhardmgruber
Contributor

@bernhardmgruber bernhardmgruber commented Jan 29, 2025

@bernhardmgruber bernhardmgruber force-pushed the ptx_update_existing_instr branch from 1cfadae to 2333ef1 Compare January 29, 2025 22:11
@bernhardmgruber
Contributor Author

/ok to test

Contributor

🟨 CI finished in 4h 38m: Pass: 98%/152 | Total: 3d 03h | Avg: 29m 41s | Max: 1h 14m | Hits: 416%/21531
  • 🟨 cub: Pass: 97%/44 | Total: 1d 15h | Avg: 53m 12s | Max: 1h 10m | Hits: 159%/3552

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/42  | Total:  1d 12h | Avg: 52m 43s | Max:  1h 10m | Hits: 159%/3552  
      🟩 arm64              Pass: 100%/2   | Total:  2h 06m | Avg:  1h 03m | Max:  1h 07m
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  4h 53m | Avg: 58m 39s | Max:  1h 02m | Hits: 159%/888   
      🟩 12.5               Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m
      🔍 12.6               Pass:  97%/37  | Total:  1d 07h | Avg: 51m 44s | Max:  1h 10m | Hits: 159%/2664  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 06m | Avg:  1h 03m | Max:  1h 06m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 53m | Avg: 58m 39s | Max:  1h 02m | Hits: 159%/888   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m
      🔍 nvcc12.6           Pass:  97%/35  | Total:  1d 05h | Avg: 51m 05s | Max:  1h 10m | Hits: 159%/2664  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 06m | Avg:  1h 03m | Max:  1h 06m
      🔍 nvcc               Pass:  97%/42  | Total:  1d 12h | Avg: 52m 43s | Max:  1h 10m | Hits: 159%/3552  
    🔍 cxx: GCC13 🔍
      🟩 Clang14            Pass: 100%/4   | Total:  3h 59m | Avg: 59m 53s | Max:  1h 02m
      🟩 Clang15            Pass: 100%/2   | Total:  1h 52m | Avg: 56m 25s | Max: 57m 01s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 56m | Avg: 58m 15s | Max: 59m 09s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 58m | Avg: 59m 19s | Max:  1h 02m
      🟩 Clang18            Pass: 100%/7   | Total:  6h 03m | Avg: 51m 57s | Max:  1h 07m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 55m | Avg: 57m 58s | Max: 59m 01s
      🟩 GCC8               Pass: 100%/1   | Total: 54m 55s | Avg: 54m 55s | Max: 54m 55s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 56m | Avg: 58m 20s | Max: 59m 17s
      🟩 GCC10              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 02m
      🟩 GCC11              Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 02m
      🟩 GCC12              Pass: 100%/4   | Total:  2h 45m | Avg: 41m 27s | Max: 59m 30s
      🔍 GCC13              Pass:  87%/8   | Total:  4h 50m | Avg: 36m 18s | Max: 59m 33s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 08m | Hits: 159%/1776  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 10m | Hits: 158%/1776  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/17  | Total: 15h 51m | Avg: 55m 57s | Max:  1h 07m
      🔍 GCC                Pass:  95%/21  | Total: 16h 27m | Avg: 47m 02s | Max:  1h 02m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 28m | Avg:  1h 07m | Max:  1h 10m | Hits: 159%/3552  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m
    🔍 gpu: v100 🔍
      🟩 h100               Pass: 100%/2   | Total: 48m 27s | Avg: 24m 13s | Max: 29m 06s
      🔍 v100               Pass:  97%/42  | Total:  1d 14h | Avg: 54m 35s | Max:  1h 10m | Hits: 159%/3552  
    🚨 jobs: GraphCapture 🚨
      🟩 Build              Pass: 100%/37  | Total:  1d 12h | Avg: 59m 11s | Max:  1h 10m | Hits: 159%/3552  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 22m 39s | Avg: 22m 39s | Max: 22m 39s
      🔥 GraphCapture       Pass:   0%/1   | Total: 17m 42s | Avg: 17m 42s | Max: 17m 42s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 06m | Avg: 22m 07s | Max: 24m 16s
      🟩 TestGPU            Pass: 100%/2   | Total: 44m 30s | Avg: 22m 15s | Max: 24m 49s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 20h 05m | Avg:  1h 00m | Max:  1h 08m | Hits: 159%/2664  
      🔍 20                 Pass:  95%/24  | Total: 18h 56m | Avg: 47m 20s | Max:  1h 10m | Hits: 158%/888   
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 48m 27s | Avg: 24m 13s | Max: 29m 06s
      🟩 90a                Pass: 100%/1   | Total: 25m 54s | Avg: 25m 54s | Max: 25m 54s
    
  • 🟨 thrust: Pass: 97%/42 | Total: 23h 27m | Avg: 33m 30s | Max: 1h 14m | Hits: 177%/7384

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/40  | Total: 22h 28m | Avg: 33m 43s | Max:  1h 14m | Hits: 177%/7384  
      🟩 arm64              Pass: 100%/2   | Total: 58m 38s | Avg: 29m 19s | Max: 30m 31s
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  3h 00m | Avg: 36m 02s | Max: 50m 34s | Hits: 177%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  1h 47m | Avg: 53m 51s | Max: 55m 36s
      🔍 12.6               Pass:  97%/35  | Total: 18h 39m | Avg: 31m 59s | Max:  1h 14m | Hits: 177%/5538  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 58m 09s | Avg: 29m 04s | Max: 29m 49s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 00m | Avg: 36m 02s | Max: 50m 34s | Hits: 177%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 47m | Avg: 53m 51s | Max: 55m 36s
      🔍 nvcc12.6           Pass:  96%/33  | Total: 17h 41m | Avg: 32m 09s | Max:  1h 14m | Hits: 177%/5538  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 58m 09s | Avg: 29m 04s | Max: 29m 49s
      🔍 nvcc               Pass:  97%/40  | Total: 22h 29m | Avg: 33m 44s | Max:  1h 14m | Hits: 177%/7384  
    🔍 cxx: GCC13 🔍
      🟩 Clang14            Pass: 100%/4   | Total:  2h 08m | Avg: 32m 13s | Max: 34m 21s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 02m | Avg: 31m 20s | Max: 32m 34s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 04m | Avg: 32m 10s | Max: 33m 00s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 14m | Avg: 37m 14s | Max: 39m 47s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 59m | Avg: 25m 38s | Max: 33m 39s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 02m | Avg: 31m 21s | Max: 32m 02s
      🟩 GCC8               Pass: 100%/1   | Total: 33m 33s | Avg: 33m 33s | Max: 33m 33s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 08m | Avg: 34m 20s | Max: 35m 09s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 09m | Avg: 34m 58s | Max: 36m 18s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 06m | Avg: 33m 07s | Max: 33m 10s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 07m | Avg: 33m 38s | Max: 34m 14s
      🔍 GCC13              Pass:  87%/8   | Total:  2h 55m | Avg: 21m 58s | Max: 36m 51s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 51m | Avg: 55m 31s | Max:  1h 00m | Hits: 177%/3692  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 14m | Avg:  1h 07m | Max:  1h 14m | Hits: 177%/3692  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 47m | Avg: 53m 51s | Max: 55m 36s
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  8h 29m | Avg: 29m 59s | Max: 39m 47s
      🔍 GCC                Pass:  94%/19  | Total:  9h 04m | Avg: 28m 38s | Max: 36m 51s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 05m | Avg:  1h 01m | Max:  1h 14m | Hits: 177%/7384  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 47m | Avg: 53m 51s | Max: 55m 36s
    🔍 jobs: TestGPU 🔍
      🟩 Build              Pass: 100%/37  | Total: 22h 32m | Avg: 36m 32s | Max:  1h 14m | Hits: 177%/7384  
      🟩 TestCPU            Pass: 100%/2   | Total: 15m 49s | Avg:  7m 54s | Max:  8m 09s
      🔍 TestGPU            Pass:  66%/3   | Total: 39m 23s | Avg: 13m 07s | Max: 20m 57s
    🟨 cmake_options
      🟨 -DTHRUST_DISPATCH_TYPE=Force32bit Pass:  50%/2   | Total: 35m 09s | Avg: 17m 34s | Max: 28m 33s
    🟨 gpu
      🟨 v100               Pass:  97%/42  | Total: 23h 27m | Avg: 33m 30s | Max:  1h 14m | Hits: 177%/7384  
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 19m 33s | Avg: 19m 33s | Max: 19m 33s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 12h 45m | Avg: 38m 17s | Max:  1h 14m | Hits: 177%/5538  
      🟩 20                 Pass: 100%/20  | Total: 10h 06m | Avg: 30m 19s | Max:  1h 00m | Hits: 177%/1846  
    
  • 🟩 libcudacxx: Pass: 100%/43 | Total: 9h 39m | Avg: 13m 27s | Max: 31m 23s | Hits: 683%/10073

    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total:  9h 15m | Avg: 13m 32s | Max: 31m 23s | Hits: 683%/10073 
      🟩 arm64              Pass: 100%/2   | Total: 23m 33s | Avg: 11m 46s | Max: 20m 10s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 06m | Avg: 13m 14s | Max: 21m 24s | Hits: 684%/2473  
      🟩 12.5               Pass: 100%/2   | Total: 42m 55s | Avg: 21m 27s | Max: 31m 23s
      🟩 12.6               Pass: 100%/36  | Total:  7h 49m | Avg: 13m 03s | Max: 31m 22s | Hits: 683%/7600  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 09m | Avg: 17m 16s | Max: 21m 45s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 06m | Avg: 13m 14s | Max: 21m 24s | Hits: 684%/2473  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 42m 55s | Avg: 21m 27s | Max: 31m 23s
      🟩 nvcc12.6           Pass: 100%/32  | Total:  6h 40m | Avg: 12m 31s | Max: 31m 22s | Hits: 683%/7600  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 09m | Avg: 17m 16s | Max: 21m 45s
      🟩 nvcc               Pass: 100%/39  | Total:  8h 29m | Avg: 13m 04s | Max: 31m 23s | Hits: 683%/10073 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 07m | Avg: 16m 59s | Max: 21m 05s
      🟩 Clang15            Pass: 100%/2   | Total: 26m 50s | Avg: 13m 25s | Max: 19m 32s
      🟩 Clang16            Pass: 100%/2   | Total: 22m 39s | Avg: 11m 19s | Max: 11m 46s
      🟩 Clang17            Pass: 100%/2   | Total: 31m 03s | Avg: 15m 31s | Max: 20m 54s
      🟩 Clang18            Pass: 100%/8   | Total:  2h 27m | Avg: 18m 24s | Max: 31m 22s
      🟩 GCC7               Pass: 100%/2   | Total: 10m 13s | Avg:  5m 06s | Max:  6m 40s
      🟩 GCC8               Pass: 100%/1   | Total:  3m 59s | Avg:  3m 59s | Max:  3m 59s
      🟩 GCC9               Pass: 100%/2   | Total: 10m 28s | Avg:  5m 14s | Max:  6m 54s
      🟩 GCC10              Pass: 100%/2   | Total:  7m 49s | Avg:  3m 54s | Max:  4m 05s
      🟩 GCC11              Pass: 100%/2   | Total: 10m 31s | Avg:  5m 15s | Max:  6m 19s
      🟩 GCC12              Pass: 100%/2   | Total:  8m 21s | Avg:  4m 10s | Max:  4m 15s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 26m | Avg: 10m 49s | Max: 26m 36s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 47m 56s | Avg: 23m 58s | Max: 26m 32s | Hits: 684%/4956  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 54m 33s | Avg: 27m 16s | Max: 29m 30s | Hits: 683%/5117  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 42m 55s | Avg: 21m 27s | Max: 31m 23s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/18  | Total:  4h 55m | Avg: 16m 25s | Max: 31m 22s
      🟩 GCC                Pass: 100%/19  | Total:  2h 17m | Avg:  7m 15s | Max: 26m 36s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 42m | Avg: 25m 37s | Max: 29m 30s | Hits: 683%/10073 
      🟩 NVHPC              Pass: 100%/2   | Total: 42m 55s | Avg: 21m 27s | Max: 31m 23s
    🟩 gpu
      🟩 v100               Pass: 100%/43  | Total:  9h 39m | Avg: 13m 27s | Max: 31m 23s | Hits: 683%/10073 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total:  7h 59m | Avg: 12m 37s | Max: 31m 23s | Hits: 683%/10073 
      🟩 NVRTC              Pass: 100%/2   | Total: 47m 04s | Avg: 23m 32s | Max: 26m 36s
      🟩 Test               Pass: 100%/2   | Total: 50m 23s | Avg: 25m 11s | Max: 31m 22s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 07s | Avg:  2m 07s | Max:  2m 07s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 14m 16s | Avg: 14m 16s | Max: 14m 16s
      🟩 90a                Pass: 100%/2   | Total: 19m 19s | Avg:  9m 39s | Max: 12m 15s
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  4h 47m | Avg: 13m 41s | Max: 31m 23s | Hits: 684%/7439  
      🟩 20                 Pass: 100%/21  | Total:  4h 49m | Avg: 13m 46s | Max: 31m 22s | Hits: 681%/2634  
    
  • 🟩 cudax: Pass: 100%/20 | Total: 1h 59m | Avg: 5m 58s | Max: 15m 48s | Hits: 383%/522

    🟩 cpu
      🟩 amd64              Pass: 100%/16  | Total:  1h 45m | Avg:  6m 36s | Max: 15m 48s | Hits: 383%/522   
      🟩 arm64              Pass: 100%/4   | Total: 13m 43s | Avg:  3m 25s | Max:  3m 33s
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 10m 07s | Avg: 10m 07s | Max: 10m 07s | Hits: 383%/261   
      🟩 12.5               Pass: 100%/2   | Total: 12m 26s | Avg:  6m 13s | Max:  6m 16s
      🟩 12.6               Pass: 100%/17  | Total:  1h 36m | Avg:  5m 41s | Max: 15m 48s | Hits: 383%/261   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 07s | Avg: 10m 07s | Max: 10m 07s | Hits: 383%/261   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 12m 26s | Avg:  6m 13s | Max:  6m 16s
      🟩 nvcc12.6           Pass: 100%/17  | Total:  1h 36m | Avg:  5m 41s | Max: 15m 48s | Hits: 383%/261   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/20  | Total:  1h 59m | Avg:  5m 58s | Max: 15m 48s | Hits: 383%/522   
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  4m 06s | Avg:  4m 06s | Max:  4m 06s
      🟩 Clang15            Pass: 100%/1   | Total:  4m 01s | Avg:  4m 01s | Max:  4m 01s
      🟩 Clang16            Pass: 100%/1   | Total:  4m 15s | Avg:  4m 15s | Max:  4m 15s
      🟩 Clang17            Pass: 100%/1   | Total:  3m 54s | Avg:  3m 54s | Max:  3m 54s
      🟩 Clang18            Pass: 100%/4   | Total: 26m 36s | Avg:  6m 39s | Max: 15m 48s
      🟩 GCC10              Pass: 100%/1   | Total:  4m 03s | Avg:  4m 03s | Max:  4m 03s
      🟩 GCC11              Pass: 100%/1   | Total:  4m 12s | Avg:  4m 12s | Max:  4m 12s
      🟩 GCC12              Pass: 100%/2   | Total: 19m 12s | Avg:  9m 36s | Max: 15m 16s
      🟩 GCC13              Pass: 100%/4   | Total: 13m 43s | Avg:  3m 25s | Max:  3m 33s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 10m 07s | Avg: 10m 07s | Max: 10m 07s | Hits: 383%/261   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 12m 50s | Avg: 12m 50s | Max: 12m 50s | Hits: 383%/261   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 12m 26s | Avg:  6m 13s | Max:  6m 16s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 42m 52s | Avg:  5m 21s | Max: 15m 48s
      🟩 GCC                Pass: 100%/8   | Total: 41m 10s | Avg:  5m 08s | Max: 15m 16s
      🟩 MSVC               Pass: 100%/2   | Total: 22m 57s | Avg: 11m 28s | Max: 12m 50s | Hits: 383%/522   
      🟩 NVHPC              Pass: 100%/2   | Total: 12m 26s | Avg:  6m 13s | Max:  6m 16s
    🟩 gpu
      🟩 v100               Pass: 100%/20  | Total:  1h 59m | Avg:  5m 58s | Max: 15m 48s | Hits: 383%/522   
    🟩 jobs
      🟩 Build              Pass: 100%/18  | Total:  1h 28m | Avg:  4m 54s | Max: 12m 50s | Hits: 383%/522   
      🟩 Test               Pass: 100%/2   | Total: 31m 04s | Avg: 15m 32s | Max: 15m 48s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 14s | Avg:  3m 14s | Max:  3m 14s
      🟩 90a                Pass: 100%/1   | Total:  3m 30s | Avg:  3m 30s | Max:  3m 30s
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 16m 12s | Avg:  4m 03s | Max:  6m 10s
      🟩 20                 Pass: 100%/16  | Total:  1h 43m | Avg:  6m 27s | Max: 15m 48s | Hits: 383%/522   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 10m 02s | Avg: 5m 01s | Max: 7m 45s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  7m 45s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  7m 45s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  7m 45s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  7m 45s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  7m 45s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  7m 45s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  7m 45s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 17s | Avg:  2m 17s | Max:  2m 17s
      🟩 Test               Pass: 100%/1   | Total:  7m 45s | Avg:  7m 45s | Max:  7m 45s
    
  • 🟩 python: Pass: 100%/1 | Total: 56m 05s | Avg: 56m 05s | Max: 56m 05s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 56m 05s | Avg: 56m 05s | Max: 56m 05s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 56m 05s | Avg: 56m 05s | Max: 56m 05s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 56m 05s | Avg: 56m 05s | Max: 56m 05s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 56m 05s | Avg: 56m 05s | Max: 56m 05s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 56m 05s | Avg: 56m 05s | Max: 56m 05s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 56m 05s | Avg: 56m 05s | Max: 56m 05s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 56m 05s | Avg: 56m 05s | Max: 56m 05s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 56m 05s | Avg: 56m 05s | Max: 56m 05s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 152)

# Runner
110 linux-amd64-cpu16
17 linux-amd64-gpu-v100-latest-1
14 windows-amd64-cpu16
10 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1

@bernhardmgruber bernhardmgruber force-pushed the ptx_update_existing_instr branch from 2333ef1 to c865298 Compare January 30, 2025 08:10
@bernhardmgruber bernhardmgruber marked this pull request as ready for review January 30, 2025 08:10
@bernhardmgruber bernhardmgruber requested review from a team as code owners January 30, 2025 08:10
@NVIDIA NVIDIA deleted a comment from copy-pr-bot bot Jan 30, 2025
Contributor

🟩 CI finished in 1h 32m: Pass: 100%/152 | Total: 3d 16h | Avg: 34m 45s | Max: 1h 17m | Hits: 250%/21579
  • 🟩 cub: Pass: 100%/44 | Total: 1d 15h | Avg: 54m 23s | Max: 1h 15m | Hits: 38%/3552

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 13h | Avg: 54m 04s | Max:  1h 15m | Hits:  38%/3552  
      🟩 arm64              Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 01m
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  5h 06m | Avg:  1h 01m | Max:  1h 03m | Hits:  38%/888   
      🟩 12.5               Pass: 100%/2   | Total:  2h 24m | Avg:  1h 12m | Max:  1h 15m
      🟩 12.6               Pass: 100%/37  | Total:  1d 08h | Avg: 52m 30s | Max:  1h 14m | Hits:  39%/2664  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 06m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  5h 06m | Avg:  1h 01m | Max:  1h 03m | Hits:  38%/888   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 24m | Avg:  1h 12m | Max:  1h 15m
      🟩 nvcc12.6           Pass: 100%/35  | Total:  1d 06h | Avg: 51m 48s | Max:  1h 14m | Hits:  39%/2664  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 06m
      🟩 nvcc               Pass: 100%/42  | Total:  1d 13h | Avg: 53m 54s | Max:  1h 15m | Hits:  38%/3552  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 52m | Avg: 58m 00s | Max:  1h 02m
      🟩 Clang15            Pass: 100%/2   | Total:  1h 59m | Avg: 59m 59s | Max:  1h 03m
      🟩 Clang16            Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 04m
      🟩 Clang17            Pass: 100%/2   | Total:  1h 56m | Avg: 58m 15s | Max: 58m 59s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 55m | Avg: 50m 49s | Max:  1h 06m
      🟩 GCC7               Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 02m
      🟩 GCC8               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 GCC9               Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 04m
      🟩 GCC10              Pass: 100%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 04m
      🟩 GCC11              Pass: 100%/2   | Total:  1h 56m | Avg: 58m 21s | Max:  1h 01m
      🟩 GCC12              Pass: 100%/4   | Total:  2h 56m | Avg: 44m 11s | Max:  1h 02m
      🟩 GCC13              Pass: 100%/8   | Total:  4h 48m | Avg: 36m 03s | Max:  1h 01m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 14m | Hits:  39%/1776  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 24m | Avg:  1h 12m | Max:  1h 14m | Hits:  38%/1776  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 24m | Avg:  1h 12m | Max:  1h 15m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 15h 52m | Avg: 56m 00s | Max:  1h 06m
      🟩 GCC                Pass: 100%/21  | Total: 16h 56m | Avg: 48m 25s | Max:  1h 04m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 40m | Avg:  1h 10m | Max:  1h 14m | Hits:  38%/3552  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 24m | Avg:  1h 12m | Max:  1h 15m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 55m 25s | Avg: 27m 42s | Max: 29m 22s
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 09m | Avg: 31m 13s | Max:  1h 04m
      🟩 v100               Pass: 100%/34  | Total:  1d 10h | Avg:  1h 01m | Max:  1h 15m | Hits:  38%/3552  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 13h | Avg:  1h 00m | Max:  1h 15m | Hits:  38%/3552  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 11s | Avg: 20m 11s | Max: 20m 11s
      🟩 GraphCapture       Pass: 100%/1   | Total: 15m 27s | Avg: 15m 27s | Max: 15m 27s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 21m | Avg: 27m 00s | Max: 29m 22s
      🟩 TestGPU            Pass: 100%/2   | Total: 40m 01s | Avg: 20m 00s | Max: 20m 42s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 55m 25s | Avg: 27m 42s | Max: 29m 22s
      🟩 90a                Pass: 100%/1   | Total: 29m 37s | Avg: 29m 37s | Max: 29m 37s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 20h 41m | Avg:  1h 02m | Max:  1h 15m | Hits:  39%/2664  
      🟩 20                 Pass: 100%/24  | Total: 19h 12m | Avg: 48m 00s | Max:  1h 14m | Hits:  38%/888   
    
  • 🟩 libcudacxx: Pass: 100%/43 | Total: 16h 35m | Avg: 23m 08s | Max: 52m 38s | Hits: 457%/10121

    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total: 15h 51m | Avg: 23m 12s | Max: 52m 38s | Hits: 457%/10121 
      🟩 arm64              Pass: 100%/2   | Total: 43m 11s | Avg: 21m 35s | Max: 21m 46s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 54m | Avg: 22m 49s | Max: 31m 24s | Hits: 380%/2485  
      🟩 12.5               Pass: 100%/2   | Total:  1h 06m | Avg: 33m 16s | Max: 35m 14s
      🟩 12.6               Pass: 100%/36  | Total: 13h 34m | Avg: 22m 37s | Max: 52m 38s | Hits: 482%/7636  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total:  1h 12m | Avg: 18m 12s | Max: 23m 08s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 54m | Avg: 22m 49s | Max: 31m 24s | Hits: 380%/2485  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 06m | Avg: 33m 16s | Max: 35m 14s
      🟩 nvcc12.6           Pass: 100%/32  | Total: 12h 21m | Avg: 23m 10s | Max: 52m 38s | Hits: 482%/7636  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total:  1h 12m | Avg: 18m 12s | Max: 23m 08s
      🟩 nvcc               Pass: 100%/39  | Total: 15h 22m | Avg: 23m 38s | Max: 52m 38s | Hits: 457%/10121 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 18m | Avg: 19m 42s | Max: 22m 10s
      🟩 Clang15            Pass: 100%/2   | Total: 38m 47s | Avg: 19m 23s | Max: 22m 59s
      🟩 Clang16            Pass: 100%/2   | Total: 46m 43s | Avg: 23m 21s | Max: 23m 22s
      🟩 Clang17            Pass: 100%/2   | Total: 45m 45s | Avg: 22m 52s | Max: 24m 52s
      🟩 Clang18            Pass: 100%/8   | Total:  3h 11m | Avg: 23m 57s | Max: 52m 38s
      🟩 GCC7               Pass: 100%/2   | Total: 42m 31s | Avg: 21m 15s | Max: 21m 16s
      🟩 GCC8               Pass: 100%/1   | Total: 21m 24s | Avg: 21m 24s | Max: 21m 24s
      🟩 GCC9               Pass: 100%/2   | Total: 41m 09s | Avg: 20m 34s | Max: 20m 37s
      🟩 GCC10              Pass: 100%/2   | Total: 39m 54s | Avg: 19m 57s | Max: 20m 51s
      🟩 GCC11              Pass: 100%/2   | Total: 45m 49s | Avg: 22m 54s | Max: 23m 21s
      🟩 GCC12              Pass: 100%/2   | Total: 41m 06s | Avg: 20m 33s | Max: 21m 34s
      🟩 GCC13              Pass: 100%/8   | Total:  2h 35m | Avg: 19m 29s | Max: 48m 38s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 04m | Avg: 32m 13s | Max: 33m 03s | Hits: 442%/4980  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 14m | Avg: 37m 12s | Max: 38m 52s | Hits: 471%/5141  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 06m | Avg: 33m 16s | Max: 35m 14s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/18  | Total:  6h 41m | Avg: 22m 19s | Max: 52m 38s
      🟩 GCC                Pass: 100%/19  | Total:  6h 27m | Avg: 20m 24s | Max: 48m 38s
      🟩 MSVC               Pass: 100%/4   | Total:  2h 18m | Avg: 34m 43s | Max: 38m 52s | Hits: 457%/10121 
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 06m | Avg: 33m 16s | Max: 35m 14s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/6   | Total:  2h 54m | Avg: 29m 06s | Max: 52m 38s
      🟩 v100               Pass: 100%/37  | Total: 13h 40m | Avg: 22m 10s | Max: 38m 52s | Hits: 457%/10121 
    🟩 jobs
      🟩 Build              Pass: 100%/38  | Total: 14h 18m | Avg: 22m 35s | Max: 38m 52s | Hits: 457%/10121 
      🟩 NVRTC              Pass: 100%/2   | Total: 33m 06s | Avg: 16m 33s | Max: 18m 26s
      🟩 Test               Pass: 100%/2   | Total:  1h 41m | Avg: 50m 38s | Max: 52m 38s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 59s | Avg:  1m 59s | Max:  1m 59s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 33m 06s | Avg: 16m 33s | Max: 18m 26s
      🟩 90                 Pass: 100%/1   | Total: 14m 28s | Avg: 14m 28s | Max: 14m 28s
      🟩 90a                Pass: 100%/2   | Total: 25m 49s | Avg: 12m 54s | Max: 14m 11s
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  8h 03m | Avg: 23m 01s | Max: 35m 33s | Hits: 442%/7475  
      🟩 20                 Pass: 100%/21  | Total:  8h 29m | Avg: 24m 16s | Max: 52m 38s | Hits: 499%/2646  
    
  • 🟩 thrust: Pass: 100%/42 | Total: 1d 02h | Avg: 37m 47s | Max: 1h 17m | Hits: 79%/7384

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 44m 39s | Avg: 22m 19s | Max: 33m 35s
    🟩 cpu
      🟩 amd64              Pass: 100%/40  | Total:  1d 01h | Avg: 37m 58s | Max:  1h 17m | Hits:  79%/7384  
      🟩 arm64              Pass: 100%/2   | Total:  1h 08m | Avg: 34m 04s | Max: 35m 27s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 36m | Avg: 43m 23s | Max:  1h 08m | Hits:  72%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 17m
      🟩 12.6               Pass: 100%/35  | Total: 20h 22m | Avg: 34m 55s | Max:  1h 14m | Hits:  81%/5538  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 00m | Avg: 30m 07s | Max: 31m 03s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 36m | Avg: 43m 23s | Max:  1h 08m | Hits:  72%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 17m
      🟩 nvcc12.6           Pass: 100%/33  | Total: 19h 22m | Avg: 35m 12s | Max:  1h 14m | Hits:  81%/5538  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 00m | Avg: 30m 07s | Max: 31m 03s
      🟩 nvcc               Pass: 100%/40  | Total:  1d 01h | Avg: 38m 10s | Max:  1h 17m | Hits:  79%/7384  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 28m | Avg: 37m 03s | Max: 38m 14s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 16m | Avg: 38m 07s | Max: 38m 08s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 15m | Avg: 37m 47s | Max: 39m 12s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 15m | Avg: 37m 39s | Max: 41m 32s
      🟩 Clang18            Pass: 100%/7   | Total:  3h 02m | Avg: 26m 03s | Max: 36m 33s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 12m | Avg: 36m 18s | Max: 37m 05s
      🟩 GCC8               Pass: 100%/1   | Total: 34m 39s | Avg: 34m 39s | Max: 34m 39s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 15m | Avg: 37m 30s | Max: 37m 39s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 16m | Avg: 38m 23s | Max: 40m 04s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 13m | Avg: 36m 44s | Max: 36m 47s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 16m | Avg: 38m 08s | Max: 38m 26s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 10m | Avg: 23m 50s | Max: 36m 21s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 08m | Hits:  85%/3692  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 26m | Avg:  1h 13m | Max:  1h 14m | Hits:  72%/3692  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 17m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  9h 17m | Avg: 32m 48s | Max: 41m 32s
      🟩 GCC                Pass: 100%/19  | Total:  9h 59m | Avg: 31m 33s | Max: 40m 04s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 42m | Avg:  1h 10m | Max:  1h 14m | Hits:  79%/7384  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 17m
    🟩 gpu
      🟩 rtx4090            Pass: 100%/8   | Total:  2h 29m | Avg: 18m 42s | Max: 34m 49s
      🟩 v100               Pass: 100%/34  | Total: 23h 57m | Avg: 42m 16s | Max:  1h 17m | Hits:  79%/7384  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 01h | Avg: 41m 35s | Max:  1h 17m | Hits:  79%/7384  
      🟩 TestCPU            Pass: 100%/2   | Total: 15m 44s | Avg:  7m 52s | Max:  8m 10s
      🟩 TestGPU            Pass: 100%/3   | Total: 32m 56s | Avg: 10m 58s | Max: 11m 23s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 22m 05s | Avg: 22m 05s | Max: 22m 05s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 14h 31m | Avg: 43m 33s | Max:  1h 17m | Hits:  81%/5538  
      🟩 20                 Pass: 100%/20  | Total: 11h 11m | Avg: 33m 34s | Max:  1h 14m | Hits:  72%/1846  
    
  • 🟩 cudax: Pass: 100%/20 | Total: 4h 34m | Avg: 13m 42s | Max: 17m 52s | Hits: 121%/522

    🟩 cpu
      🟩 amd64              Pass: 100%/16  | Total:  3h 40m | Avg: 13m 45s | Max: 17m 52s | Hits: 121%/522   
      🟩 arm64              Pass: 100%/4   | Total: 54m 06s | Avg: 13m 31s | Max: 14m 39s
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 10m 10s | Avg: 10m 10s | Max: 10m 10s | Hits: 112%/261   
      🟩 12.5               Pass: 100%/2   | Total: 18m 05s | Avg:  9m 02s | Max:  9m 05s
      🟩 12.6               Pass: 100%/17  | Total:  4h 05m | Avg: 14m 27s | Max: 17m 52s | Hits: 131%/261   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 10s | Avg: 10m 10s | Max: 10m 10s | Hits: 112%/261   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 18m 05s | Avg:  9m 02s | Max:  9m 05s
      🟩 nvcc12.6           Pass: 100%/17  | Total:  4h 05m | Avg: 14m 27s | Max: 17m 52s | Hits: 131%/261   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/20  | Total:  4h 34m | Avg: 13m 42s | Max: 17m 52s | Hits: 121%/522   
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total: 15m 14s | Avg: 15m 14s | Max: 15m 14s
      🟩 Clang15            Pass: 100%/1   | Total: 16m 52s | Avg: 16m 52s | Max: 16m 52s
      🟩 Clang16            Pass: 100%/1   | Total: 17m 52s | Avg: 17m 52s | Max: 17m 52s
      🟩 Clang17            Pass: 100%/1   | Total: 15m 57s | Avg: 15m 57s | Max: 15m 57s
      🟩 Clang18            Pass: 100%/4   | Total: 55m 23s | Avg: 13m 50s | Max: 16m 47s
      🟩 GCC10              Pass: 100%/1   | Total: 15m 50s | Avg: 15m 50s | Max: 15m 50s
      🟩 GCC11              Pass: 100%/1   | Total: 17m 13s | Avg: 17m 13s | Max: 17m 13s
      🟩 GCC12              Pass: 100%/2   | Total: 29m 09s | Avg: 14m 34s | Max: 16m 56s
      🟩 GCC13              Pass: 100%/4   | Total: 49m 52s | Avg: 12m 28s | Max: 14m 26s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 10m 10s | Avg: 10m 10s | Max: 10m 10s | Hits: 112%/261   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 12m 31s | Avg: 12m 31s | Max: 12m 31s | Hits: 131%/261   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 18m 05s | Avg:  9m 02s | Max:  9m 05s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total:  2h 01m | Avg: 15m 09s | Max: 17m 52s
      🟩 GCC                Pass: 100%/8   | Total:  1h 52m | Avg: 14m 00s | Max: 17m 13s
      🟩 MSVC               Pass: 100%/2   | Total: 22m 41s | Avg: 11m 20s | Max: 12m 31s | Hits: 121%/522   
      🟩 NVHPC              Pass: 100%/2   | Total: 18m 05s | Avg:  9m 02s | Max:  9m 05s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 57m 28s | Avg: 14m 22s | Max: 16m 56s
      🟩 v100               Pass: 100%/16  | Total:  3h 36m | Avg: 13m 32s | Max: 17m 52s | Hits: 121%/522   
    🟩 jobs
      🟩 Build              Pass: 100%/18  | Total:  4h 10m | Avg: 13m 54s | Max: 17m 52s | Hits: 121%/522   
      🟩 Test               Pass: 100%/2   | Total: 23m 45s | Avg: 11m 52s | Max: 12m 13s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 10m 23s | Avg: 10m 23s | Max: 10m 23s
      🟩 90a                Pass: 100%/1   | Total: 12m 27s | Avg: 12m 27s | Max: 12m 27s
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 44m 29s | Avg: 11m 07s | Max: 12m 36s
      🟩 20                 Pass: 100%/16  | Total:  3h 49m | Avg: 14m 21s | Max: 17m 52s | Hits: 121%/522   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 7m 18s | Avg: 3m 39s | Max: 5m 04s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  7m 18s | Avg:  3m 39s | Max:  5m 04s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  7m 18s | Avg:  3m 39s | Max:  5m 04s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  7m 18s | Avg:  3m 39s | Max:  5m 04s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  7m 18s | Avg:  3m 39s | Max:  5m 04s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  7m 18s | Avg:  3m 39s | Max:  5m 04s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  7m 18s | Avg:  3m 39s | Max:  5m 04s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  7m 18s | Avg:  3m 39s | Max:  5m 04s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 14s | Avg:  2m 14s | Max:  2m 14s
      🟩 Test               Pass: 100%/1   | Total:  5m 04s | Avg:  5m 04s | Max:  5m 04s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 15s | Avg: 26m 15s | Max: 26m 15s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 15s | Avg: 26m 15s | Max: 26m 15s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 26m 15s | Avg: 26m 15s | Max: 26m 15s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 26m 15s | Avg: 26m 15s | Max: 26m 15s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 15s | Avg: 26m 15s | Max: 26m 15s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 15s | Avg: 26m 15s | Max: 26m 15s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 15s | Avg: 26m 15s | Max: 26m 15s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 26m 15s | Avg: 26m 15s | Max: 26m 15s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 15s | Avg: 26m 15s | Max: 26m 15s
    

👃 Inspect Changes

Modifications in project?

    +/-  Project
         CCCL Infrastructure
    +/-  libcu++
         CUB
         Thrust
         CUDA Experimental
         python
         CCCL C Parallel Library
         Catch2Helper

Modifications in project or dependencies?

    +/-  Project
         CCCL Infrastructure
    +/-  libcu++
    +/-  CUB
    +/-  Thrust
    +/-  CUDA Experimental
    +/-  python
    +/-  CCCL C Parallel Library
    +/-  Catch2Helper

🏃‍ Runner counts (total jobs: 152)

# Runner
110 linux-amd64-cpu16
14 windows-amd64-cpu16
10 linux-arm64-cpu16
8 linux-amd64-gpu-rtx2080-latest-1
6 linux-amd64-gpu-rtxa6000-latest-1
3 linux-amd64-gpu-rtx4090-latest-1
1 linux-amd64-gpu-h100-latest-1

@bernhardmgruber bernhardmgruber merged commit 5ce5d28 into NVIDIA:main Jan 30, 2025
164 of 168 checks passed
@bernhardmgruber bernhardmgruber deleted the ptx_update_existing_instr branch January 30, 2025 09:52

Git push to origin failed for branch/2.8.x with exitcode 128

bernhardmgruber added a commit that referenced this pull request Jan 31, 2025
* mbarrier.expect_tx: Add missing source and test
It was already documented(!)

* cp.async.bulk.tensor: Add .{gather,scatter}4
* fence: Add .sync_restrict, .proxy.async.sync_restrict

Co-authored-by: Allard Hendriksen <[email protected]>