Adds support for large number of segments and large number of items to DeviceSegmentedRadixSort #3402

Open
elstehle wants to merge 10 commits into main from enh/large-seg-support-seg-radix-sort


elstehle
Collaborator

Description

Closes #3133
Closes #3245

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.


copy-pr-bot bot commented Jan 15, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@elstehle
Collaborator Author

/ok to test

@elstehle elstehle marked this pull request as ready for review January 16, 2025 05:56
@elstehle elstehle requested review from a team as code owners January 16, 2025 05:56
@elstehle elstehle mentioned this pull request Jan 16, 2025
25 tasks
@elstehle elstehle force-pushed the enh/large-seg-support-seg-radix-sort branch from 6236d63 to 102f93c Compare January 16, 2025 06:10
Contributor

🟨 CI finished in 1h 50m: Pass: 98%/78 | Total: 1d 06h | Avg: 23m 29s | Max: 1h 01m | Hits: 400%/12760
  • 🟨 cub: Pass: 97%/38 | Total: 23h 26m | Avg: 37m 01s | Max: 1h 01m | Hits: 525%/3540

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/36  | Total: 22h 03m | Avg: 36m 46s | Max:  1h 01m | Hits: 525%/3540  
      🟩 arm64              Pass: 100%/2   | Total:  1h 22m | Avg: 41m 25s | Max: 41m 32s
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  2h 53m | Avg: 34m 40s | Max: 37m 21s | Hits: 533%/885   
      🟩 12.5               Pass: 100%/2   | Total:  1h 18m | Avg: 39m 06s | Max: 40m 14s
      🔍 12.6               Pass:  96%/31  | Total: 19h 15m | Avg: 37m 15s | Max:  1h 01m | Hits: 522%/2655  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 43m | Avg: 51m 31s | Max: 52m 24s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 53m | Avg: 34m 40s | Max: 37m 21s | Hits: 533%/885   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 18m | Avg: 39m 06s | Max: 40m 14s
      🔍 nvcc12.6           Pass:  96%/29  | Total: 17h 32m | Avg: 36m 16s | Max:  1h 01m | Hits: 522%/2655  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 43m | Avg: 51m 31s | Max: 52m 24s
      🔍 nvcc               Pass:  97%/36  | Total: 21h 43m | Avg: 36m 12s | Max:  1h 01m | Hits: 525%/3540  
    🔍 cxx: GCC13 🔍
      🟩 Clang14            Pass: 100%/4   | Total:  2h 23m | Avg: 35m 51s | Max: 37m 44s
      🟩 Clang15            Pass: 100%/1   | Total: 34m 55s | Avg: 34m 55s | Max: 34m 55s
      🟩 Clang16            Pass: 100%/1   | Total: 34m 36s | Avg: 34m 36s | Max: 34m 36s
      🟩 Clang17            Pass: 100%/1   | Total: 35m 13s | Avg: 35m 13s | Max: 35m 13s
      🟩 Clang18            Pass: 100%/7   | Total:  4h 41m | Avg: 40m 11s | Max: 52m 24s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 11m | Avg: 35m 43s | Max: 37m 21s
      🟩 GCC8               Pass: 100%/1   | Total: 34m 30s | Avg: 34m 30s | Max: 34m 30s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 09m | Avg: 34m 51s | Max: 35m 02s
      🟩 GCC10              Pass: 100%/1   | Total: 36m 30s | Avg: 36m 30s | Max: 36m 30s
      🟩 GCC11              Pass: 100%/1   | Total: 34m 23s | Avg: 34m 23s | Max: 34m 23s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 18m | Avg: 26m 00s | Max: 39m 50s
      🔍 GCC13              Pass:  87%/8   | Total:  4h 29m | Avg: 33m 41s | Max: 51m 09s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 31m | Avg: 45m 54s | Max:  1h 01m | Hits: 527%/1770  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 53m | Avg: 56m 33s | Max: 58m 12s | Hits: 522%/1770  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 18m | Avg: 39m 06s | Max: 40m 14s
    🔍 cxx_family: GCC 🔍
      🟩 Clang              Pass: 100%/14  | Total:  8h 49m | Avg: 37m 49s | Max: 52m 24s
      🔍 GCC                Pass:  94%/18  | Total:  9h 54m | Avg: 33m 00s | Max: 51m 09s
      🟩 MSVC               Pass: 100%/4   | Total:  3h 24m | Avg: 51m 13s | Max:  1h 01m | Hits: 525%/3540  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 18m | Avg: 39m 06s | Max: 40m 14s
    🔍 gpu: v100 🔍
      🟩 h100               Pass: 100%/2   | Total: 38m 12s | Avg: 19m 06s | Max: 31m 12s
      🔍 v100               Pass:  97%/36  | Total: 22h 48m | Avg: 38m 00s | Max:  1h 01m | Hits: 525%/3540  
    🔍 jobs: TestGPU 🔍
      🟩 Build              Pass: 100%/31  | Total: 19h 25m | Avg: 37m 35s | Max:  1h 01m | Hits: 525%/3540  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 51m 09s | Avg: 51m 09s | Max: 51m 09s
      🟩 GraphCapture       Pass: 100%/1   | Total: 33m 48s | Avg: 33m 48s | Max: 33m 48s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 45m | Avg: 35m 03s | Max: 37m 29s
      🔍 TestGPU            Pass:  50%/2   | Total: 51m 32s | Avg: 25m 46s | Max: 26m 07s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/14  | Total:  9h 15m | Avg: 39m 40s | Max:  1h 01m | Hits: 526%/2655  
      🔍 20                 Pass:  95%/24  | Total: 14h 11m | Avg: 35m 28s | Max: 58m 12s | Hits: 522%/885   
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 38m 12s | Avg: 19m 06s | Max: 31m 12s
      🟩 90a                Pass: 100%/1   | Total:  7m 11s | Avg:  7m 11s | Max:  7m 11s
    
  • 🟩 thrust: Pass: 100%/37 | Total: 6h 28m | Avg: 10m 30s | Max: 38m 11s | Hits: 352%/9220

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 19m 09s | Avg:  9m 34s | Max: 12m 55s
    🟩 cpu
      🟩 amd64              Pass: 100%/35  | Total:  6h 19m | Avg: 10m 50s | Max: 38m 11s | Hits: 352%/9220  
      🟩 arm64              Pass: 100%/2   | Total:  9m 33s | Avg:  4m 46s | Max:  4m 55s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 51m 21s | Avg: 10m 16s | Max: 31m 15s | Hits: 353%/1844  
      🟩 12.5               Pass: 100%/2   | Total: 28m 45s | Avg: 14m 22s | Max: 14m 24s
      🟩 12.6               Pass: 100%/30  | Total:  5h 08m | Avg: 10m 17s | Max: 38m 11s | Hits: 352%/7376  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 51s | Avg:  4m 55s | Max:  4m 59s
      🟩 nvcc12.0           Pass: 100%/5   | Total: 51m 21s | Avg: 10m 16s | Max: 31m 15s | Hits: 353%/1844  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 28m 45s | Avg: 14m 22s | Max: 14m 24s
      🟩 nvcc12.6           Pass: 100%/28  | Total:  4h 58m | Avg: 10m 40s | Max: 38m 11s | Hits: 352%/7376  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 51s | Avg:  4m 55s | Max:  4m 59s
      🟩 nvcc               Pass: 100%/35  | Total:  6h 18m | Avg: 10m 49s | Max: 38m 11s | Hits: 352%/9220  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 21m 31s | Avg:  5m 22s | Max:  5m 36s
      🟩 Clang15            Pass: 100%/1   | Total:  5m 25s | Avg:  5m 25s | Max:  5m 25s
      🟩 Clang16            Pass: 100%/1   | Total:  5m 22s | Avg:  5m 22s | Max:  5m 22s
      🟩 Clang17            Pass: 100%/1   | Total:  5m 17s | Avg:  5m 17s | Max:  5m 17s
      🟩 Clang18            Pass: 100%/7   | Total: 46m 17s | Avg:  6m 36s | Max: 12m 23s
      🟩 GCC7               Pass: 100%/2   | Total: 10m 28s | Avg:  5m 14s | Max:  5m 49s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 03s | Avg:  5m 03s | Max:  5m 03s
      🟩 GCC9               Pass: 100%/2   | Total: 10m 14s | Avg:  5m 07s | Max:  5m 25s
      🟩 GCC10              Pass: 100%/1   | Total:  5m 21s | Avg:  5m 21s | Max:  5m 21s
      🟩 GCC11              Pass: 100%/1   | Total:  5m 57s | Avg:  5m 57s | Max:  5m 57s
      🟩 GCC12              Pass: 100%/1   | Total:  6m 18s | Avg:  6m 18s | Max:  6m 18s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 01m | Avg:  7m 40s | Max: 13m 06s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 00m | Avg: 30m 26s | Max: 31m 15s | Hits: 352%/3688  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  1h 50m | Avg: 36m 48s | Max: 38m 11s | Hits: 352%/5532  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 28m 45s | Avg: 14m 22s | Max: 14m 24s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total:  1h 23m | Avg:  5m 59s | Max: 12m 23s
      🟩 GCC                Pass: 100%/16  | Total:  1h 44m | Avg:  6m 33s | Max: 13m 06s
      🟩 MSVC               Pass: 100%/5   | Total:  2h 51m | Avg: 34m 15s | Max: 38m 11s | Hits: 352%/9220  
      🟩 NVHPC              Pass: 100%/2   | Total: 28m 45s | Avg: 14m 22s | Max: 14m 24s
    🟩 gpu
      🟩 v100               Pass: 100%/37  | Total:  6h 28m | Avg: 10m 30s | Max: 38m 11s | Hits: 352%/9220  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total:  4h 57m | Avg:  9m 35s | Max: 38m 11s | Hits: 349%/7376  
      🟩 TestCPU            Pass: 100%/3   | Total: 53m 13s | Avg: 17m 44s | Max: 37m 08s | Hits: 365%/1844  
      🟩 TestGPU            Pass: 100%/3   | Total: 38m 24s | Avg: 12m 48s | Max: 13m 06s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  4m 33s | Avg:  4m 33s | Max:  4m 33s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total:  2h 43m | Avg: 11m 39s | Max: 35m 07s | Hits: 352%/5532  
      🟩 20                 Pass: 100%/21  | Total:  3h 26m | Avg:  9m 49s | Max: 38m 11s | Hits: 353%/3688  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 18s | Avg: 4m 39s | Max: 7m 26s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 26s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  1m 52s | Avg:  1m 52s | Max:  1m 52s
      🟩 Test               Pass: 100%/1   | Total:  7m 26s | Avg:  7m 26s | Max:  7m 26s
    
  • 🟩 python: Pass: 100%/1 | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 27m 17s | Avg: 27m 17s | Max: 27m 17s
    

👃 Inspect Changes

Modifications in project?

      Project
      CCCL Infrastructure
      libcu++
  +/- CUB
      Thrust
      CUDA Experimental
      python
      CCCL C Parallel Library
      Catch2Helper

Modifications in project or dependencies?

      Project
      CCCL Infrastructure
      libcu++
  +/- CUB
  +/- Thrust
      CUDA Experimental
  +/- python
  +/- CCCL C Parallel Library
  +/- Catch2Helper

🏃‍ Runner counts (total jobs: 78)

# Runner
53 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@elstehle elstehle force-pushed the enh/large-seg-support-seg-radix-sort branch from 77790f5 to 1785fb4 Compare January 23, 2025 11:25
Contributor

🟩 CI finished in 3h 04m: Pass: 100%/78 | Total: 1d 15h | Avg: 30m 39s | Max: 1h 06m | Hits: 385%/12708
  • 🟩 cub: Pass: 100%/38 | Total: 1d 04h | Avg: 45m 13s | Max: 1h 06m | Hits: 488%/3528

    🟩 cpu
      🟩 amd64              Pass: 100%/36  | Total:  1d 02h | Avg: 44m 45s | Max:  1h 06m | Hits: 488%/3528  
      🟩 arm64              Pass: 100%/2   | Total:  1h 47m | Avg: 53m 31s | Max: 56m 02s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 07m | Avg: 49m 24s | Max:  1h 01m | Hits: 488%/882   
      🟩 12.5               Pass: 100%/2   | Total:  1h 45m | Avg: 52m 30s | Max: 54m 48s
      🟩 12.6               Pass: 100%/31  | Total: 22h 46m | Avg: 44m 04s | Max:  1h 06m | Hits: 488%/2646  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 51m | Avg: 55m 48s | Max: 56m 25s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 07m | Avg: 49m 24s | Max:  1h 01m | Hits: 488%/882   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 45m | Avg: 52m 30s | Max: 54m 48s
      🟩 nvcc12.6           Pass: 100%/29  | Total: 20h 54m | Avg: 43m 16s | Max:  1h 06m | Hits: 488%/2646  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 51m | Avg: 55m 48s | Max: 56m 25s
      🟩 nvcc               Pass: 100%/36  | Total:  1d 02h | Avg: 44m 38s | Max:  1h 06m | Hits: 488%/3528  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 04m | Avg: 46m 06s | Max: 48m 34s
      🟩 Clang15            Pass: 100%/1   | Total: 43m 02s | Avg: 43m 02s | Max: 43m 02s
      🟩 Clang16            Pass: 100%/1   | Total: 43m 59s | Avg: 43m 59s | Max: 43m 59s
      🟩 Clang17            Pass: 100%/1   | Total: 43m 49s | Avg: 43m 49s | Max: 43m 49s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 21m | Avg: 45m 53s | Max: 56m 25s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 32m | Avg: 46m 10s | Max: 47m 53s
      🟩 GCC8               Pass: 100%/1   | Total: 43m 11s | Avg: 43m 11s | Max: 43m 11s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 32m | Avg: 46m 07s | Max: 48m 39s
      🟩 GCC10              Pass: 100%/1   | Total: 43m 09s | Avg: 43m 09s | Max: 43m 09s
      🟩 GCC11              Pass: 100%/1   | Total: 43m 50s | Avg: 43m 50s | Max: 43m 50s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 38m | Avg: 32m 53s | Max: 48m 07s
      🟩 GCC13              Pass: 100%/8   | Total:  5h 10m | Avg: 38m 45s | Max: 55m 35s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 01m | Hits: 488%/1764  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 06m | Hits: 488%/1764  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 45m | Avg: 52m 30s | Max: 54m 48s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total: 10h 36m | Avg: 45m 28s | Max: 56m 25s
      🟩 GCC                Pass: 100%/18  | Total: 12h 03m | Avg: 40m 11s | Max: 55m 35s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 13m | Avg:  1h 03m | Max:  1h 06m | Hits: 488%/3528  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 45m | Avg: 52m 30s | Max: 54m 48s
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 50m 32s | Avg: 25m 16s | Max: 31m 30s
      🟩 v100               Pass: 100%/36  | Total:  1d 03h | Avg: 46m 20s | Max:  1h 06m | Hits: 488%/3528  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total:  1d 00h | Avg: 47m 37s | Max:  1h 06m | Hits: 488%/3528  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 55m 35s | Avg: 55m 35s | Max: 55m 35s
      🟩 GraphCapture       Pass: 100%/1   | Total: 33m 25s | Avg: 33m 25s | Max: 33m 25s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 38m | Avg: 32m 53s | Max: 36m 40s
      🟩 TestGPU            Pass: 100%/2   | Total: 54m 22s | Avg: 27m 11s | Max: 27m 40s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 50m 32s | Avg: 25m 16s | Max: 31m 30s
      🟩 90a                Pass: 100%/1   | Total: 18m 23s | Avg: 18m 23s | Max: 18m 23s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total: 11h 53m | Avg: 50m 59s | Max:  1h 05m | Hits: 488%/2646  
      🟩 20                 Pass: 100%/24  | Total: 16h 44m | Avg: 41m 51s | Max:  1h 06m | Hits: 488%/882   
    
  • 🟩 thrust: Pass: 100%/37 | Total: 10h 07m | Avg: 16m 25s | Max: 39m 03s | Hits: 345%/9180

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 23m 51s | Avg: 11m 55s | Max: 12m 32s
    🟩 cpu
      🟩 amd64              Pass: 100%/35  | Total:  9h 43m | Avg: 16m 40s | Max: 39m 03s | Hits: 345%/9180  
      🟩 arm64              Pass: 100%/2   | Total: 23m 51s | Avg: 11m 55s | Max: 12m 19s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 25m | Avg: 17m 02s | Max: 34m 34s | Hits: 340%/1836  
      🟩 12.5               Pass: 100%/2   | Total: 55m 49s | Avg: 27m 54s | Max: 27m 59s
      🟩 12.6               Pass: 100%/30  | Total:  7h 46m | Avg: 15m 33s | Max: 39m 03s | Hits: 346%/7344  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 24m 11s | Avg: 12m 05s | Max: 12m 43s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 25m | Avg: 17m 02s | Max: 34m 34s | Hits: 340%/1836  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 55m 49s | Avg: 27m 54s | Max: 27m 59s
      🟩 nvcc12.6           Pass: 100%/28  | Total:  7h 22m | Avg: 15m 47s | Max: 39m 03s | Hits: 346%/7344  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 24m 11s | Avg: 12m 05s | Max: 12m 43s
      🟩 nvcc               Pass: 100%/35  | Total:  9h 43m | Avg: 16m 40s | Max: 39m 03s | Hits: 345%/9180  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 50m 48s | Avg: 12m 42s | Max: 13m 25s
      🟩 Clang15            Pass: 100%/1   | Total: 13m 06s | Avg: 13m 06s | Max: 13m 06s
      🟩 Clang16            Pass: 100%/1   | Total: 14m 09s | Avg: 14m 09s | Max: 14m 09s
      🟩 Clang17            Pass: 100%/1   | Total: 13m 32s | Avg: 13m 32s | Max: 13m 32s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 26m | Avg: 12m 23s | Max: 15m 14s
      🟩 GCC7               Pass: 100%/2   | Total: 26m 55s | Avg: 13m 27s | Max: 13m 42s
      🟩 GCC8               Pass: 100%/1   | Total: 13m 10s | Avg: 13m 10s | Max: 13m 10s
      🟩 GCC9               Pass: 100%/2   | Total: 27m 08s | Avg: 13m 34s | Max: 14m 10s
      🟩 GCC10              Pass: 100%/1   | Total: 14m 10s | Avg: 14m 10s | Max: 14m 10s
      🟩 GCC11              Pass: 100%/1   | Total: 13m 31s | Avg: 13m 31s | Max: 13m 31s
      🟩 GCC12              Pass: 100%/1   | Total: 13m 09s | Avg: 13m 09s | Max: 13m 09s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 30m | Avg: 11m 20s | Max: 13m 01s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 09m | Avg: 34m 48s | Max: 35m 02s | Hits: 340%/3672  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  1h 45m | Avg: 35m 00s | Max: 39m 03s | Hits: 348%/5508  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 55m 49s | Avg: 27m 54s | Max: 27m 59s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total:  2h 58m | Avg: 12m 44s | Max: 15m 14s
      🟩 GCC                Pass: 100%/16  | Total:  3h 18m | Avg: 12m 25s | Max: 14m 10s
      🟩 MSVC               Pass: 100%/5   | Total:  2h 54m | Avg: 34m 55s | Max: 39m 03s | Hits: 345%/9180  
      🟩 NVHPC              Pass: 100%/2   | Total: 55m 49s | Avg: 27m 54s | Max: 27m 59s
    🟩 gpu
      🟩 v100               Pass: 100%/37  | Total: 10h 07m | Avg: 16m 25s | Max: 39m 03s | Hits: 345%/9180  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total:  8h 40m | Avg: 16m 47s | Max: 39m 03s | Hits: 340%/7344  
      🟩 TestCPU            Pass: 100%/3   | Total: 46m 08s | Avg: 15m 22s | Max: 30m 31s | Hits: 365%/1836  
      🟩 TestGPU            Pass: 100%/3   | Total: 40m 47s | Avg: 13m 35s | Max: 15m 14s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  8m 13s | Avg:  8m 13s | Max:  8m 13s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total:  4h 23m | Avg: 18m 49s | Max: 35m 27s | Hits: 340%/5508  
      🟩 20                 Pass: 100%/21  | Total:  5h 20m | Avg: 15m 14s | Max: 39m 03s | Hits: 352%/3672  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 12m 24s | Avg: 6m 12s | Max: 10m 17s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 12m 24s | Avg:  6m 12s | Max: 10m 17s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 07s | Avg:  2m 07s | Max:  2m 07s
      🟩 Test               Pass: 100%/1   | Total: 10m 17s | Avg: 10m 17s | Max: 10m 17s
    
  • 🟩 python: Pass: 100%/1 | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 52m 58s | Avg: 52m 58s | Max: 52m 58s
    


@elstehle
Collaborator Author

pre-commit.ci autofix


copy-pr-bot bot commented Jan 27, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Labels
None yet
Projects
Status: In Review
1 participant