[STF] Executable CUDA graphs caching policies #3868

Merged
merged 19 commits into from
Feb 24, 2025

caugonnet
Contributor

Adds facilities to choose whether executable CUDA graphs should be cached, and to query what the cache behaviour was. This can help report inefficient use of CUDA graphs, or give users better control (users may be tools relying on CUDA graphs).
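
To make the two halves of that description concrete (a policy deciding whether to cache, plus a way to query what the cache did), here is a minimal, self-contained sketch. It is not the STF implementation: `exec_graph`, `cache_stat` and `exec_graph_cache` are hypothetical names, the handle is a plain integer standing in for an instantiated graph, and reading the policy's `size_t` argument as a node count is an assumption made only for this example.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an instantiated (executable) graph handle.
struct exec_graph
{
  std::size_t id;
};

// Outcome of the most recent lookup, so callers can query cache behaviour.
struct cache_stat
{
  bool hit    = false; // an existing executable graph was reused
  bool stored = false; // a new executable graph was kept in the cache
};

class exec_graph_cache
{
public:
  // policy(node_count) returns true when a newly instantiated graph should be kept.
  // The meaning of the argument is an assumption of this sketch.
  explicit exec_graph_cache(std::function<bool(std::size_t)> policy)
      : policy_(std::move(policy))
  {}

  exec_graph get(std::size_t key, std::size_t node_count)
  {
    last_ = cache_stat{};
    auto it = cache_.find(key);
    if (it != cache_.end())
    {
      last_.hit = true;
      return it->second;
    }

    // Stands in for the costly instantiation step.
    exec_graph g{next_id_++};
    ++instantiations_;

    if (policy_(node_count))
    {
      cache_.emplace(key, g);
      last_.stored = true;
    }
    return g;
  }

  cache_stat last_stat() const { return last_; }
  std::size_t instantiations() const { return instantiations_; }

private:
  std::function<bool(std::size_t)> policy_;
  std::unordered_map<std::size_t, exec_graph> cache_;
  cache_stat last_{};
  std::size_t next_id_        = 0;
  std::size_t instantiations_ = 0;
};
```

With an "always cache" policy, a second `get` with the same key reports a hit and the instantiation count stays at one; with a "never cache" policy, every call instantiates again, which is the kind of inefficiency a tool could report from these statistics.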

Description

closes

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@caugonnet caugonnet added the stf Sequential Task Flow programming model label Feb 20, 2025
@caugonnet caugonnet self-assigned this Feb 20, 2025

copy-pr-bot bot commented Feb 20, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@caugonnet caugonnet changed the title [STF] [STF] Executable CUDA graphs caching policies Feb 20, 2025
@@ -233,6 +233,16 @@ class graph_ctx : public backend_ctx<graph_ctx>
    return *_graph;
  }

  void graph_set_cache_policy(executable_graph_cache_policy<::std::function<bool(size_t)>> policy) override
Contributor Author

Would a plain bool be sufficient here instead of this?
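
For context on that trade-off, the sketch below shows how a `bool` maps onto a predicate of the shape `bool(size_t)`, and what the predicate can express that a `bool` cannot. The alias `cache_predicate` and the helpers `from_bool` and `at_least` are hypothetical and not part of this PR; treating the `size_t` argument as a graph size is likewise an assumption.

```cpp
#include <cstddef>
#include <functional>

// Same callable shape as the policy parameter in the hunk above.
using cache_predicate = std::function<bool(std::size_t)>;

// A plain bool is the special case of a predicate that ignores its argument.
inline cache_predicate from_bool(bool enabled)
{
  return [enabled](std::size_t) { return enabled; };
}

// A predicate can also depend on its argument, e.g. cache only when the
// value (assumed here to be a graph size) reaches a threshold.
inline cache_predicate at_least(std::size_t threshold)
{
  return [threshold](std::size_t n) { return n >= threshold; };
}
```

If only the on/off case is ever needed, a `bool` is simpler; the predicate form costs one indirection but leaves room for size-dependent decisions.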

@caugonnet
Contributor Author

/ok to test

Contributor

🟨 CI finished in 18m 47s: Pass: 18%/22 | Total: 1h 19m | Avg: 3m 35s | Max: 10m 54s | Hits: 73%/1234
  • 🟨 cudax: Pass: 18%/22 | Total: 1h 19m | Avg: 3m 35s | Max: 10m 54s | Hits: 73%/1234

    🔍 ctk: 12.8 🔍
      🟩 12.0               Pass: 100%/1   | Total: 10m 30s | Avg: 10m 30s | Max: 10m 30s | Hits:  61%/262   
      🟩 12.5               Pass: 100%/2   | Total: 12m 58s | Avg:  6m 29s | Max:  6m 35s | Hits:  82%/710   
      🔍 12.8               Pass:   5%/19  | Total: 55m 36s | Avg:  2m 55s | Max: 10m 54s | Hits:  61%/262   
    🔍 cudacxx: nvcc12.8 🔍
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 30s | Avg: 10m 30s | Max: 10m 30s | Hits:  61%/262   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 12m 58s | Avg:  6m 29s | Max:  6m 35s | Hits:  82%/710   
      🔍 nvcc12.8           Pass:   5%/19  | Total: 55m 36s | Avg:  2m 55s | Max: 10m 54s | Hits:  61%/262   
    🟨 cxx
      🟥 Clang14            Pass:   0%/1   | Total:  2m 53s | Avg:  2m 53s | Max:  2m 53s
      🟥 Clang15            Pass:   0%/1   | Total:  2m 58s | Avg:  2m 58s | Max:  2m 58s
      🟥 Clang16            Pass:   0%/1   | Total:  2m 52s | Avg:  2m 52s | Max:  2m 52s
      🟥 Clang17            Pass:   0%/1   | Total:  3m 08s | Avg:  3m 08s | Max:  3m 08s
      🟥 Clang18            Pass:   0%/4   | Total:  8m 13s | Avg:  2m 03s | Max:  3m 31s
      🟥 GCC10              Pass:   0%/1   | Total:  3m 20s | Avg:  3m 20s | Max:  3m 20s
      🟥 GCC11              Pass:   0%/1   | Total:  3m 35s | Avg:  3m 35s | Max:  3m 35s
      🟥 GCC12              Pass:   0%/2   | Total:  3m 23s | Avg:  1m 41s | Max:  3m 23s
      🟥 GCC13              Pass:   0%/6   | Total: 14m 20s | Avg:  2m 23s | Max:  3m 08s
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 30s | Avg: 10m 30s | Max: 10m 30s | Hits:  61%/262   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 54s | Avg: 10m 54s | Max: 10m 54s | Hits:  61%/262   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 12m 58s | Avg:  6m 29s | Max:  6m 35s | Hits:  82%/710   
    🟨 cxx_family
      🟥 Clang              Pass:   0%/8   | Total: 20m 04s | Avg:  2m 30s | Max:  3m 31s
      🟥 GCC                Pass:   0%/10  | Total: 24m 38s | Avg:  2m 27s | Max:  3m 35s
      🟩 MSVC               Pass: 100%/2   | Total: 21m 24s | Avg: 10m 42s | Max: 10m 54s | Hits:  61%/524   
      🟩 NVHPC              Pass: 100%/2   | Total: 12m 58s | Avg:  6m 29s | Max:  6m 35s | Hits:  82%/710   
    🟨 cudacxx_family
      🟨 nvcc               Pass:  18%/22  | Total:  1h 19m | Avg:  3m 35s | Max: 10m 54s | Hits:  73%/1234  
    🟨 cpu
      🟨 amd64              Pass:  22%/18  | Total:  1h 08m | Avg:  3m 49s | Max: 10m 54s | Hits:  73%/1234  
      🟥 arm64              Pass:   0%/4   | Total: 10m 07s | Avg:  2m 31s | Max:  2m 46s
    🟨 gpu
      🟥 h100               Pass:   0%/2   | Total:  2m 56s | Avg:  1m 28s | Max:  2m 56s
      🟨 rtx2080            Pass:  20%/20  | Total:  1h 16m | Avg:  3m 48s | Max: 10m 54s | Hits:  73%/1234  
    🟨 jobs
      🟨 Build              Pass:  21%/19  | Total:  1h 19m | Avg:  4m 09s | Max: 10m 54s | Hits:  73%/1234  
      🟥 Test               Pass:   0%/3  
    🟥 sm
      🟥 90                 Pass:   0%/3   | Total:  5m 47s | Avg:  1m 55s | Max:  2m 56s
      🟥 90a                Pass:   0%/1   | Total:  3m 08s | Avg:  3m 08s | Max:  3m 08s
    🟨 std
      🟨 17                 Pass:  25%/4   | Total: 14m 12s | Avg:  3m 33s | Max:  6m 23s | Hits:  82%/355   
      🟨 20                 Pass:  16%/18  | Total:  1h 04m | Avg:  3m 36s | Max: 10m 54s | Hits:  69%/879   
    

👃 Inspect Changes

Modifications in project?

         Project
         CCCL Infrastructure
         libcu++
         CUB
         Thrust
    +/-  CUDA Experimental
         python
         CCCL C Parallel Library
         Catch2Helper

Modifications in project or dependencies?

         Project
         CCCL Infrastructure
         libcu++
         CUB
         Thrust
    +/-  CUDA Experimental
         python
         CCCL C Parallel Library
         Catch2Helper

🏃‍ Runner counts (total jobs: 22)

     #  Runner
    13  linux-amd64-cpu16
     4  linux-arm64-cpu16
     2  windows-amd64-cpu16
     2  linux-amd64-gpu-rtx2080-latest-1
     1  linux-amd64-gpu-h100-latest-1

@caugonnet
Contributor Author

/ok to test

Contributor

🟨 CI finished in 17m 44s: Pass: 9%/22 | Total: 1h 29m | Avg: 4m 03s | Max: 17m 44s | Hits: 61%/524
  • 🟨 cudax: Pass: 9%/22 | Total: 1h 29m | Avg: 4m 03s | Max: 17m 44s | Hits: 61%/524

    🟨 ctk
      🟩 12.0               Pass: 100%/1   | Total: 10m 02s | Avg: 10m 02s | Max: 10m 02s | Hits:  61%/262   
      🟥 12.5               Pass:   0%/2   | Total: 25m 39s | Avg: 12m 49s | Max: 17m 44s
      🟨 12.8               Pass:   5%/19  | Total: 53m 29s | Avg:  2m 48s | Max: 10m 17s | Hits:  61%/262   
    🟨 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 02s | Avg: 10m 02s | Max: 10m 02s | Hits:  61%/262   
      🟥 nvcc12.5           Pass:   0%/2   | Total: 25m 39s | Avg: 12m 49s | Max: 17m 44s
      🟨 nvcc12.8           Pass:   5%/19  | Total: 53m 29s | Avg:  2m 48s | Max: 10m 17s | Hits:  61%/262   
    🟨 cxx
      🟥 Clang14            Pass:   0%/1   | Total:  2m 57s | Avg:  2m 57s | Max:  2m 57s
      🟥 Clang15            Pass:   0%/1   | Total:  3m 04s | Avg:  3m 04s | Max:  3m 04s
      🟥 Clang16            Pass:   0%/1   | Total:  3m 01s | Avg:  3m 01s | Max:  3m 01s
      🟥 Clang17            Pass:   0%/1   | Total:  3m 04s | Avg:  3m 04s | Max:  3m 04s
      🟥 Clang18            Pass:   0%/4   | Total:  7m 50s | Avg:  1m 57s | Max:  3m 06s
      🟥 GCC10              Pass:   0%/1   | Total:  3m 16s | Avg:  3m 16s | Max:  3m 16s
      🟥 GCC11              Pass:   0%/1   | Total:  3m 04s | Avg:  3m 04s | Max:  3m 04s
      🟥 GCC12              Pass:   0%/2   | Total:  3m 15s | Avg:  1m 37s | Max:  3m 15s
      🟥 GCC13              Pass:   0%/6   | Total: 13m 41s | Avg:  2m 16s | Max:  2m 56s
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 02s | Avg: 10m 02s | Max: 10m 02s | Hits:  61%/262   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 17s | Avg: 10m 17s | Max: 10m 17s | Hits:  61%/262   
      🟥 NVHPC24.7          Pass:   0%/2   | Total: 25m 39s | Avg: 12m 49s | Max: 17m 44s
    🟨 cxx_family
      🟥 Clang              Pass:   0%/8   | Total: 19m 56s | Avg:  2m 29s | Max:  3m 06s
      🟥 GCC                Pass:   0%/10  | Total: 23m 16s | Avg:  2m 19s | Max:  3m 16s
      🟩 MSVC               Pass: 100%/2   | Total: 20m 19s | Avg: 10m 09s | Max: 10m 17s | Hits:  61%/524   
      🟥 NVHPC              Pass:   0%/2   | Total: 25m 39s | Avg: 12m 49s | Max: 17m 44s
    🟨 cudacxx_family
      🟨 nvcc               Pass:   9%/22  | Total:  1h 29m | Avg:  4m 03s | Max: 17m 44s | Hits:  61%/524   
    🟨 cpu
      🟨 amd64              Pass:  11%/18  | Total:  1h 19m | Avg:  4m 24s | Max: 17m 44s | Hits:  61%/524   
      🟥 arm64              Pass:   0%/4   | Total:  9m 55s | Avg:  2m 28s | Max:  2m 39s
    🟨 gpu
      🟥 h100               Pass:   0%/2   | Total:  2m 56s | Avg:  1m 28s | Max:  2m 56s
      🟨 rtx2080            Pass:  10%/20  | Total:  1h 26m | Avg:  4m 18s | Max: 17m 44s | Hits:  61%/524   
    🟨 jobs
      🟨 Build              Pass:  10%/19  | Total:  1h 29m | Avg:  4m 41s | Max: 17m 44s | Hits:  61%/524   
      🟥 Test               Pass:   0%/3  
    🟥 sm
      🟥 90                 Pass:   0%/3   | Total:  5m 45s | Avg:  1m 55s | Max:  2m 56s
      🟥 90a                Pass:   0%/1   | Total:  2m 45s | Avg:  2m 45s | Max:  2m 45s
    🟨 std
      🟥 17                 Pass:   0%/4   | Total: 25m 32s | Avg:  6m 23s | Max: 17m 44s
      🟨 20                 Pass:  11%/18  | Total:  1h 03m | Avg:  3m 32s | Max: 10m 17s | Hits:  61%/524   
    


@caugonnet
Contributor Author

/ok to test

Contributor

🟨 CI finished in 51m 57s: Pass: 90%/22 | Total: 4h 50m | Avg: 13m 12s | Max: 18m 02s | Hits: 64%/10608
  • 🟨 cudax: Pass: 90%/22 | Total: 4h 50m | Avg: 13m 12s | Max: 18m 02s | Hits: 64%/10608

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  88%/18  | Total:  4h 00m | Avg: 13m 20s | Max: 18m 02s | Hits:  66%/8368  
      🟩 arm64              Pass: 100%/4   | Total: 50m 21s | Avg: 12m 35s | Max: 13m 43s | Hits:  57%/2240  
    🚨 ctk: 12.5 🚨
      🟩 12.0               Pass: 100%/1   | Total: 10m 08s | Avg: 10m 08s | Max: 10m 08s | Hits:  61%/262   
      🔥 12.5               Pass:   0%/2   | Total: 25m 13s | Avg: 12m 36s | Max: 16m 43s
      🟩 12.8               Pass: 100%/19  | Total:  4h 15m | Avg: 13m 25s | Max: 18m 02s | Hits:  64%/10346 
    🚨 cudacxx: nvcc12.5 🚨
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 08s | Avg: 10m 08s | Max: 10m 08s | Hits:  61%/262   
      🔥 nvcc12.5           Pass:   0%/2   | Total: 25m 13s | Avg: 12m 36s | Max: 16m 43s
      🟩 nvcc12.8           Pass: 100%/19  | Total:  4h 15m | Avg: 13m 25s | Max: 18m 02s | Hits:  64%/10346 
    🚨 cxx: NVHPC24.7 🚨
      🟩 Clang14            Pass: 100%/1   | Total: 14m 27s | Avg: 14m 27s | Max: 14m 27s | Hits:  58%/562   
      🟩 Clang15            Pass: 100%/1   | Total: 15m 15s | Avg: 15m 15s | Max: 15m 15s | Hits:  58%/560   
      🟩 Clang16            Pass: 100%/1   | Total: 14m 15s | Avg: 14m 15s | Max: 14m 15s | Hits:  58%/560   
      🟩 Clang17            Pass: 100%/1   | Total: 14m 05s | Avg: 14m 05s | Max: 14m 05s | Hits:  58%/560   
      🟩 Clang18            Pass: 100%/4   | Total: 56m 40s | Avg: 14m 10s | Max: 18m 02s | Hits:  68%/2240  
      🟩 GCC10              Pass: 100%/1   | Total: 14m 02s | Avg: 14m 02s | Max: 14m 02s | Hits:  57%/562   
      🟩 GCC11              Pass: 100%/1   | Total: 14m 31s | Avg: 14m 31s | Max: 14m 31s | Hits:  57%/560   
      🟩 GCC12              Pass: 100%/2   | Total: 29m 10s | Avg: 14m 35s | Max: 16m 46s | Hits:  78%/1120  
      🟩 GCC13              Pass: 100%/6   | Total:  1h 12m | Avg: 12m 04s | Max: 13m 46s | Hits:  64%/3360  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 08s | Avg: 10m 08s | Max: 10m 08s | Hits:  61%/262   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 15s | Avg: 10m 15s | Max: 10m 15s | Hits:  61%/262   
      🔥 NVHPC24.7          Pass:   0%/2   | Total: 25m 13s | Avg: 12m 36s | Max: 16m 43s
    🚨 cxx_family: NVHPC 🚨
      🟩 Clang              Pass: 100%/8   | Total:  1h 54m | Avg: 14m 20s | Max: 18m 02s | Hits:  63%/4482  
      🟩 GCC                Pass: 100%/10  | Total:  2h 10m | Avg: 13m 01s | Max: 16m 46s | Hits:  66%/5602  
      🟩 MSVC               Pass: 100%/2   | Total: 20m 23s | Avg: 10m 11s | Max: 10m 15s | Hits:  61%/524   
      🔥 NVHPC              Pass:   0%/2   | Total: 25m 13s | Avg: 12m 36s | Max: 16m 43s
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/2   | Total: 25m 19s | Avg: 12m 39s | Max: 13m 46s | Hits:  78%/1120  
      🔍 rtx2080            Pass:  90%/20  | Total:  4h 25m | Avg: 13m 15s | Max: 18m 02s | Hits:  63%/9488  
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  89%/19  | Total:  4h 06m | Avg: 12m 57s | Max: 16m 46s | Hits:  58%/8928  
      🟩 Test               Pass: 100%/3   | Total: 44m 12s | Avg: 14m 44s | Max: 18m 02s | Hits:  99%/1680  
    🟨 cudacxx_family
      🟨 nvcc               Pass:  90%/22  | Total:  4h 50m | Avg: 13m 12s | Max: 18m 02s | Hits:  64%/10608 
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 35m 42s | Avg: 11m 54s | Max: 13m 46s | Hits:  71%/1680  
      🟩 90a                Pass: 100%/1   | Total: 10m 51s | Avg: 10m 51s | Max: 10m 51s | Hits:  57%/560   
    🟨 std
      🟨 17                 Pass:  75%/4   | Total: 51m 11s | Avg: 12m 47s | Max: 16m 43s | Hits:  57%/1680  
      🟨 20                 Pass:  94%/18  | Total:  3h 59m | Avg: 13m 17s | Max: 18m 02s | Hits:  65%/8928  
    


@caugonnet
Contributor Author

/ok to test

Contributor

🟨 CI finished in 32m 12s: Pass: 90%/22 | Total: 4h 49m | Avg: 13m 08s | Max: 17m 08s | Hits: 64%/10626
  • 🟨 cudax: Pass: 90%/22 | Total: 4h 49m | Avg: 13m 08s | Max: 17m 08s | Hits: 64%/10626

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  88%/18  | Total:  3h 58m | Avg: 13m 13s | Max: 17m 08s | Hits:  66%/8382  
      🟩 arm64              Pass: 100%/4   | Total: 51m 00s | Avg: 12m 45s | Max: 13m 42s | Hits:  57%/2244  
    🚨 ctk: 12.5 🚨
      🟩 12.0               Pass: 100%/1   | Total: 11m 07s | Avg: 11m 07s | Max: 11m 07s | Hits:  61%/262   
      🔥 12.5               Pass:   0%/2   | Total: 24m 55s | Avg: 12m 27s | Max: 17m 08s
      🟩 12.8               Pass: 100%/19  | Total:  4h 13m | Avg: 13m 19s | Max: 17m 00s | Hits:  64%/10364 
    🚨 cudacxx: nvcc12.5 🚨
      🟩 nvcc12.0           Pass: 100%/1   | Total: 11m 07s | Avg: 11m 07s | Max: 11m 07s | Hits:  61%/262   
      🔥 nvcc12.5           Pass:   0%/2   | Total: 24m 55s | Avg: 12m 27s | Max: 17m 08s
      🟩 nvcc12.8           Pass: 100%/19  | Total:  4h 13m | Avg: 13m 19s | Max: 17m 00s | Hits:  64%/10364 
    🚨 cxx: NVHPC24.7 🚨
      🟩 Clang14            Pass: 100%/1   | Total: 13m 17s | Avg: 13m 17s | Max: 13m 17s | Hits:  58%/563   
      🟩 Clang15            Pass: 100%/1   | Total: 14m 58s | Avg: 14m 58s | Max: 14m 58s | Hits:  58%/561   
      🟩 Clang16            Pass: 100%/1   | Total: 15m 45s | Avg: 15m 45s | Max: 15m 45s | Hits:  58%/561   
      🟩 Clang17            Pass: 100%/1   | Total: 14m 46s | Avg: 14m 46s | Max: 14m 46s | Hits:  58%/561   
      🟩 Clang18            Pass: 100%/4   | Total: 52m 20s | Avg: 13m 05s | Max: 15m 50s | Hits:  68%/2244  
      🟩 GCC10              Pass: 100%/1   | Total: 13m 49s | Avg: 13m 49s | Max: 13m 49s | Hits:  57%/563   
      🟩 GCC11              Pass: 100%/1   | Total: 14m 23s | Avg: 14m 23s | Max: 14m 23s | Hits:  57%/561   
      🟩 GCC12              Pass: 100%/2   | Total: 30m 08s | Avg: 15m 04s | Max: 17m 00s | Hits:  78%/1122  
      🟩 GCC13              Pass: 100%/6   | Total:  1h 13m | Avg: 12m 12s | Max: 14m 08s | Hits:  64%/3366  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 07s | Avg: 11m 07s | Max: 11m 07s | Hits:  61%/262   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 23s | Avg: 10m 23s | Max: 10m 23s | Hits:  61%/262   
      🔥 NVHPC24.7          Pass:   0%/2   | Total: 24m 55s | Avg: 12m 27s | Max: 17m 08s
    🚨 cxx_family: NVHPC 🚨
      🟩 Clang              Pass: 100%/8   | Total:  1h 51m | Avg: 13m 53s | Max: 15m 50s | Hits:  63%/4490  
      🟩 GCC                Pass: 100%/10  | Total:  2h 11m | Avg: 13m 09s | Max: 17m 00s | Hits:  66%/5612  
      🟩 MSVC               Pass: 100%/2   | Total: 21m 30s | Avg: 10m 45s | Max: 11m 07s | Hits:  61%/524   
      🔥 NVHPC              Pass:   0%/2   | Total: 24m 55s | Avg: 12m 27s | Max: 17m 08s
    🔍 gpu: rtx2080 🔍
      🟩 h100               Pass: 100%/2   | Total: 25m 48s | Avg: 12m 54s | Max: 14m 08s | Hits:  78%/1122  
      🔍 rtx2080            Pass:  90%/20  | Total:  4h 23m | Avg: 13m 09s | Max: 17m 08s | Hits:  63%/9504  
    🔍 jobs: Build 🔍
      🔍 Build              Pass:  89%/19  | Total:  4h 10m | Avg: 13m 10s | Max: 17m 08s | Hits:  58%/8943  
      🟩 Test               Pass: 100%/3   | Total: 38m 49s | Avg: 12m 56s | Max: 14m 08s | Hits:  99%/1683  
    🟨 cudacxx_family
      🟨 nvcc               Pass:  90%/22  | Total:  4h 49m | Avg: 13m 08s | Max: 17m 08s | Hits:  64%/10626 
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 36m 26s | Avg: 12m 08s | Max: 14m 08s | Hits:  71%/1683  
      🟩 90a                Pass: 100%/1   | Total: 10m 47s | Avg: 10m 47s | Max: 10m 47s | Hits:  57%/561   
    🟨 std
      🟨 17                 Pass:  75%/4   | Total: 52m 07s | Avg: 13m 01s | Max: 17m 08s | Hits:  57%/1683  
      🟨 20                 Pass:  94%/18  | Total:  3h 57m | Avg: 13m 10s | Max: 17m 00s | Hits:  66%/8943  
    


@caugonnet
Contributor Author

/ok to test

Contributor

🟩 CI finished in 27m 56s: Pass: 100%/22 | Total: 1h 56m | Avg: 5m 16s | Max: 12m 12s | Hits: 96%/11338
  • 🟩 cudax: Pass: 100%/22 | Total: 1h 56m | Avg: 5m 16s | Max: 12m 12s | Hits: 96%/11338

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  1h 45m | Avg:  5m 50s | Max: 12m 12s | Hits:  96%/9094  
      🟩 arm64              Pass: 100%/4   | Total: 10m 51s | Avg:  2m 42s | Max:  2m 48s | Hits:  99%/2244  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 10m 02s | Avg: 10m 02s | Max: 10m 02s | Hits:  61%/262   
      🟩 12.5               Pass: 100%/2   | Total: 13m 57s | Avg:  6m 58s | Max:  7m 05s | Hits:  82%/712   
      🟩 12.8               Pass: 100%/19  | Total:  1h 32m | Avg:  4m 50s | Max: 12m 12s | Hits:  98%/10364 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 02s | Avg: 10m 02s | Max: 10m 02s | Hits:  61%/262   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 13m 57s | Avg:  6m 58s | Max:  7m 05s | Hits:  82%/712   
      🟩 nvcc12.8           Pass: 100%/19  | Total:  1h 32m | Avg:  4m 50s | Max: 12m 12s | Hits:  98%/10364 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  1h 56m | Avg:  5m 16s | Max: 12m 12s | Hits:  96%/11338 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  3m 29s | Avg:  3m 29s | Max:  3m 29s | Hits: 100%/563   
      🟩 Clang15            Pass: 100%/1   | Total:  3m 26s | Avg:  3m 26s | Max:  3m 26s | Hits: 100%/561   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 42s | Avg:  3m 42s | Max:  3m 42s | Hits: 100%/561   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 17s | Avg:  3m 17s | Max:  3m 17s | Hits: 100%/561   
      🟩 Clang18            Pass: 100%/4   | Total: 20m 48s | Avg:  5m 12s | Max: 12m 12s | Hits: 100%/2244  
      🟩 GCC10              Pass: 100%/1   | Total:  3m 06s | Avg:  3m 06s | Max:  3m 06s | Hits:  99%/563   
      🟩 GCC11              Pass: 100%/1   | Total:  3m 20s | Avg:  3m 20s | Max:  3m 20s | Hits:  99%/561   
      🟩 GCC12              Pass: 100%/2   | Total: 15m 38s | Avg:  7m 49s | Max: 12m 08s | Hits:  99%/1122  
      🟩 GCC13              Pass: 100%/6   | Total: 25m 50s | Avg:  4m 18s | Max: 11m 19s | Hits:  99%/3366  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 02s | Avg: 10m 02s | Max: 10m 02s | Hits:  61%/262   
      🟩 MSVC14.42          Pass: 100%/1   | Total:  9m 29s | Avg:  9m 29s | Max:  9m 29s | Hits:  61%/262   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 13m 57s | Avg:  6m 58s | Max:  7m 05s | Hits:  82%/712   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total: 34m 42s | Avg:  4m 20s | Max: 12m 12s | Hits: 100%/4490  
      🟩 GCC                Pass: 100%/10  | Total: 47m 54s | Avg:  4m 47s | Max: 12m 08s | Hits:  99%/5612  
      🟩 MSVC               Pass: 100%/2   | Total: 19m 31s | Avg:  9m 45s | Max: 10m 02s | Hits:  61%/524   
      🟩 NVHPC              Pass: 100%/2   | Total: 13m 57s | Avg:  6m 58s | Max:  7m 05s | Hits:  82%/712   
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 14m 23s | Avg:  7m 11s | Max: 11m 19s | Hits:  99%/1122  
      🟩 rtx2080            Pass: 100%/20  | Total:  1h 41m | Avg:  5m 05s | Max: 12m 12s | Hits:  96%/10216 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  1h 20m | Avg:  4m 13s | Max: 10m 02s | Hits:  96%/9655  
      🟩 Test               Pass: 100%/3   | Total: 35m 39s | Avg: 11m 53s | Max: 12m 12s | Hits:  99%/1683  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 17m 18s | Avg:  5m 46s | Max: 11m 19s | Hits:  99%/1683  
      🟩 90a                Pass: 100%/1   | Total:  2m 57s | Avg:  2m 57s | Max:  2m 57s | Hits:  99%/561   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 15m 24s | Avg:  3m 51s | Max:  7m 05s | Hits:  96%/2039  
      🟩 20                 Pass: 100%/18  | Total:  1h 40m | Avg:  5m 35s | Max: 12m 12s | Hits:  96%/9299  
    


@caugonnet caugonnet marked this pull request as ready for review February 22, 2025 11:05
@caugonnet caugonnet requested review from a team as code owners February 22, 2025 11:05
@caugonnet caugonnet requested a review from griwes February 22, 2025 11:05
@caugonnet
Contributor Author

/ok to test

@caugonnet
Contributor Author

/ok to test

Contributor

🟩 CI finished in 28m 08s: Pass: 100%/22 | Total: 4h 07m | Avg: 11m 16s | Max: 14m 42s | Hits: 70%/11338
  • 🟩 cudax: Pass: 100%/22 | Total: 4h 07m | Avg: 11m 16s | Max: 14m 42s | Hits: 70%/11338

    🟩 cpu
      🟩 amd64              Pass: 100%/18  | Total:  3h 23m | Avg: 11m 18s | Max: 14m 42s | Hits:  72%/9094  
      🟩 arm64              Pass: 100%/4   | Total: 44m 25s | Avg: 11m 06s | Max: 11m 53s | Hits:  63%/2244  
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s | Hits:  61%/262   
      🟩 12.5               Pass: 100%/2   | Total: 10m 19s | Avg:  5m 09s | Max:  5m 11s | Hits:  96%/712   
      🟩 12.8               Pass: 100%/19  | Total:  3h 46m | Avg: 11m 55s | Max: 14m 42s | Hits:  69%/10364 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s | Hits:  61%/262   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 10m 19s | Avg:  5m 09s | Max:  5m 11s | Hits:  96%/712   
      🟩 nvcc12.8           Pass: 100%/19  | Total:  3h 46m | Avg: 11m 55s | Max: 14m 42s | Hits:  69%/10364 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/22  | Total:  4h 07m | Avg: 11m 16s | Max: 14m 42s | Hits:  70%/11338 
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total: 11m 44s | Avg: 11m 44s | Max: 11m 44s | Hits:  63%/563   
      🟩 Clang15            Pass: 100%/1   | Total: 13m 25s | Avg: 13m 25s | Max: 13m 25s | Hits:  63%/561   
      🟩 Clang16            Pass: 100%/1   | Total: 13m 11s | Avg: 13m 11s | Max: 13m 11s | Hits:  63%/561   
      🟩 Clang17            Pass: 100%/1   | Total: 13m 16s | Avg: 13m 16s | Max: 13m 16s | Hits:  63%/561   
      🟩 Clang18            Pass: 100%/4   | Total: 46m 47s | Avg: 11m 41s | Max: 13m 23s | Hits:  72%/2244  
      🟩 GCC10              Pass: 100%/1   | Total: 13m 17s | Avg: 13m 17s | Max: 13m 17s | Hits:  63%/563   
      🟩 GCC11              Pass: 100%/1   | Total: 13m 48s | Avg: 13m 48s | Max: 13m 48s | Hits:  63%/561   
      🟩 GCC12              Pass: 100%/2   | Total: 26m 47s | Avg: 13m 23s | Max: 14m 42s | Hits:  81%/1122  
      🟩 GCC13              Pass: 100%/6   | Total:  1h 03m | Avg: 10m 38s | Max: 11m 53s | Hits:  69%/3366  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 59s | Avg: 10m 59s | Max: 10m 59s | Hits:  61%/262   
      🟩 MSVC14.42          Pass: 100%/1   | Total: 10m 32s | Avg: 10m 32s | Max: 10m 32s | Hits:  61%/262   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 10m 19s | Avg:  5m 09s | Max:  5m 11s | Hits:  96%/712   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total:  1h 38m | Avg: 12m 17s | Max: 13m 25s | Hits:  68%/4490  
      🟩 GCC                Pass: 100%/10  | Total:  1h 57m | Avg: 11m 46s | Max: 14m 42s | Hits:  70%/5612  
      🟩 MSVC               Pass: 100%/2   | Total: 21m 31s | Avg: 10m 45s | Max: 10m 59s | Hits:  61%/524   
      🟩 NVHPC              Pass: 100%/2   | Total: 10m 19s | Avg:  5m 09s | Max:  5m 11s | Hits:  96%/712   
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 21m 50s | Avg: 10m 55s | Max: 11m 24s | Hits:  81%/1122  
      🟩 rtx2080            Pass: 100%/20  | Total:  3h 46m | Avg: 11m 18s | Max: 14m 42s | Hits:  69%/10216 
    🟩 jobs
      🟩 Build              Pass: 100%/19  | Total:  3h 32m | Avg: 11m 11s | Max: 14m 42s | Hits:  65%/9655  
      🟩 Test               Pass: 100%/3   | Total: 35m 18s | Avg: 11m 46s | Max: 12m 05s | Hits:  99%/1683  
    🟩 sm
      🟩 90                 Pass: 100%/3   | Total: 31m 08s | Avg: 10m 22s | Max: 11m 24s | Hits:  75%/1683  
      🟩 90a                Pass: 100%/1   | Total:  9m 54s | Avg:  9m 54s | Max:  9m 54s | Hits:  63%/561   
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 35m 58s | Avg:  8m 59s | Max: 10m 57s | Hits:  69%/2039  
      🟩 20                 Pass: 100%/18  | Total:  3h 31m | Avg: 11m 46s | Max: 14m 42s | Hits:  71%/9299  
    


@caugonnet caugonnet merged commit b1e002f into NVIDIA:main Feb 24, 2025
35 of 38 checks passed
Labels
stf Sequential Task Flow programming model
Projects
Status: Done
2 participants