Extend CUB benchmarking documentation #2831

Merged 11 commits on Nov 15, 2024

@bernhardmgruber commented Nov 15, 2024

Here is an update of the public CUB benchmarking documentation, based on private notes I have been maintaining for a few months.

I built the documentation locally and verified that all links work.

@bernhardmgruber

@gonidelis since you have to read and understand all of this, you may just review it as well ;)

@bernhardmgruber marked this pull request as ready for review November 15, 2024 15:24
@bernhardmgruber requested review from a team as code owners November 15, 2024 15:24

🟩 CI finished in 2h 09m: Pass: 100%/400 | Total: 2d 04h | Avg: 7m 55s | Max: 2h 07m | Hits: 87%/25890
  • 🟩 libcudacxx: Pass: 100%/118 | Total: 20h 37m | Avg: 10m 29s | Max: 2h 07m | Hits: 66%/9500

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total: 19h 59m | Avg: 10m 54s | Max:  2h 07m | Hits:  66%/9500  
      🟩 arm64              Pass: 100%/8   | Total: 37m 50s | Avg:  4m 43s | Max: 13m 27s
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  3h 33m | Avg: 14m 14s | Max:  2h 07m | Hits:  34%/2181  
      🟩 11.8               Pass: 100%/3   | Total:  1h 01m | Avg: 20m 38s | Max: 25m 57s
      🟩 12.5               Pass: 100%/4   | Total:  1h 30m | Avg: 22m 32s | Max: 33m 55s
      🟩 12.6               Pass: 100%/96  | Total: 14h 31m | Avg:  9m 04s | Max: 48m 59s | Hits:  76%/7319  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/12  | Total:  2h 37m | Avg: 13m 05s | Max: 21m 25s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  3h 33m | Avg: 14m 14s | Max:  2h 07m | Hits:  34%/2181  
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 01m | Avg: 20m 38s | Max: 25m 57s
      🟩 nvcc12.5           Pass: 100%/4   | Total:  1h 30m | Avg: 22m 32s | Max: 33m 55s
      🟩 nvcc12.6           Pass: 100%/84  | Total: 11h 54m | Avg:  8m 30s | Max: 48m 59s | Hits:  76%/7319  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/12  | Total:  2h 37m | Avg: 13m 05s | Max: 21m 25s
      🟩 nvcc               Pass: 100%/106 | Total: 18h 00m | Avg: 10m 11s | Max:  2h 07m | Hits:  66%/9500  
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 28m 23s | Avg:  4m 43s | Max:  6m 00s
      🟩 Clang10            Pass: 100%/3   | Total: 16m 50s | Avg:  5m 36s | Max:  6m 07s
      🟩 Clang11            Pass: 100%/4   | Total: 18m 01s | Avg:  4m 30s | Max:  4m 46s
      🟩 Clang12            Pass: 100%/4   | Total: 33m 50s | Avg:  8m 27s | Max: 20m 56s
      🟩 Clang13            Pass: 100%/4   | Total: 33m 23s | Avg:  8m 20s | Max: 20m 46s
      🟩 Clang14            Pass: 100%/4   | Total: 16m 57s | Avg:  4m 14s | Max:  4m 35s
      🟩 Clang15            Pass: 100%/4   | Total: 17m 59s | Avg:  4m 29s | Max:  4m 53s
      🟩 Clang16            Pass: 100%/4   | Total: 18m 25s | Avg:  4m 36s | Max:  4m 56s
      🟩 Clang17            Pass: 100%/4   | Total: 40m 41s | Avg: 10m 10s | Max: 26m 45s
      🟩 Clang18            Pass: 100%/18  | Total:  3h 23m | Avg: 11m 18s | Max: 21m 25s
      🟩 GCC6               Pass: 100%/2   | Total:  5m 41s | Avg:  2m 50s | Max:  3m 06s
      🟩 GCC7               Pass: 100%/6   | Total:  1h 15m | Avg: 12m 31s | Max: 27m 33s
      🟩 GCC8               Pass: 100%/6   | Total: 20m 12s | Avg:  3m 22s | Max:  4m 08s
      🟩 GCC9               Pass: 100%/6   | Total: 33m 15s | Avg:  5m 32s | Max: 17m 27s
      🟩 GCC10              Pass: 100%/4   | Total: 14m 57s | Avg:  3m 44s | Max:  4m 13s
      🟩 GCC11              Pass: 100%/7   | Total:  1h 36m | Avg: 13m 46s | Max: 25m 57s
      🟩 GCC12              Pass: 100%/4   | Total: 26m 31s | Avg:  6m 37s | Max: 14m 29s
      🟩 GCC13              Pass: 100%/17  | Total:  3h 41m | Avg: 13m 02s | Max: 48m 59s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 39m 18s | Avg: 13m 06s | Max: 29m 06s
      🟩 MSVC14.16          Pass: 100%/1   | Total:  2h 07m | Avg:  2h 07m | Max:  2h 07m | Hits:  34%/2181  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 46m 13s | Avg: 23m 06s | Max: 34m 49s | Hits:  63%/4725  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 12m 10s | Avg: 12m 10s | Max: 12m 10s | Hits:  98%/2594  
      🟩 NVHPC24.7          Pass: 100%/4   | Total:  1h 30m | Avg: 22m 32s | Max: 33m 55s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/55  | Total:  7h 07m | Avg:  7m 46s | Max: 26m 45s
      🟩 GCC                Pass: 100%/52  | Total:  8h 13m | Avg:  9m 29s | Max: 48m 59s
      🟩 Intel              Pass: 100%/3   | Total: 39m 18s | Avg: 13m 06s | Max: 29m 06s
      🟩 MSVC               Pass: 100%/4   | Total:  3h 05m | Avg: 46m 28s | Max:  2h 07m | Hits:  66%/9500  
      🟩 NVHPC              Pass: 100%/4   | Total:  1h 30m | Avg: 22m 32s | Max: 33m 55s
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total: 20h 37m | Avg: 10m 29s | Max:  2h 07m | Hits:  66%/9500  
    🟩 jobs
      🟩 Build              Pass: 100%/110 | Total: 17h 15m | Avg:  9m 24s | Max:  2h 07m | Hits:  66%/9500  
      🟩 NVRTC              Pass: 100%/4   | Total:  2h 21m | Avg: 35m 17s | Max: 48m 59s
      🟩 Test               Pass: 100%/3   | Total: 58m 12s | Avg: 19m 24s | Max: 22m 29s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  2m 12s | Avg:  2m 12s | Max:  2m 12s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 01m | Avg: 20m 38s | Max: 25m 57s
      🟩 90                 Pass: 100%/4   | Total: 42m 06s | Avg: 10m 31s | Max: 12m 54s
      🟩 90a                Pass: 100%/8   | Total:  1h 01m | Avg:  7m 44s | Max: 14m 06s
    🟩 std
      🟩 11                 Pass: 100%/32  | Total:  3h 58m | Avg:  7m 26s | Max: 27m 33s
      🟩 14                 Pass: 100%/32  | Total:  6h 33m | Avg: 12m 16s | Max:  2h 07m | Hits:  67%/4465  
      🟩 17                 Pass: 100%/30  | Total:  5h 34m | Avg: 11m 09s | Max: 38m 21s | Hits:  30%/2441  
      🟩 20                 Pass: 100%/23  | Total:  4h 29m | Avg: 11m 42s | Max: 48m 59s | Hits:  98%/2594  
    
  • 🟩 cub: Pass: 100%/110 | Total: 13h 27m | Avg: 7m 20s | Max: 37m 01s | Hits: 99%/2964

    🟩 cpu
      🟩 amd64              Pass: 100%/102 | Total: 12h 49m | Avg:  7m 32s | Max: 37m 01s | Hits:  99%/2964  
      🟩 arm64              Pass: 100%/8   | Total: 38m 22s | Avg:  4m 47s | Max:  5m 17s
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 13m | Avg:  4m 55s | Max: 12m 44s | Hits:  99%/741   
      🟩 11.8               Pass: 100%/3   | Total: 16m 20s | Avg:  5m 26s | Max:  5m 56s
      🟩 12.5               Pass: 100%/4   | Total: 35m 35s | Avg:  8m 53s | Max:  9m 28s
      🟩 12.6               Pass: 100%/88  | Total: 11h 22m | Avg:  7m 45s | Max: 37m 01s | Hits:  99%/2223  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total: 17m 29s | Avg:  4m 22s | Max:  4m 31s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 13m | Avg:  4m 55s | Max: 12m 44s | Hits:  99%/741   
      🟩 nvcc11.8           Pass: 100%/3   | Total: 16m 20s | Avg:  5m 26s | Max:  5m 56s
      🟩 nvcc12.5           Pass: 100%/4   | Total: 35m 35s | Avg:  8m 53s | Max:  9m 28s
      🟩 nvcc12.6           Pass: 100%/84  | Total: 11h 04m | Avg:  7m 54s | Max: 37m 01s | Hits:  99%/2223  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total: 17m 29s | Avg:  4m 22s | Max:  4m 31s
      🟩 nvcc               Pass: 100%/106 | Total: 13h 10m | Avg:  7m 27s | Max: 37m 01s | Hits:  99%/2964  
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 43m 39s | Avg:  7m 16s | Max: 13m 03s
      🟩 Clang10            Pass: 100%/3   | Total: 25m 28s | Avg:  8m 29s | Max:  8m 47s
      🟩 Clang11            Pass: 100%/4   | Total: 28m 46s | Avg:  7m 11s | Max:  8m 06s
      🟩 Clang12            Pass: 100%/4   | Total: 23m 38s | Avg:  5m 54s | Max:  8m 12s
      🟩 Clang13            Pass: 100%/4   | Total: 26m 57s | Avg:  6m 44s | Max:  8m 18s
      🟩 Clang14            Pass: 100%/4   | Total: 20m 54s | Avg:  5m 13s | Max:  5m 32s
      🟩 Clang15            Pass: 100%/4   | Total: 21m 43s | Avg:  5m 25s | Max:  6m 17s
      🟩 Clang16            Pass: 100%/4   | Total: 21m 41s | Avg:  5m 25s | Max:  5m 42s
      🟩 Clang17            Pass: 100%/4   | Total: 22m 24s | Avg:  5m 36s | Max:  5m 55s
      🟩 Clang18            Pass: 100%/11  | Total:  1h 35m | Avg:  8m 40s | Max: 28m 49s
      🟩 GCC6               Pass: 100%/2   | Total:  8m 29s | Avg:  4m 14s | Max:  4m 22s
      🟩 GCC7               Pass: 100%/6   | Total: 27m 19s | Avg:  4m 33s | Max:  5m 14s
      🟩 GCC8               Pass: 100%/6   | Total: 29m 25s | Avg:  4m 54s | Max:  5m 33s
      🟩 GCC9               Pass: 100%/6   | Total: 30m 26s | Avg:  5m 04s | Max:  6m 07s
      🟩 GCC10              Pass: 100%/4   | Total: 20m 58s | Avg:  5m 14s | Max:  5m 31s
      🟩 GCC11              Pass: 100%/7   | Total: 37m 32s | Avg:  5m 21s | Max:  5m 56s
      🟩 GCC12              Pass: 100%/4   | Total: 21m 44s | Avg:  5m 26s | Max:  5m 38s
      🟩 GCC13              Pass: 100%/16  | Total:  3h 21m | Avg: 12m 36s | Max: 37m 01s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 18m 53s | Avg:  6m 17s | Max:  6m 31s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 12m 44s | Avg: 12m 44s | Max: 12m 44s | Hits:  99%/741   
      🟩 MSVC14.29          Pass: 100%/2   | Total: 20m 31s | Avg: 10m 15s | Max: 10m 36s | Hits:  99%/1482  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 12m 01s | Avg: 12m 01s | Max: 12m 01s | Hits:  99%/741   
      🟩 NVHPC24.7          Pass: 100%/4   | Total: 35m 35s | Avg:  8m 53s | Max:  9m 28s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/48  | Total:  5h 30m | Avg:  6m 53s | Max: 28m 49s
      🟩 GCC                Pass: 100%/51  | Total:  6h 17m | Avg:  7m 24s | Max: 37m 01s
      🟩 Intel              Pass: 100%/3   | Total: 18m 53s | Avg:  6m 17s | Max:  6m 31s
      🟩 MSVC               Pass: 100%/4   | Total: 45m 16s | Avg: 11m 19s | Max: 12m 44s | Hits:  99%/2964  
      🟩 NVHPC              Pass: 100%/4   | Total: 35m 35s | Avg:  8m 53s | Max:  9m 28s
    🟩 gpu
      🟩 v100               Pass: 100%/110 | Total: 13h 27m | Avg:  7m 20s | Max: 37m 01s | Hits:  99%/2964  
    🟩 jobs
      🟩 Build              Pass: 100%/102 | Total: 10h 00m | Avg:  5m 53s | Max: 13m 03s | Hits:  99%/2964  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 42s | Avg: 21m 42s | Max: 21m 42s
      🟩 GraphCapture       Pass: 100%/1   | Total: 18m 33s | Avg: 18m 33s | Max: 18m 33s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 08m | Avg: 22m 57s | Max: 25m 32s
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 38m | Avg: 32m 43s | Max: 37m 01s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 16m 20s | Avg:  5m 26s | Max:  5m 56s
      🟩 90a                Pass: 100%/4   | Total: 16m 56s | Avg:  4m 14s | Max:  4m 22s
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  3h 24m | Avg:  6m 48s | Max: 32m 21s
      🟩 14                 Pass: 100%/29  | Total:  2h 57m | Avg:  6m 08s | Max: 13m 03s | Hits:  99%/1482  
      🟩 17                 Pass: 100%/27  | Total:  2h 43m | Avg:  6m 04s | Max:  9m 55s | Hits:  99%/741   
      🟩 20                 Pass: 100%/24  | Total:  4h 21m | Avg: 10m 54s | Max: 37m 01s | Hits:  99%/741   
    
  • 🟩 thrust: Pass: 100%/109 | Total: 12h 58m | Avg: 7m 08s | Max: 22m 35s | Hits: 99%/13180

    🟩 cpu
      🟩 amd64              Pass: 100%/101 | Total: 12h 15m | Avg:  7m 16s | Max: 22m 35s | Hits:  99%/13180 
      🟩 arm64              Pass: 100%/8   | Total: 43m 07s | Avg:  5m 23s | Max:  5m 58s
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  1h 26m | Avg:  5m 47s | Max: 19m 28s | Hits:  99%/2636  
      🟩 11.8               Pass: 100%/3   | Total: 17m 38s | Avg:  5m 52s | Max:  6m 15s
      🟩 12.5               Pass: 100%/4   | Total:  1h 08m | Avg: 17m 04s | Max: 18m 53s
      🟩 12.6               Pass: 100%/87  | Total: 10h 05m | Avg:  6m 57s | Max: 22m 35s | Hits:  99%/10544 
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/4   | Total: 21m 30s | Avg:  5m 22s | Max:  5m 39s
      🟩 nvcc11.1           Pass: 100%/15  | Total:  1h 26m | Avg:  5m 47s | Max: 19m 28s | Hits:  99%/2636  
      🟩 nvcc11.8           Pass: 100%/3   | Total: 17m 38s | Avg:  5m 52s | Max:  6m 15s
      🟩 nvcc12.5           Pass: 100%/4   | Total:  1h 08m | Avg: 17m 04s | Max: 18m 53s
      🟩 nvcc12.6           Pass: 100%/83  | Total:  9h 43m | Avg:  7m 02s | Max: 22m 35s | Hits:  99%/10544 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/4   | Total: 21m 30s | Avg:  5m 22s | Max:  5m 39s
      🟩 nvcc               Pass: 100%/105 | Total: 12h 36m | Avg:  7m 12s | Max: 22m 35s | Hits:  99%/13180 
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total: 36m 22s | Avg:  6m 03s | Max:  7m 38s
      🟩 Clang10            Pass: 100%/3   | Total: 21m 55s | Avg:  7m 18s | Max:  7m 34s
      🟩 Clang11            Pass: 100%/4   | Total: 24m 20s | Avg:  6m 05s | Max:  6m 17s
      🟩 Clang12            Pass: 100%/4   | Total: 23m 45s | Avg:  5m 56s | Max:  6m 16s
      🟩 Clang13            Pass: 100%/4   | Total: 23m 34s | Avg:  5m 53s | Max:  6m 15s
      🟩 Clang14            Pass: 100%/4   | Total: 23m 34s | Avg:  5m 53s | Max:  6m 07s
      🟩 Clang15            Pass: 100%/4   | Total: 24m 03s | Avg:  6m 00s | Max:  6m 31s
      🟩 Clang16            Pass: 100%/4   | Total: 24m 32s | Avg:  6m 08s | Max:  6m 31s
      🟩 Clang17            Pass: 100%/4   | Total: 23m 51s | Avg:  5m 57s | Max:  6m 23s
      🟩 Clang18            Pass: 100%/11  | Total:  1h 07m | Avg:  6m 08s | Max: 11m 22s
      🟩 GCC6               Pass: 100%/2   | Total:  8m 48s | Avg:  4m 24s | Max:  4m 25s
      🟩 GCC7               Pass: 100%/6   | Total: 30m 47s | Avg:  5m 07s | Max:  5m 59s
      🟩 GCC8               Pass: 100%/6   | Total: 33m 09s | Avg:  5m 31s | Max:  6m 09s
      🟩 GCC9               Pass: 100%/6   | Total: 31m 37s | Avg:  5m 16s | Max:  6m 03s
      🟩 GCC10              Pass: 100%/4   | Total: 24m 58s | Avg:  6m 14s | Max:  6m 35s
      🟩 GCC11              Pass: 100%/7   | Total: 42m 21s | Avg:  6m 03s | Max:  6m 38s
      🟩 GCC12              Pass: 100%/4   | Total: 25m 37s | Avg:  6m 24s | Max:  6m 39s
      🟩 GCC13              Pass: 100%/14  | Total:  1h 41m | Avg:  7m 16s | Max: 15m 00s
      🟩 Intel2023.2.0      Pass: 100%/3   | Total: 22m 25s | Avg:  7m 28s | Max:  7m 42s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 19m 28s | Avg: 19m 28s | Max: 19m 28s | Hits:  99%/2636  
      🟩 MSVC14.29          Pass: 100%/2   | Total: 33m 18s | Avg: 16m 39s | Max: 16m 45s | Hits:  99%/5272  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 42m 05s | Avg: 21m 02s | Max: 22m 35s | Hits:  99%/5272  
      🟩 NVHPC24.7          Pass: 100%/4   | Total:  1h 08m | Avg: 17m 04s | Max: 18m 53s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/48  | Total:  4h 53m | Avg:  6m 06s | Max: 11m 22s
      🟩 GCC                Pass: 100%/49  | Total:  4h 59m | Avg:  6m 06s | Max: 15m 00s
      🟩 Intel              Pass: 100%/3   | Total: 22m 25s | Avg:  7m 28s | Max:  7m 42s
      🟩 MSVC               Pass: 100%/5   | Total:  1h 34m | Avg: 18m 58s | Max: 22m 35s | Hits:  99%/13180 
      🟩 NVHPC              Pass: 100%/4   | Total:  1h 08m | Avg: 17m 04s | Max: 18m 53s
    🟩 gpu
      🟩 v100               Pass: 100%/109 | Total: 12h 58m | Avg:  7m 08s | Max: 22m 35s | Hits:  99%/13180 
    🟩 jobs
      🟩 Build              Pass: 100%/102 | Total: 11h 30m | Avg:  6m 46s | Max: 19m 30s | Hits:  99%/10544 
      🟩 TestCPU            Pass: 100%/4   | Total: 47m 13s | Avg: 11m 48s | Max: 22m 35s | Hits:  99%/2636  
      🟩 TestGPU            Pass: 100%/3   | Total: 40m 23s | Avg: 13m 27s | Max: 15m 00s
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total: 17m 38s | Avg:  5m 52s | Max:  6m 15s
      🟩 90a                Pass: 100%/4   | Total: 20m 23s | Avg:  5m 05s | Max:  5m 21s
    🟩 std
      🟩 11                 Pass: 100%/30  | Total:  3h 05m | Avg:  6m 10s | Max: 15m 48s
      🟩 14                 Pass: 100%/29  | Total:  3h 24m | Avg:  7m 02s | Max: 19m 28s | Hits:  99%/5272  
      🟩 17                 Pass: 100%/27  | Total:  3h 07m | Avg:  6m 56s | Max: 17m 19s | Hits:  99%/2636  
      🟩 20                 Pass: 100%/23  | Total:  3h 21m | Avg:  8m 44s | Max: 22m 35s | Hits:  99%/5272  
    
  • 🟩 cudax: Pass: 100%/54 | Total: 4h 56m | Avg: 5m 29s | Max: 39m 32s | Hits: 90%/246

    🟩 cpu
      🟩 amd64              Pass: 100%/50  | Total:  4h 46m | Avg:  5m 43s | Max: 39m 32s | Hits:  90%/246   
      🟩 arm64              Pass: 100%/4   | Total: 10m 00s | Avg:  2m 30s | Max:  2m 34s
    🟩 ctk
      🟩 12.0               Pass: 100%/19  | Total:  1h 55m | Avg:  6m 05s | Max: 39m 32s | Hits:  90%/123   
      🟩 12.5               Pass: 100%/2   | Total: 10m 22s | Avg:  5m 11s | Max:  5m 11s
      🟩 12.6               Pass: 100%/33  | Total:  2h 50m | Avg:  5m 09s | Max: 34m 42s | Hits:  90%/123   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/19  | Total:  1h 55m | Avg:  6m 05s | Max: 39m 32s | Hits:  90%/123   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 10m 22s | Avg:  5m 11s | Max:  5m 11s
      🟩 nvcc12.6           Pass: 100%/33  | Total:  2h 50m | Avg:  5m 09s | Max: 34m 42s | Hits:  90%/123   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/54  | Total:  4h 56m | Avg:  5m 29s | Max: 39m 32s | Hits:  90%/246   
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  6m 49s | Avg:  3m 24s | Max:  3m 45s
      🟩 Clang10            Pass: 100%/2   | Total:  6m 45s | Avg:  3m 22s | Max:  3m 54s
      🟩 Clang11            Pass: 100%/4   | Total: 12m 12s | Avg:  3m 03s | Max:  3m 18s
      🟩 Clang12            Pass: 100%/4   | Total: 12m 33s | Avg:  3m 08s | Max:  3m 15s
      🟩 Clang13            Pass: 100%/4   | Total: 12m 18s | Avg:  3m 04s | Max:  3m 15s
      🟩 Clang14            Pass: 100%/4   | Total: 48m 44s | Avg: 12m 11s | Max: 39m 32s
      🟩 Clang15            Pass: 100%/2   | Total:  6m 00s | Avg:  3m 00s | Max:  3m 02s
      🟩 Clang16            Pass: 100%/4   | Total: 11m 02s | Avg:  2m 45s | Max:  2m 59s
      🟩 Clang17            Pass: 100%/2   | Total:  6m 11s | Avg:  3m 05s | Max:  3m 11s
      🟩 Clang18            Pass: 100%/2   | Total: 24m 54s | Avg: 12m 27s | Max: 21m 53s
      🟩 GCC9               Pass: 100%/2   | Total:  5m 55s | Avg:  2m 57s | Max:  3m 04s
      🟩 GCC10              Pass: 100%/4   | Total: 12m 27s | Avg:  3m 06s | Max:  3m 20s
      🟩 GCC11              Pass: 100%/4   | Total: 11m 41s | Avg:  2m 55s | Max:  3m 14s
      🟩 GCC12              Pass: 100%/7   | Total:  1h 27m | Avg: 12m 27s | Max: 34m 42s
      🟩 GCC13              Pass: 100%/3   | Total:  7m 27s | Avg:  2m 29s | Max:  2m 33s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  6m 42s | Avg:  6m 42s | Max:  6m 42s | Hits:  90%/123   
      🟩 MSVC14.39          Pass: 100%/1   | Total:  7m 16s | Avg:  7m 16s | Max:  7m 16s | Hits:  90%/123   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 10m 22s | Avg:  5m 11s | Max:  5m 11s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/30  | Total:  2h 27m | Avg:  4m 54s | Max: 39m 32s
      🟩 GCC                Pass: 100%/20  | Total:  2h 04m | Avg:  6m 14s | Max: 34m 42s
      🟩 MSVC               Pass: 100%/2   | Total: 13m 58s | Avg:  6m 59s | Max:  7m 16s | Hits:  90%/246   
      🟩 NVHPC              Pass: 100%/2   | Total: 10m 22s | Avg:  5m 11s | Max:  5m 11s
    🟩 gpu
      🟩 v100               Pass: 100%/54  | Total:  4h 56m | Avg:  5m 29s | Max: 39m 32s | Hits:  90%/246   
    🟩 jobs
      🟩 Build              Pass: 100%/49  | Total:  2h 40m | Avg:  3m 16s | Max:  7m 16s | Hits:  90%/246   
      🟩 Test               Pass: 100%/5   | Total:  2h 16m | Avg: 27m 14s | Max: 39m 32s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 46s | Avg:  2m 46s | Max:  2m 46s
      🟩 90a                Pass: 100%/1   | Total:  2m 33s | Avg:  2m 33s | Max:  2m 33s
    🟩 std
      🟩 17                 Pass: 100%/29  | Total:  2h 05m | Avg:  4m 18s | Max: 22m 19s
      🟩 20                 Pass: 100%/25  | Total:  2h 51m | Avg:  6m 51s | Max: 39m 32s | Hits:  90%/246   
    
  • 🟩 cccl: Pass: 100%/6 | Total: 26m 19s | Avg: 4m 23s | Max: 4m 55s

    🟩 cpu
      🟩 amd64              Pass: 100%/6   | Total: 26m 19s | Avg:  4m 23s | Max:  4m 55s
    🟩 ctk
      🟩 11.1               Pass: 100%/2   | Total:  7m 15s | Avg:  3m 37s | Max:  3m 49s
      🟩 12.0               Pass: 100%/2   | Total:  9m 27s | Avg:  4m 43s | Max:  4m 55s
      🟩 12.6               Pass: 100%/2   | Total:  9m 37s | Avg:  4m 48s | Max:  4m 54s
    🟩 cudacxx
      🟩 nvcc11.1           Pass: 100%/2   | Total:  7m 15s | Avg:  3m 37s | Max:  3m 49s
      🟩 nvcc12.0           Pass: 100%/2   | Total:  9m 27s | Avg:  4m 43s | Max:  4m 55s
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 37s | Avg:  4m 48s | Max:  4m 54s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/6   | Total: 26m 19s | Avg:  4m 23s | Max:  4m 55s
    🟩 cxx
      🟩 Clang9             Pass: 100%/1   | Total:  3m 49s | Avg:  3m 49s | Max:  3m 49s
      🟩 Clang14            Pass: 100%/1   | Total:  4m 55s | Avg:  4m 55s | Max:  4m 55s
      🟩 Clang18            Pass: 100%/1   | Total:  4m 54s | Avg:  4m 54s | Max:  4m 54s
      🟩 GCC6               Pass: 100%/1   | Total:  3m 26s | Avg:  3m 26s | Max:  3m 26s
      🟩 GCC12              Pass: 100%/1   | Total:  4m 32s | Avg:  4m 32s | Max:  4m 32s
      🟩 GCC13              Pass: 100%/1   | Total:  4m 43s | Avg:  4m 43s | Max:  4m 43s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/3   | Total: 13m 38s | Avg:  4m 32s | Max:  4m 55s
      🟩 GCC                Pass: 100%/3   | Total: 12m 41s | Avg:  4m 13s | Max:  4m 43s
    🟩 gpu
      🟩 v100               Pass: 100%/6   | Total: 26m 19s | Avg:  4m 23s | Max:  4m 55s
    🟩 jobs
      🟩 Infra              Pass: 100%/6   | Total: 26m 19s | Avg:  4m 23s | Max:  4m 55s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 10m 02s | Avg: 5m 01s | Max: 8m 05s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  8m 05s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  8m 05s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  8m 05s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  8m 05s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  8m 05s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  8m 05s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  8m 05s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  1m 57s | Avg:  1m 57s | Max:  1m 57s
      🟩 Test               Pass: 100%/1   | Total:  8m 05s | Avg:  8m 05s | Max:  8m 05s
    
  • 🟩 python: Pass: 100%/1 | Total: 15m 25s | Avg: 15m 25s | Max: 15m 25s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 15m 25s | Avg: 15m 25s | Max: 15m 25s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 15m 25s | Avg: 15m 25s | Max: 15m 25s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 15m 25s | Avg: 15m 25s | Max: 15m 25s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 15m 25s | Avg: 15m 25s | Max: 15m 25s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 15m 25s | Avg: 15m 25s | Max: 15m 25s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 15m 25s | Avg: 15m 25s | Max: 15m 25s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 15m 25s | Avg: 15m 25s | Max: 15m 25s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 15m 25s | Avg: 15m 25s | Max: 15m 25s
    

👃 Inspect Changes

Modifications in project?

Project
+/- CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
+/- CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 400)

# Runner
326 linux-amd64-cpu16
31 linux-amd64-gpu-v100-latest-1
28 linux-arm64-cpu16
15 windows-amd64-cpu16

@bernhardmgruber merged commit 1dd5bc7 into NVIDIA:main on Nov 15, 2024
418 checks passed
@bernhardmgruber deleted the bench_docs branch November 15, 2024 19:03
trxcllnt pushed a commit to trxcllnt/cccl that referenced this pull request Nov 23, 2024
* Document predefined benchmark typelists
* Show benchmark guide before tuning guide
* Extend CUB benchmark guide
* Rework and extend tuning section