Implement some CUDA API calls for async_memory_pool #2455

Merged: miscco merged 1 commit into NVIDIA:main from extend_cuda_mempool_api on Oct 1, 2024

@miscco miscco commented Sep 25, 2024

Those API calls are quite useful, so we should expose them as part of the interface.

@miscco miscco requested a review from a team as a code owner September 25, 2024 07:22

🟩 CI finished in 12m 28s: Pass: 100%/52 | Total: 2h 43m | Avg: 3m 08s | Max: 11m 07s | Hits: 82%/222
  • 🟩 cudax: Pass: 100%/52 | Total: 2h 43m | Avg: 3m 08s | Max: 11m 07s | Hits: 82%/222

    🟩 cpu
      🟩 amd64              Pass: 100%/48  | Total:  2h 33m | Avg:  3m 11s | Max: 11m 07s | Hits:  82%/222   
      🟩 arm64              Pass: 100%/4   | Total: 10m 28s | Avg:  2m 37s | Max:  3m 19s
    🟩 ctk
      🟩 12.0               Pass: 100%/19  | Total:  1h 01m | Avg:  3m 15s | Max: 10m 02s | Hits:  82%/111   
      🟩 12.6               Pass: 100%/33  | Total:  1h 41m | Avg:  3m 05s | Max: 11m 07s | Hits:  82%/111   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/19  | Total:  1h 01m | Avg:  3m 15s | Max: 10m 02s | Hits:  82%/111   
      🟩 nvcc12.6           Pass: 100%/33  | Total:  1h 41m | Avg:  3m 05s | Max: 11m 07s | Hits:  82%/111   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/52  | Total:  2h 43m | Avg:  3m 08s | Max: 11m 07s | Hits:  82%/222   
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  5m 33s | Avg:  2m 46s | Max:  2m 50s
      🟩 Clang10            Pass: 100%/2   | Total:  6m 26s | Avg:  3m 13s | Max:  3m 25s
      🟩 Clang11            Pass: 100%/4   | Total: 11m 28s | Avg:  2m 52s | Max:  3m 13s
      🟩 Clang12            Pass: 100%/4   | Total: 10m 55s | Avg:  2m 43s | Max:  2m 51s
      🟩 Clang13            Pass: 100%/4   | Total: 10m 45s | Avg:  2m 41s | Max:  2m 48s
      🟩 Clang14            Pass: 100%/4   | Total: 12m 36s | Avg:  3m 09s | Max:  4m 35s
      🟩 Clang15            Pass: 100%/2   | Total:  5m 21s | Avg:  2m 40s | Max:  2m 42s
      🟩 Clang16            Pass: 100%/4   | Total: 11m 06s | Avg:  2m 46s | Max:  3m 19s
      🟩 Clang17            Pass: 100%/2   | Total:  5m 39s | Avg:  2m 49s | Max:  2m 52s
      🟩 Clang18            Pass: 100%/2   | Total:  6m 58s | Avg:  3m 29s | Max:  4m 09s
      🟩 GCC9               Pass: 100%/2   | Total:  5m 31s | Avg:  2m 45s | Max:  2m 47s
      🟩 GCC10              Pass: 100%/4   | Total: 10m 43s | Avg:  2m 40s | Max:  2m 53s
      🟩 GCC11              Pass: 100%/4   | Total:  9m 58s | Avg:  2m 29s | Max:  2m 32s
      🟩 GCC12              Pass: 100%/7   | Total: 22m 27s | Avg:  3m 12s | Max:  4m 01s
      🟩 GCC13              Pass: 100%/3   | Total:  7m 08s | Avg:  2m 22s | Max:  2m 28s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 10m 02s | Avg: 10m 02s | Max: 10m 02s | Hits:  82%/111   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 07s | Avg: 11m 07s | Max: 11m 07s | Hits:  82%/111   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/30  | Total:  1h 26m | Avg:  2m 53s | Max:  4m 35s
      🟩 GCC                Pass: 100%/20  | Total: 55m 47s | Avg:  2m 47s | Max:  4m 01s
      🟩 MSVC               Pass: 100%/2   | Total: 21m 09s | Avg: 10m 34s | Max: 11m 07s | Hits:  82%/222   
    🟩 gpu
      🟩 v100               Pass: 100%/52  | Total:  2h 43m | Avg:  3m 08s | Max: 11m 07s | Hits:  82%/222   
    🟩 jobs
      🟩 Build              Pass: 100%/47  | Total:  2h 23m | Avg:  3m 03s | Max: 11m 07s | Hits:  82%/222   
      🟩 Test               Pass: 100%/5   | Total: 20m 10s | Avg:  4m 02s | Max:  4m 35s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 17s | Avg:  2m 17s | Max:  2m 17s
      🟩 90a                Pass: 100%/1   | Total:  2m 28s | Avg:  2m 28s | Max:  2m 28s
    🟩 std
      🟩 17                 Pass: 100%/28  | Total:  1h 17m | Avg:  2m 46s | Max:  4m 01s
      🟩 20                 Pass: 100%/24  | Total:  1h 26m | Avg:  3m 35s | Max: 11m 07s | Hits:  82%/222   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
pycuda
CUDA C Core Library

🏃‍ Runner counts (total jobs: 52)

# Runner
41 linux-amd64-cpu16
5 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

@miscco miscco force-pushed the extend_cuda_mempool_api branch from b88fc90 to de97fff Compare September 25, 2024 07:37
@miscco miscco requested a review from pciolkosz September 25, 2024 07:37
@miscco miscco added the "feature request" (New feature or request.) and "CUDA Next" (Feature intended for the Cuda Next experimental library) labels Sep 25, 2024
@miscco miscco force-pushed the extend_cuda_mempool_api branch from de97fff to ceb3f5c Compare September 25, 2024 07:48
🟩 CI finished in 12m 12s: Pass: 100%/52 | Total: 2h 19m | Avg: 2m 41s | Max: 10m 55s | Hits: 82%/222
  • 🟩 cudax: Pass: 100%/52 | Total: 2h 19m | Avg: 2m 41s | Max: 10m 55s | Hits: 82%/222

    🟩 cpu
      🟩 amd64              Pass: 100%/48  | Total:  2h 11m | Avg:  2m 43s | Max: 10m 55s | Hits:  82%/222   
      🟩 arm64              Pass: 100%/4   | Total:  8m 37s | Avg:  2m 09s | Max:  3m 26s
    🟩 ctk
      🟩 12.0               Pass: 100%/19  | Total: 52m 23s | Avg:  2m 45s | Max: 10m 55s | Hits:  82%/111   
      🟩 12.6               Pass: 100%/33  | Total:  1h 27m | Avg:  2m 38s | Max: 10m 37s | Hits:  82%/111   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/19  | Total: 52m 23s | Avg:  2m 45s | Max: 10m 55s | Hits:  82%/111   
      🟩 nvcc12.6           Pass: 100%/33  | Total:  1h 27m | Avg:  2m 38s | Max: 10m 37s | Hits:  82%/111   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/52  | Total:  2h 19m | Avg:  2m 41s | Max: 10m 55s | Hits:  82%/222   
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  4m 37s | Avg:  2m 18s | Max:  2m 25s
      🟩 Clang10            Pass: 100%/2   | Total:  4m 29s | Avg:  2m 14s | Max:  2m 20s
      🟩 Clang11            Pass: 100%/4   | Total:  9m 09s | Avg:  2m 17s | Max:  2m 27s
      🟩 Clang12            Pass: 100%/4   | Total:  8m 37s | Avg:  2m 09s | Max:  2m 10s
      🟩 Clang13            Pass: 100%/4   | Total:  8m 58s | Avg:  2m 14s | Max:  2m 26s
      🟩 Clang14            Pass: 100%/4   | Total: 10m 26s | Avg:  2m 36s | Max:  3m 52s
      🟩 Clang15            Pass: 100%/2   | Total:  5m 05s | Avg:  2m 32s | Max:  2m 41s
      🟩 Clang16            Pass: 100%/4   | Total:  9m 47s | Avg:  2m 26s | Max:  3m 26s
      🟩 Clang17            Pass: 100%/2   | Total:  4m 30s | Avg:  2m 15s | Max:  2m 16s
      🟩 Clang18            Pass: 100%/2   | Total:  6m 13s | Avg:  3m 06s | Max:  3m 54s
      🟩 GCC9               Pass: 100%/2   | Total:  4m 10s | Avg:  2m 05s | Max:  2m 11s
      🟩 GCC10              Pass: 100%/4   | Total:  8m 16s | Avg:  2m 04s | Max:  2m 09s
      🟩 GCC11              Pass: 100%/4   | Total:  8m 24s | Avg:  2m 06s | Max:  2m 18s
      🟩 GCC12              Pass: 100%/7   | Total: 20m 05s | Avg:  2m 52s | Max:  3m 43s
      🟩 GCC13              Pass: 100%/3   | Total:  5m 19s | Avg:  1m 46s | Max:  2m 01s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 10m 55s | Avg: 10m 55s | Max: 10m 55s | Hits:  82%/111   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 37s | Avg: 10m 37s | Max: 10m 37s | Hits:  82%/111   
    🟩 cxx_family
      🟩 Clang              Pass: 100%/30  | Total:  1h 11m | Avg:  2m 23s | Max:  3m 54s
      🟩 GCC                Pass: 100%/20  | Total: 46m 14s | Avg:  2m 18s | Max:  3m 43s
      🟩 MSVC               Pass: 100%/2   | Total: 21m 32s | Avg: 10m 46s | Max: 10m 55s | Hits:  82%/222   
    🟩 gpu
      🟩 v100               Pass: 100%/52  | Total:  2h 19m | Avg:  2m 41s | Max: 10m 55s | Hits:  82%/222   
    🟩 jobs
      🟩 Build              Pass: 100%/47  | Total:  2h 00m | Avg:  2m 34s | Max: 10m 55s | Hits:  82%/222   
      🟩 Test               Pass: 100%/5   | Total: 18m 41s | Avg:  3m 44s | Max:  3m 54s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 07s | Avg:  2m 07s | Max:  2m 07s
      🟩 90a                Pass: 100%/1   | Total:  2m 01s | Avg:  2m 01s | Max:  2m 01s
    🟩 std
      🟩 17                 Pass: 100%/28  | Total:  1h 03m | Avg:  2m 16s | Max:  3m 37s
      🟩 20                 Pass: 100%/24  | Total:  1h 16m | Avg:  3m 10s | Max: 10m 55s | Hits:  82%/222   
    


@pciolkosz pciolkosz left a comment

I think we would like to have strongly typed pool attributes like with device attributes, but I also don't think it's a high priority right now.
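
A minimal sketch of what strongly typed pool attributes could look like. It uses a mock pool instead of the real cudax or CUDA runtime API, and every name in it (`pool_attr_id`, `pool_attribute`, `mock_pool`, the three attribute aliases) is hypothetical. The value types and the read-only flag are assumptions chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the runtime's attribute enum.
enum class pool_attr_id
{
  reuse_follow_event_dependencies,
  release_threshold,
  reserved_mem_current
};

// Each attribute is a tag type that carries its value type and whether
// it may be set. Both properties are assumptions for this sketch.
template <pool_attr_id Id, class T, bool Settable>
struct pool_attribute
{
  static constexpr pool_attr_id id = Id;
  static constexpr bool settable   = Settable;
  using type                       = T;
};

using reuse_follow_event_dependencies_t =
  pool_attribute<pool_attr_id::reuse_follow_event_dependencies, bool, true>;
using release_threshold_t =
  pool_attribute<pool_attr_id::release_threshold, std::uint64_t, true>;
using reserved_mem_current_t =
  pool_attribute<pool_attr_id::reserved_mem_current, std::uint64_t, false>;

// Mock pool that stores every attribute as a raw 64-bit value.
class mock_pool
{
  std::map<pool_attr_id, std::uint64_t> values_;

public:
  // The return type is deduced from the attribute tag.
  template <class Attr>
  typename Attr::type get(Attr) const
  {
    const pool_attr_id key = Attr::id;
    const auto it          = values_.find(key);
    return it == values_.end() ? typename Attr::type{}
                               : static_cast<typename Attr::type>(it->second);
  }

  // Setting a read-only attribute is rejected at compile time.
  template <class Attr>
  void set(Attr, typename Attr::type value)
  {
    static_assert(Attr::settable, "this attribute is read-only");
    const pool_attr_id key = Attr::id;
    values_[key]           = static_cast<std::uint64_t>(value);
  }
};
```

With tags like these, `pool.set(release_threshold_t{}, 1024)` type-checks the value, while `pool.set(reserved_mem_current_t{}, 0)` would fail to compile.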

{
cudax::mr::async_memory_pool pool{current_device};

{ // cudaMemPoolReuseFollowEventDependencies

Could this repeated test be a template (even a lambda)?

Collaborator Author

The issue is that some attributes do not have setters, and some need an allocation while others don't.
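
One way a shared test helper could still absorb those differences is to make the setter and the preparatory allocation optional. The sketch below is not taken from this PR: `attribute_case` and `round_trips` are hypothetical names, and plain callables stand in for the pool so the example stays self-contained.

```cpp
#include <cassert>
#include <functional>

// Hypothetical description of one attribute test.
template <class T>
struct attribute_case
{
  std::function<T()> get;        // always required
  std::function<void(T)> set;    // left empty for attributes without a setter
  std::function<void()> prepare; // left empty when no allocation is needed
  T new_value;                   // value written when a setter exists
};

// Runs the optional preparation step, then either checks that a read-only
// attribute reads back consistently or that a settable one round-trips.
template <class T>
bool round_trips(const attribute_case<T>& c)
{
  if (c.prepare)
  {
    c.prepare();
  }
  const T before = c.get();
  if (!c.set)
  {
    return c.get() == before;
  }
  c.set(c.new_value);
  return c.get() == c.new_value;
}
```

Whether this is clearer than the repeated blocks is a judgment call, since each case still has to say which of the optional steps applies.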

@miscco miscco merged commit 808f9c2 into NVIDIA:main Oct 1, 2024
68 checks passed
@miscco miscco deleted the extend_cuda_mempool_api branch October 1, 2024 08:45
fbusato pushed a commit to fbusato/cccl that referenced this pull request Oct 2, 2024
Labels
CUDA Next: Feature intended for the Cuda Next experimental library
feature request: New feature or request.
Projects
Archived in project

2 participants