[CUDAX] Introduce pinned memory pool and move pinned memory resource to use it on new CUDA versions #3975
/ok to test
🟥 CI finished in 25m 26s: Pass: 0%/22 | Total: 55m 39s | Avg: 2m 31s | Max: 8m 15s
| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| | CUB |
| | Thrust |
| +/- | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| | libcu++ |
| | CUB |
| | Thrust |
| +/- | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

🏃 Runner counts (total jobs: 22)
# | Runner |
---|---|
13 | linux-amd64-cpu16 |
4 | linux-arm64-cpu16 |
2 | windows-amd64-cpu16 |
2 | linux-amd64-gpu-rtx2080-latest-1 |
1 | linux-amd64-gpu-h100-latest-1 |
/ok to test
🟨 CI finished in 22m 53s: Pass: 77%/22 | Total: 2h 24m | Avg: 6m 35s | Max: 14m 26s | Hits: 86%/9499
Cursory glance
@@ -219,9 +219,9 @@ try

   printf("Enabling peer access between GPU%d and GPU%d...\n", peers[0].get(), peers[1].get());
   cudax::device_memory_resource dev0_resource(peers[0]);
-  dev0_resource.enable_peer_access_from(peers[1]);
+  dev0_resource.enable_access_from(peers[1]);
Can we move that rename into a separate PR?
inline all_devices::operator ::std::vector<device_ref>() const
{
  return ::std::vector<device_ref>(begin(), end());
}
Is there any benefit to not defining these functions inline?
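For comparison, here is a minimal sketch of the in-class alternative the question points at. The types `device_ref_sketch` and `all_devices_sketch` are hypothetical stand-ins, not the real `cudax` classes, and the fixed two-element device list is an assumption made only to keep the example self-contained.

```cpp
#include <vector>

// Hypothetical stand-in for cudax::device_ref, for illustration only.
struct device_ref_sketch
{
  int id;
};

// Hypothetical stand-in for cudax::all_devices with a fixed list of two devices.
class all_devices_sketch
{
  device_ref_sketch devices_[2] = {{0}, {1}};

public:
  const device_ref_sketch* begin() const
  {
    return devices_;
  }

  const device_ref_sketch* end() const
  {
    return devices_ + 2;
  }

  // Defined inside the class body, so it is implicitly inline and needs no
  // separate out-of-class `inline` definition further down in the header.
  operator std::vector<device_ref_sketch>() const
  {
    return std::vector<device_ref_sketch>(begin(), end());
  }
};
```

An out-of-class definition is typically only required when a type used in the body is still incomplete at the point where the class is defined. Whether that applies to `device_ref` here is not visible from the hunk.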
//! @param __device_id The id of the device for which to query support.
//! @throws cuda_error if \c cudaDeviceGetAttribute failed.
//! @returns true if \c cudaDevAttrMemoryPoolsSupported is not zero.
inline void __device_supports_stream_ordered_allocations(const int __device_id)
I believe I came up with that name, but it's bad, because this function does not return anything.
It should rather be prefixed with something like __check or __verify.
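To make the naming point concrete, here is a minimal sketch that separates a predicate, which returns the answer, from a check, which returns nothing and throws. All names are hypothetical, and `query_memory_pools_supported` is a made-up stand-in for the `cudaDeviceGetAttribute` query so that the example builds without the CUDA toolkit.

```cpp
#include <stdexcept>

// Hypothetical stand-in for querying cudaDevAttrMemoryPoolsSupported.
// Assumption for this sketch only: even device ids report support, odd ids do not.
inline int query_memory_pools_supported(int device_id)
{
  if (device_id < 0)
  {
    throw std::runtime_error("invalid device id");
  }
  return (device_id % 2 == 0) ? 1 : 0;
}

// Predicate: the "supports" name fits because the result is returned to the caller.
inline bool device_supports_stream_ordered_allocations(int device_id)
{
  return query_memory_pools_supported(device_id) != 0;
}

// Check: returns nothing and throws on failure, so a "verify" prefix describes it better.
inline void verify_device_supports_stream_ordered_allocations(int device_id)
{
  if (!device_supports_stream_ordered_allocations(device_id))
  {
    throw std::runtime_error("device does not support stream-ordered allocations");
  }
}
```

With this split, the `@returns` line in the doc comment belongs on the predicate, while the void check would only document `@throws`.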
// Construct on NUMA node 0 only for now
__pool_properties.location.type = ::cudaMemLocationTypeHostNuma;
__pool_properties.location.id = __id;
#else
Please add comments to the conditional compilations
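One way to annotate the branches is sketched below. The macro `SKETCH_HAS_HOST_NUMA_POOLS`, the types, and the fallback branch are all hypothetical, since the real condition and the `#else` body are not visible in the hunk above. The arrow-style trailing comments are just one possible commenting style.

```cpp
// Hypothetical feature macro; the real condition is not shown in the hunk.
#ifndef SKETCH_HAS_HOST_NUMA_POOLS
#  define SKETCH_HAS_HOST_NUMA_POOLS 1
#endif // !SKETCH_HAS_HOST_NUMA_POOLS

// Hypothetical stand-ins for the CUDA location enum and struct.
enum class location_type_sketch
{
  host_numa,
  host
};

struct pool_location_sketch
{
  location_type_sketch type;
  int id;
};

inline pool_location_sketch make_pinned_pool_location(int numa_id)
{
  pool_location_sketch location{};
#if SKETCH_HAS_HOST_NUMA_POOLS
  // Pool is bound to a specific host NUMA node.
  location.type = location_type_sketch::host_numa;
  location.id   = numa_id;
#else // ^^^ SKETCH_HAS_HOST_NUMA_POOLS ^^^ / vvv !SKETCH_HAS_HOST_NUMA_POOLS vvv
  // Assumed fallback for this sketch: plain host location, NUMA id ignored.
  (void) numa_id;
  location.type = location_type_sketch::host;
  location.id   = 0;
#endif // SKETCH_HAS_HOST_NUMA_POOLS
  return location;
}
```

The trailing comments on `#else` and `#endif` name the condition each branch belongs to, which helps once the guarded regions grow beyond a few lines.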
Draft, todo description