[STF] stackable stf resources #2674
base: main
1a454b0 to dcb75d2
/ok to test
🟩 CI finished in 23m 04s: Pass: 100%/54 | Total: 4h 36m | Avg: 5m 06s | Max: 17m 44s | Hits: 89%/224
Modifications in project or dependencies?
Project | Modified |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
Thrust | |
CUDA Experimental | +/- |
python | |
CCCL C Parallel Library | |
Catch2Helper | |
🏃 Runner counts (total jobs: 54)
# | Runner |
---|---|
43 | linux-amd64-cpu16 |
5 | linux-amd64-gpu-v100-latest-1 |
4 | linux-arm64-cpu16 |
2 | windows-amd64-cpu16 |
/ok to test
🟩 CI finished in 1h 12m: Pass: 100%/54 | Total: 4h 28m | Avg: 4m 58s | Max: 23m 23s | Hits: 89%/224
Modifications in project or dependencies?
Project | Modified |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
Thrust | |
CUDA Experimental | +/- |
python | |
CCCL C Parallel Library | |
Catch2Helper | |
🏃 Runner counts (total jobs: 54)
# | Runner |
---|---|
43 | linux-amd64-cpu16 |
5 | linux-amd64-gpu-v100-latest-1 |
4 | linux-arm64-cpu16 |
2 | windows-amd64-cpu16 |
cudax/include/cuda/experimental/__stf/utility/stackable_ctx.cuh (three outdated review threads, resolved)
ff0ba38 to fb3de98
 * This is a helper routine which can be used to launch graphs, for example. Using the stream after finalize()
 * results in undefined behavior.
 */
auto pick_dstream()
maybe we do not need to expose that
/ok to test
🟩 CI finished in 40m 53s: Pass: 100%/20 | Total: 3h 17m | Avg: 9m 53s | Max: 24m 35s | Hits: 582%/312
Modifications in project or dependencies?
Project | Modified |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
Thrust | |
CUDA Experimental | +/- |
python | |
CCCL C Parallel Library | |
Catch2Helper | |
🏃 Runner counts (total jobs: 20)
# | Runner |
---|---|
12 | linux-amd64-cpu16 |
4 | linux-arm64-cpu16 |
2 | windows-amd64-cpu16 |
2 | linux-amd64-gpu-v100-latest-1 |
…ethod of the context is called (if logical data's pop() was not called explicitly).
/ok to test
🟩 CI finished in 42m 04s: Pass: 100%/20 | Total: 4h 07m | Avg: 12m 22s | Max: 22m 12s | Hits: 582%/312
Modifications in project or dependencies?
Project | Modified |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
Thrust | |
CUDA Experimental | +/- |
python | |
CCCL C Parallel Library | |
Catch2Helper | |
🏃 Runner counts (total jobs: 20)
# | Runner |
---|---|
12 | linux-amd64-cpu16 |
4 | linux-arm64-cpu16 |
2 | windows-amd64-cpu16 |
2 | linux-amd64-gpu-v100-latest-1 |
…in the wrong directory, so we record the full path so that it works at runtime
…ut we still generate the output accordingly
…and more comments
/ok to test
🟨 CI finished in 22m 20s: Pass: 9%/22 | Total: 1h 20m | Avg: 3m 38s | Max: 9m 51s | Hits: 95%/564
Modifications in project or dependencies?
Project | Modified |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
Thrust | |
CUDA Experimental | +/- |
stdpar | |
python | |
CCCL C Parallel Library | |
Catch2Helper | |
🏃 Runner counts (total jobs: 22)
# | Runner |
---|---|
13 | linux-amd64-cpu16 |
4 | linux-arm64-cpu16 |
2 | windows-amd64-cpu16 |
2 | linux-amd64-gpu-rtx2080-latest-1 |
1 | linux-amd64-gpu-h100-latest-1 |
/ok to test
🟨 CI finished in 33m 29s: Pass: 18%/22 | Total: 5h 03m | Avg: 13m 46s | Max: 33m 26s | Hits: 68%/1762
Modifications in project or dependencies?
Project | Modified |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
Thrust | |
CUDA Experimental | +/- |
stdpar | |
python | |
CCCL C Parallel Library | |
Catch2Helper | |
🏃 Runner counts (total jobs: 22)
# | Runner |
---|---|
13 | linux-amd64-cpu16 |
4 | linux-arm64-cpu16 |
2 | windows-amd64-cpu16 |
2 | linux-amd64-gpu-rtx2080-latest-1 |
1 | linux-amd64-gpu-h100-latest-1 |
/ok to test
🟨 CI finished in 1h 11m: Pass: 86%/22 | Total: 5h 51m | Avg: 15m 58s | Max: 31m 41s | Hits: 60%/10781
Modifications in project or dependencies?
Project | Modified |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
Thrust | |
CUDA Experimental | +/- |
stdpar | |
python | |
CCCL C Parallel Library | |
Catch2Helper | |
🏃 Runner counts (total jobs: 22)
# | Runner |
---|---|
13 | linux-amd64-cpu16 |
4 | linux-arm64-cpu16 |
2 | windows-amd64-cpu16 |
2 | linux-amd64-gpu-rtx2080-latest-1 |
1 | linux-amd64-gpu-h100-latest-1 |
This is a separate PR
@@ -947,11 +959,13 @@ public:
  }

  template <typename T>
  frozen_logical_data<T> freeze(cuda::experimental::stf::logical_data<T> d,
separate PR for freeze improvement
Description
This PR introduces helper methods that improve how contexts are nested, so that nested work can better leverage CUDA Graphs.
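To make the nesting idea concrete, here is a minimal toy model in plain C++. It is not the cudax/STF API and involves no CUDA at all: `toy_stackable_ctx`, `push_data`, `pop_data`, and `demo` are hypothetical names chosen for illustration. It only sketches one behavior hinted at in this conversation, under the assumption that data pushed into a nested level and not popped explicitly is released when that level of the context is popped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model only: NOT the cudax/STF API. Every name here is hypothetical.
class toy_stackable_ctx
{
public:
  // Enter a new nested level.
  void push() { levels_.emplace_back(); }

  // Make a named piece of data visible in the innermost level.
  void push_data(const std::string& name)
  {
    assert(!levels_.empty());
    levels_.back().push_back(name);
  }

  // Explicitly release one piece of data from the innermost level.
  // Returns true if the name was found at that level.
  bool pop_data(const std::string& name)
  {
    assert(!levels_.empty());
    auto& top = levels_.back();
    for (auto it = top.begin(); it != top.end(); ++it)
    {
      if (*it == name)
      {
        top.erase(it);
        return true;
      }
    }
    return false;
  }

  // Leave the innermost level. Data that was not popped explicitly is
  // released here; the names released this way are returned.
  std::vector<std::string> pop()
  {
    assert(!levels_.empty());
    std::vector<std::string> released = std::move(levels_.back());
    levels_.pop_back();
    return released;
  }

  std::size_t depth() const { return levels_.size(); }

private:
  std::vector<std::vector<std::string>> levels_;
};

// Two nested levels: "B" is popped explicitly, "C" is left to the context.
std::vector<std::string> demo()
{
  toy_stackable_ctx ctx;
  ctx.push();          // level 1
  ctx.push_data("A");
  ctx.push();          // level 2
  ctx.push_data("B");
  ctx.push_data("C");
  ctx.pop_data("B");   // explicit pop of "B"
  std::vector<std::string> auto_released = ctx.pop(); // leaves level 2
  ctx.pop();           // leaves level 1, releasing "A"
  return auto_released;
}
```

Calling `demo()` returns the names that the inner level released on its own when it was popped. How the actual helpers in this PR map nested levels onto CUDA Graphs is not shown in this conversation, so the sketch deliberately leaves that out.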
Checklist