PTX: Remove internal instructions #3583
Merged: bernhardmgruber merged 3 commits into NVIDIA:main from bernhardmgruber:ptx_remove_internal on Jan 30, 2025
bernhardmgruber (Contributor) commented on Jan 29, 2025 (edited)
- Merge after PTX: Update generated files with Blackwell instructions #3568
/ok to test
miscco approved these changes on Jan 29, 2025
🟨 CI finished in 3h 04m: Pass: 99%/152 | Total: 1d 05h | Avg: 11m 48s | Max: 1h 04m | Hits: 515%/21523
| | Project |
|---|---|
| | CCCL Infrastructure |
| +/- | libcu++ |
| | CUB |
| | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| +/- | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| +/- | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

🏃 Runner counts (total jobs: 152)

| # | Runner |
|---|---|
| 110 | linux-amd64-cpu16 |
| 17 | linux-amd64-gpu-v100-latest-1 |
| 14 | windows-amd64-cpu16 |
| 10 | linux-arm64-cpu16 |
| 1 | linux-amd64-gpu-h100-latest-1 |
barrier.cluster.aligned: Remove. This is not supposed to be exposed in CCCL.
elect.sync: Remove. Not ready for inclusion yet. This needs to handle the optional extra output mask as well.
mapa: Remove. This has compiler bugs. We should use intrinsics instead.
bernhardmgruber force-pushed the ptx_remove_internal branch from d82e96c to 092fdc6 on January 29, 2025 22:09
🟨 CI finished in 4h 45m: Pass: 98%/152 | Total: 3d 04h | Avg: 30m 16s | Max: 1h 19m | Hits: 411%/21523
| | Project |
|---|---|
| | CCCL Infrastructure |
| +/- | libcu++ |
| | CUB |
| | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| +/- | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| +/- | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

🏃 Runner counts (total jobs: 152)

| # | Runner |
|---|---|
| 110 | linux-amd64-cpu16 |
| 17 | linux-amd64-gpu-v100-latest-1 |
| 14 | windows-amd64-cpu16 |
| 10 | linux-arm64-cpu16 |
| 1 | linux-amd64-gpu-h100-latest-1 |
🟩 CI finished in 9h 55m: Pass: 100%/152 | Total: 3d 05h | Avg: 30m 32s | Max: 1h 19m | Hits: 411%/21523
| | Project |
|---|---|
| | CCCL Infrastructure |
| +/- | libcu++ |
| | CUB |
| | Thrust |
| | CUDA Experimental |
| | python |
| | CCCL C Parallel Library |
| | Catch2Helper |

Modifications in project or dependencies?

| | Project |
|---|---|
| | CCCL Infrastructure |
| +/- | libcu++ |
| +/- | CUB |
| +/- | Thrust |
| +/- | CUDA Experimental |
| +/- | python |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper |

🏃 Runner counts (total jobs: 152)

| # | Runner |
|---|---|
| 110 | linux-amd64-cpu16 |
| 17 | linux-amd64-gpu-v100-latest-1 |
| 14 | windows-amd64-cpu16 |
| 10 | linux-arm64-cpu16 |
| 1 | linux-amd64-gpu-h100-latest-1 |
Backport failed. Please cherry-pick the changes locally:
git fetch origin branch/2.8.x
git worktree add -d .worktree/backport-3583-to-branch/2.8.x origin/branch/2.8.x
cd .worktree/backport-3583-to-branch/2.8.x
git checkout -b backport-3583-to-branch/2.8.x
ancref=$(git merge-base d21e0c9804ad63d23950c8b0a2462e5b7ebc8701 092fdc691acbe3197e54791ffc21bb51d30598ac)
git cherry-pick -x $ancref..092fdc691acbe3197e54791ffc21bb51d30598ac
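For anyone unfamiliar with the `merge-base` plus ranged `cherry-pick` pattern used in the backport instructions, here is a minimal sketch in a throwaway repository. Everything in it is hypothetical: the branch names `release`, `feature`, and `backport`, the file names, and the assumption that the first hash passed to `git merge-base` is the commit the PR branched from while the second is the PR head. It only illustrates how the range `$ancref..<head>` selects every commit after the common ancestor.

```shell
#!/bin/sh
# Illustrative only: a scratch repository, not the CCCL repository.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Base commit shared by the release branch and the feature branch.
echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch release                 # hypothetical stand-in for the release branch
base_ref=$(git rev-parse HEAD)     # assumed role: commit the PR branched from

# Three commits on a feature branch, mirroring a multi-commit PR.
git checkout -q -b feature
for name in a b c; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "change $name"
done
head_ref=$(git rev-parse HEAD)     # assumed role: head commit of the PR

# Backport: find the common ancestor, then replay every commit after it.
git checkout -q -b backport release
ancref=$(git merge-base "$base_ref" "$head_ref")
git cherry-pick -x "$ancref..$head_ref" >/dev/null

backported=$(git rev-list --count release..backport)
echo "backported commits: $backported"
```

The `-x` flag appends a "cherry picked from commit ..." line to each replayed commit message, which keeps the backport traceable to the original commits. In the real repository the cherry-pick may stop on conflicts, which then have to be resolved by hand before running `git cherry-pick --continue`.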
bernhardmgruber added a commit that referenced this pull request on Jan 31, 2025
* barrier.cluster.aligned: Remove
  This is not supposed to be exposed in CCCL.
* elect.sync: Remove
  Not ready for inclusion yet. This needs to handle the optional extra output mask as well.
* mapa: Remove
  This has compiler bugs. We should use intrinsics instead.

Co-authored-by: Allard Hendriksen <[email protected]>