improve fp8 blockwise gemm perf #2784
Conversation
Helpful links: artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2784. (Links to docs will display an error until the docs builds have completed.)
CI status as of commit c288547 (merge base 1526dfe): 1 new failure, 1 cancelled job (please retry the cancelled job).
stack-info: PR: #2784, branch: danielvegamyhre/stack/43
Force-pushed: a42133f → 5d45f01 → da736d3 → c288547
@vkuzo I'm OOO Monday and Tuesday, so let's cancel our meeting this week. (I'm not sure whether meetings are auto-declined for PTO, so I wanted to reach out to make sure.)
Confirmed the test failure is unrelated to this change, and talked with the feature owner about fixing it.
Stacked PRs:
improve fp8 blockwise gemm perf
Summary
This PR improves the performance of the fp8 blockwise gemms used in linear layers. The AxB_CxD suffix below denotes the scaling-group shape of each operand: 1x128 groups for activations and gradients, 128x128 tiles for weights.

fp8 blockwise 1x128_128x128 gemm:
output = input @ weight.t()
grad_input = grad_output @ weight

fp8 blockwise 1x128_128x1 gemm:
grad_weight = grad_output_t @ input
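For readers unfamiliar with the scaling layout, the forward gemm above can be sketched numerically. The NumPy snippet below is an illustrative emulation, not code from this PR: it computes per-1x128-group scales for the input and per-128x128-tile scales for the weight, applies fake fp8 quantization (round + clip), and uses a plain fp32 matmul in place of the fused fp8 kernel. All dimensions, the `FP8_MAX` constant, and the helper names are assumptions for the sketch.

```python
import numpy as np

# Hypothetical linear-layer shapes: input (M, K), weight (N, K).
M, N, K = 256, 512, 1024
BLOCK = 128
FP8_MAX = 448.0  # max magnitude representable in float8_e4m3

rng = np.random.default_rng(0)
inp = rng.standard_normal((M, K)).astype(np.float32)
weight = rng.standard_normal((N, K)).astype(np.float32)

def scales_1x128(x):
    """One scale per 1x128 group along the last dim (activation/grad layout)."""
    r, c = x.shape
    groups = x.reshape(r, c // BLOCK, BLOCK)
    return np.abs(groups).max(axis=-1) / FP8_MAX  # shape (r, c // BLOCK)

def scales_128x128(x):
    """One scale per 128x128 tile (weight layout)."""
    r, c = x.shape
    tiles = x.reshape(r // BLOCK, BLOCK, c // BLOCK, BLOCK)
    return np.abs(tiles).max(axis=(1, 3)) / FP8_MAX  # shape (r//BLOCK, c//BLOCK)

def fake_quant_1x128(x, s):
    """Round/clip each 1x128 group to the fp8 range, then dequantize."""
    g = x.reshape(x.shape[0], -1, BLOCK)
    q = np.clip(np.rint(g / s[..., None]), -FP8_MAX, FP8_MAX)
    return (q * s[..., None]).reshape(x.shape)

a_scale = scales_1x128(inp)       # (256, 8): one scale per row per 128-wide group
b_scale = scales_128x128(weight)  # (4, 8): one scale per 128x128 tile

# Emulated forward gemm: output = input @ weight.t()
out = fake_quant_1x128(inp, a_scale) @ weight.T  # (256, 512)
```

The same scale-shape logic extends to the backward gemms: grad_input reuses the 128x128 weight tiles, while grad_weight uses per-group scales on both operands (the 1x128_128x1 case).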