Feature request
As reshared by @FL33TW00D at https://x.com/fleetwood___/status/1894754562210165029, having subgroups support in transformers.js would be huge for performance.
I'm filing this feature request to start a conversation about how this can be achieved now that WebGPU subgroups have shipped in Chrome 134: https://developer.chrome.com/blog/new-in-webgpu-134#improve_machine-learning_workloads_with_subgroups
Note that some related work has also started in Apache TVM in apache/tvm#17699.
Motivation
Performance, performance, and performance.
Your contribution
I'd be happy to help answer questions about how subgroups are implemented in Chromium.
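For anyone following along, here is a rough sketch of what opting in to subgroups might look like on the JavaScript side. The `"subgroups"` feature name and the WGSL `enable subgroups;` directive are what shipped in Chrome 134; the helper `planSubgroupUsage` and the `AdapterLike` type are hypothetical names made up for illustration and are not part of Transformers.js or ONNX Runtime.

```typescript
// Hypothetical helper: decide whether to request the "subgroups" feature and
// which WGSL prelude to prepend. It only needs an object exposing
// `features.has()`, so it works with a real GPUAdapter or with a stub.
interface AdapterLike {
  features: { has(name: string): boolean };
}

interface SubgroupPlan {
  requiredFeatures: string[]; // pass to adapter.requestDevice({ requiredFeatures })
  wgslPrelude: string;        // prepend to shaders that use subgroup built-ins
}

function planSubgroupUsage(adapter: AdapterLike): SubgroupPlan {
  const supported = adapter.features.has("subgroups");
  return {
    requiredFeatures: supported ? ["subgroups"] : [],
    // Shaders calling subgroup built-ins must opt in with this directive.
    wgslPrelude: supported ? "enable subgroups;\n" : "",
  };
}

// Browser usage (not executed here, assumes WebGPU is available):
//   const adapter = await navigator.gpu.requestAdapter();
//   const plan = planSubgroupUsage(adapter);
//   const device = await adapter.requestDevice({
//     requiredFeatures: plan.requiredFeatures as GPUFeatureName[],
//   });
```

Keeping the decision in a small pure function makes it easy to fall back to a non-subgroup shader variant when the feature is missing.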
Exciting! 🚀 Let me loop in @guschmue to the discussion to see where we can add support for this 💪
@guschmue @xenova FWIW Apache TVM is currently adding support for subgroupShuffle(), subgroupShuffleUp(), and subgroupShuffleDown().
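To illustrate why the shuffle built-ins matter for reductions, here is a small CPU-side simulation of a `subgroupShuffleDown()`-style tree sum. This is only a sketch of the semantics, not GPU code and not how TVM implements it. The function names `subgroupShuffleDownSim` and `subgroupSumSim` are hypothetical, the lane count is assumed to be a power of two, and out-of-range lanes keep their own value here, whereas in WGSL the result for such lanes is indeterminate.

```typescript
// One array element per invocation (lane) in a simulated subgroup.
type Lanes = number[];

// Simulates subgroupShuffleDown(v, delta): each lane reads the value held by
// the lane `delta` positions above it.
// Assumption: lanes whose source is out of range keep their own value.
function subgroupShuffleDownSim(values: Lanes, delta: number): Lanes {
  return values.map((own, lane) => {
    const src = lane + delta;
    return src < values.length ? values[src] : own;
  });
}

// Tree reduction: halve the shuffle distance each step; lane 0 ends up
// holding the sum of all lanes. Assumes values.length is a power of two.
function subgroupSumSim(values: Lanes): number {
  let lanes = values.slice();
  for (let offset = lanes.length >> 1; offset > 0; offset >>= 1) {
    const shifted = subgroupShuffleDownSim(lanes, offset);
    lanes = lanes.map((v, i) => v + shifted[i]);
  }
  return lanes[0];
}
```

On a GPU each step is a register-to-register exchange inside the subgroup, which is why this pattern can avoid round trips through workgroup shared memory for sums like those in LayerNorm or softmax.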
@guschmue Did you have a chance to figure out where it makes sense to add WebGPU subgroups support to Transformers.js?
@guschmue Another consideration is the eventual switch over to the native WebGPU EP - perhaps we can align efforts on that front?
Also, cc @FL33TW00D it could be great to integrate your work on optimizing LayerNorm w/ subgroups (https://fleetwood.dev/posts/layernorm-as-fast-as-possible) here. 👀 What do you think?
FYI: According to microsoft/onnxruntime@8eb5513, ONNX Runtime sees a 3x perf increase on Metal with subgroup matrices.