graph: backend: dnnl: not use decomp sdpa kernel for dynamic quantized cases #2458

Open
wants to merge 1 commit into
base: main

wzt1997
Contributor

@wzt1997 wzt1997 commented Jan 21, 2025

Description

We recently added support for an SDPA case with compressed K using per-channel quantization. With the THREADPOOL CPU runtime, this case unexpectedly runs into the decomposed SDPA primitive kernel, because it passes that kernel's check.
Since we do not aim to support dynamically quantized cases in the decomposed SDPA kernel, this PR tightens the restriction to exclude them.
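
For illustration only, the tightened restriction can be pictured as the sketch below. It is not the code in this PR: `op_kind`, `has_dynamic_quantization`, and `can_use_decomp_sdpa` are hypothetical names, and the pattern is modeled as a flat list of op kinds rather than the backend's real subgraph types. The assumption is that a dynamically quantized case can be recognized by the presence of a dynamic dequantize op in the matched pattern.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only; these are not
// the oneDNN graph backend's internal types.
enum class op_kind { matmul, softmax, divide, add, dequantize, dynamic_dequantize };

// Returns true if the pattern contains a dynamically quantized input,
// modeled here as the presence of a dynamic_dequantize op.
bool has_dynamic_quantization(const std::vector<op_kind> &ops) {
    return std::any_of(ops.begin(), ops.end(),
            [](op_kind k) { return k == op_kind::dynamic_dequantize; });
}

// Decides whether the decomposed SDPA kernel may be selected.
// `passes_existing_checks` stands in for whatever conditions the kernel
// already verified before this change (assumed, not reproduced here).
bool can_use_decomp_sdpa(
        const std::vector<op_kind> &ops, bool passes_existing_checks) {
    if (!passes_existing_checks) return false;
    // Added restriction: dynamically quantized cases are rejected so that
    // another kernel handles them.
    if (has_dynamic_quantization(ops)) return false;
    return true;
}
```

Under these assumptions, a statically quantized pattern such as `{dequantize, matmul, softmax, matmul}` is still accepted, while replacing `dequantize` with `dynamic_dequantize` makes the check fail and leaves the case to a different kernel.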

@wzt1997 wzt1997 added the component:graph-api Codeowner: @oneapi-src/onednn-graph label Jan 21, 2025
@wzt1997 wzt1997 self-assigned this Jan 21, 2025
@wzt1997 wzt1997 requested a review from a team as a code owner January 21, 2025 06:14
@wzt1997 wzt1997 force-pushed the zhitao/fix-decomp-sdp-with-dynamic-dequant branch from ff023d7 to 8855900 Compare January 21, 2025 06:26
@wzt1997
Contributor Author

wzt1997 commented Jan 21, 2025

make test
set test_scope=NIGHTLY
disable benchdnn_all
enable benchdnn_graph
enable build_cpu_runtime_thrpool_eigen

Labels
component:graph-api Codeowner: @oneapi-src/onednn-graph
5 participants