graph: backend: dnnl: avoid decomp sdpa kernel for dynamic quantization
wzt1997 committed Jan 21, 2025
1 parent 22037c4 commit 8855900
Showing 1 changed file with 2 additions and 5 deletions.
7 changes: 2 additions & 5 deletions src/graph/backend/dnnl/kernels/sdp_decomp_config.cpp
@@ -457,12 +457,9 @@ impl::status_t sdp_decomp_config_t::record_input_offset(
             graph::op_kind::SoftMax};
     for (const auto &cur_op : sg->get_ops()) {
         const auto &op_kind = cur_op->get_kind();
-        VCHECK_SDP_DECOMP(
-                !(op_kind == graph::op_kind::DynamicDequantize
-                        && cur_op->get_attr<std::string>(op_attr::qtype)
-                                == "per_group"),
+        VCHECK_SDP_DECOMP(op_kind != graph::op_kind::DynamicDequantize,
                 status::unimplemented,
-                "Not support per_group DynamicDequantize");
+                "Decomposed kernel does not support dynamic quantization");
         // both mm1 and mm2 are found.
         if (mm1 && mm2) break;
         if (op_kind != graph::op_kind::MatMul) continue;
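
The behavioral difference between the old and new guard can be sketched in isolation. The snippet below is a minimal illustration, not oneDNN code: `OpKind`, `Op`, `old_guard_accepts`, and `new_guard_accepts` are hypothetical stand-ins for the graph op types and the `VCHECK_SDP_DECOMP` condition, and the sketch scans every op rather than stopping once both MatMuls are found. Judging by the commit title, a rejected subgraph is meant to skip the decomposed SDPA kernel.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the oneDNN graph types; not the library's API.
enum class OpKind { MatMul, SoftMax, DynamicDequantize };

struct Op {
    OpKind kind;
    std::string qtype; // e.g. "per_group" or "per_tensor"; empty for other ops
};

// Old condition: reject only per_group DynamicDequantize ops.
bool old_guard_accepts(const std::vector<Op> &ops) {
    for (const auto &op : ops) {
        if (op.kind == OpKind::DynamicDequantize && op.qtype == "per_group")
            return false;
    }
    return true;
}

// New condition: reject any DynamicDequantize op, whatever its qtype.
bool new_guard_accepts(const std::vector<Op> &ops) {
    for (const auto &op : ops) {
        if (op.kind == OpKind::DynamicDequantize) return false;
    }
    return true;
}
```

Under these assumptions, a subgraph containing a `per_tensor` DynamicDequantize passes the old guard but fails the new one, while a subgraph with no DynamicDequantize passes both.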