
[GPU] Revert jit::gemm handling for f32 src, f16 fpmath #2467

Open — wants to merge 1 commit into main
Conversation

kealan-barbieri (Contributor)

Description

Revert to sending cases with f32 activations and the f16 fpmath setting to the reference implementation. These cases incur additional register pressure from an f32->f16 reorder that isn't supported by the f16 gemm strategies:

onednn_verbose,v1,info,oneDNN v3.7.0 (commit 80b61e36646c9d1ab64d439a5bf1ea0966c6f0d9)
onednn_verbose,v1,info,cpu,runtime:OpenMP,nthr:224
onednn_verbose,v1,info,cpu,isa:Intel AVX-512 with float16, Intel DL Boost and bfloat16 support 
onednn_verbose,v1,info,gpu,runtime:OpenCL
onednn_verbose,v1,info,gpu,engine,opencl device count:1 
onednn_verbose,v1,info,gpu,engine,0,name:Intel(R) Data Center GPU Max 1100,driver_version:24.39.31294,binary_kernels:enabled
onednn_verbose,v1,info,experimental features are enabled
onednn_verbose,v1,info,use batch_normalization stats one pass is enabled
onednn_verbose,v1,info,GPU convolution v2 is disabled
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,primitive,error,gpu,jit::gemm,Insufficient registers in requested bundle,src/gpu/intel/jit/gemm/gen_gemm_kernel.cpp:959
Error: Function 'create_primitive' at (/nfs/pdx/home/skazakov/hal9000_skazakov/DNNL_JIRA/oneDNN/tests/benchdnn/dnnl_common.hpp:422) returned 'runtime_error'
Error: Function 'init_prim' at (/nfs/pdx/home/skazakov/hal9000_skazakov/DNNL_JIRA/oneDNN/tests/benchdnn/dnnl_common.hpp:475) returned '1'
Error: Function 'createit' at (/nfs/pdx/home/skazakov/hal9000_skazakov/DNNL_JIRA/oneDNN/tests/benchdnn/matmul/matmul.cpp:884) returned '1'
Error: Function 'create' at (/nfs/pdx/home/skazakov/hal9000_skazakov/DNNL_JIRA/oneDNN/tests/benchdnn/utils/task.hpp:49) returned '1'
0:UNTESTED_FAILED __REPRO: --matmul --engine=gpu --allow-enum-tags-only=false --dt=f32:u8:f32 --stag=ab --wtag=ba --dtag=ab --bia_dt=f32 --attr-scales=wei:per_oc --attr-zero-points=wei:per_oc:u8 --attr-scratchpad=user --attr-fpmath=f16:true 16384x512:512x512
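For context (an illustration, not code from this PR): the f16 fpmath mode permits the implementation to down-convert f32 inputs to f16 before the multiply. A minimal NumPy sketch of that f32->f16 conversion and its precision cost:

```python
import numpy as np

# f32 activations, as in the failing repro (--dt=f32:u8:f32)
src = np.array([1.0001, 3.14159], dtype=np.float32)

# Under f16 fpmath, f32 inputs may be down-converted to f16 before the
# multiply; this models that f32 -> f16 reorder.
src_f16 = src.astype(np.float16)

# f16 has a 10-bit mantissa, so digits past roughly the 3rd decimal are
# lost: 1.0001 becomes exactly 1.0 in f16.
print(src_f16[0] == np.float16(1.0))  # True
```

In the jit::gemm path it is the extra reorder itself, not the precision loss, that exhausts registers under the f16 strategies.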

Similar cases should instead use --attr-fpmath=strict:true to leverage the f32 jit::gemm strategies.
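The failing repro above, adjusted per that recommendation (same problem; only the fpmath attribute changes):

```
--matmul --engine=gpu --allow-enum-tags-only=false --dt=f32:u8:f32 --stag=ab --wtag=ba --dtag=ab --bia_dt=f32 --attr-scales=wei:per_oc --attr-zero-points=wei:per_oc:u8 --attr-scratchpad=user --attr-fpmath=strict:true 16384x512:512x512
```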

Fixes: MFDNN-13045

Checklist

General

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • Have you formatted the code using clang-format?

Bug fixes

  • Have you included information on how to reproduce the issue (either in a GitHub issue or in this PR)?
  • Have you added relevant regression tests?

@kealan-barbieri kealan-barbieri requested a review from a team as a code owner January 21, 2025 21:45
@github-actions github-actions bot added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Jan 21, 2025
```diff
-if (fpmath_bf16
-        && (utils::one_of(Type::f32, problem_.Ta, problem_.Tb)
-            || (problem_.Ta.isF8() || problem_.Tb.isF8()))
+if (fpmath_bf16 && (problem_.Ta.isF8() || problem_.Tb.isF8())
```
@rjoursler (Contributor), Jan 22, 2025:
The issue in the given case is that we should be matching the f32 input against [SB] and [SH], not B and H. Can we just extend the lambda inputs to add_mode_matches and drop these if statements? The add_mode_matches function already adds all matching strategy conversions.
