xe: ocl: add missing type enablement #2465

rjoursler · 2025-01-21T20:31:04Z

Fixes the following OpenCL program build error observed on XeLP systems:

./benchdnn --matmul --engine=gpu --dt=f8_e5m2:s4:f8_e5m2 --stag=abc --attr-scales=src:common:2+wei:per_ocic:f8_e4m3:32x1 --attr-zero-points=wei:per_ocic:s4:32x1 --attr-fpmath=bf16:true 7x6x32:7x32x64
onednn_verbose,v1,info,oneDNN v3.8.0 (commit 0eb6cd6d9e934905c6caa2db8cbe131da7af2231)
onednn_verbose,v1,info,cpu,runtime:OpenMP,nthr:24
onednn_verbose,v1,info,cpu,isa:Intel AVX2 with Intel DL Boost
onednn_verbose,v1,info,gpu,runtime:OpenCL
onednn_verbose,v1,info,gpu,engine,opencl device count:2
onednn_verbose,v1,info,gpu,engine,0,name:Intel(R) UHD Graphics 770 [0x4680],driver_version:22.49.25018,binary_kernels:enabled
onednn_verbose,v1,info,gpu,engine,1,name:Intel(R) UHD Graphics 770 [0x4680],driver_version:22.49.25018,binary_kernels:enabled
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,common,error,ocl,Error during the build of OpenCL program. Build log:
3:12187:25: error: implicit declaration of function 'cvt_f8_e4m3_to_hf' is invalid in OpenCL
            wei_scale = WEI_SCALES_TO_REF(wei_scales[wei_scale_off]);
                        ^
3:2435:44: note: expanded from macro 'WEI_SCALES_TO_REF'
#define WEI_SCALES_TO_REF(x) convert_float(cvt_f8_e4m3_to_hf(x))
                                           ^
3:12187:25: note: did you mean 'cvt_f8_e5m2_to_hf'?
3:2435:44: note: expanded from macro 'WEI_SCALES_TO_REF'
#define WEI_SCALES_TO_REF(x) convert_float(cvt_f8_e4m3_to_hf(x))
                                           ^
3:596:38: note: 'cvt_f8_e5m2_to_hf' declared here
half16 __attribute__((overloadable)) cvt_f8_e5m2_to_hf(uchar16 b) {
                                     ^
,src/gpu/intel/ocl/ocl_gpu_engine.cpp:174
onednn_verbose,v1,primitive,error,ocl,errcode -11,CL_BUILD_PROGRAM_FAILURE,src/gpu/intel/ocl/ocl_gpu_engine.cpp:279,src/gpu/intel/ocl/ocl_gpu_engine.cpp:279
0:UNTESTED_FAILED __REPRO: --matmul --engine=gpu --dt=f8_e5m2:s4:f8_e5m2 --stag=abc --attr-scales=src:common:2+wei:per_ocic:f8_e4m3:32x1 --attr-zero-points=wei:per_ocic:s4:32x1 --attr-fpmath=bf16:true 7x6x32:7x32x64
tests:1 passed:0 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:1 listed:0
total: 0.32s; fill: 0.00s (0%); compute_ref: 0.00s (0%); compare: 0.00s (0%);
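
For context on what the missing helper has to compute: the build log shows `WEI_SCALES_TO_REF` expanding to `convert_float(cvt_f8_e4m3_to_hf(x))`, so the kernel needs a decoder for f8_e4m3 weight scales. Below is a minimal host-side C sketch of such a decode, assuming the OCP FP8 E4M3 layout (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities, NaN encoded as exponent and mantissa all ones). The name `f8_e4m3_to_float` is hypothetical, and this is not the oneDNN OpenCL helper; the e5m2 counterpart visible in the log takes `uchar16` and returns `half16`.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical scalar reference decoder for illustration only.
 * Assumes the OCP FP8 E4M3 encoding: bias 7, no infinities,
 * NaN when exponent and mantissa bits are all ones. */
static float f8_e4m3_to_float(uint8_t bits) {
    const int sign = (bits >> 7) & 0x1;
    const int exponent = (bits >> 3) & 0xF;
    const int mantissa = bits & 0x7;
    float magnitude;

    if (exponent == 0xF && mantissa == 0x7) {
        /* The only NaN pattern in this format. */
        magnitude = NAN;
    } else if (exponent == 0) {
        /* Zero and subnormals: (mantissa / 8) * 2^-6 == mantissa * 2^-9. */
        magnitude = ldexpf((float)mantissa, -9);
    } else {
        /* Normal values: (1 + mantissa / 8) * 2^(exponent - 7). */
        magnitude = ldexpf(1.0f + (float)mantissa / 8.0f, exponent - 7);
    }
    return sign ? -magnitude : magnitude;
}
```

Under these assumptions, `0x38` decodes to 1.0 and `0x7E` to 448.0, the largest finite value in the format.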

rjoursler · 2025-01-21T20:35:29Z

make test
disable test_device_cpu
enable test_device_gpu

xe: ocl: add missing type enablement

3e1f060

rjoursler requested a review from a team as a code owner January 21, 2025 20:31

github-actions bot added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Jan 21, 2025

rjoursler mentioned this pull request Jan 21, 2025

[rls-v3.7] xe: ocl: add missing type enablement #2466

Merged

atkassen approved these changes Jan 21, 2025


Simonsays095 approved these changes Jan 21, 2025


kealan-barbieri approved these changes Jan 21, 2025


rjoursler merged commit 9fa46da into main Jan 22, 2025
4 of 5 checks passed

rjoursler deleted the rjoursle/fix_ref_matmul branch January 22, 2025 14:26
