xe: ocl: add missing type enablement #2465

Merged
merged 1 commit into from
Jan 22, 2025

Conversation

rjoursler
Contributor

Fixes the following OpenCL kernel build error observed on XeLP systems:

./benchdnn --matmul --engine=gpu --dt=f8_e5m2:s4:f8_e5m2 --stag=abc --attr-scales=src:common:2+wei:per_ocic:f8_e4m3:32x1 --attr-zero-points=wei:per_ocic:s4:32x1 --attr-fpmath=bf16:true 7x6x32:7x32x64
onednn_verbose,v1,info,oneDNN v3.8.0 (commit 0eb6cd6d9e934905c6caa2db8cbe131da7af2231)
onednn_verbose,v1,info,cpu,runtime:OpenMP,nthr:24
onednn_verbose,v1,info,cpu,isa:Intel AVX2 with Intel DL Boost
onednn_verbose,v1,info,gpu,runtime:OpenCL
onednn_verbose,v1,info,gpu,engine,opencl device count:2
onednn_verbose,v1,info,gpu,engine,0,name:Intel(R) UHD Graphics 770 [0x4680],driver_version:22.49.25018,binary_kernels:enabled
onednn_verbose,v1,info,gpu,engine,1,name:Intel(R) UHD Graphics 770 [0x4680],driver_version:22.49.25018,binary_kernels:enabled
onednn_verbose,v1,primitive,info,template:operation,engine,primitive,implementation,prop_kind,memory_descriptors,attributes,auxiliary,problem_desc,exec_time
onednn_verbose,v1,common,error,ocl,Error during the build of OpenCL program. Build log:
3:12187:25: error: implicit declaration of function 'cvt_f8_e4m3_to_hf' is invalid in OpenCL
            wei_scale = WEI_SCALES_TO_REF(wei_scales[wei_scale_off]);
                        ^
3:2435:44: note: expanded from macro 'WEI_SCALES_TO_REF'
#define WEI_SCALES_TO_REF(x) convert_float(cvt_f8_e4m3_to_hf(x))
                                           ^
3:12187:25: note: did you mean 'cvt_f8_e5m2_to_hf'?
3:2435:44: note: expanded from macro 'WEI_SCALES_TO_REF'
#define WEI_SCALES_TO_REF(x) convert_float(cvt_f8_e4m3_to_hf(x))
                                           ^
3:596:38: note: 'cvt_f8_e5m2_to_hf' declared here
half16 __attribute__((overloadable)) cvt_f8_e5m2_to_hf(uchar16 b) {
                                     ^
,src/gpu/intel/ocl/ocl_gpu_engine.cpp:174
onednn_verbose,v1,primitive,error,ocl,errcode -11,CL_BUILD_PROGRAM_FAILURE,src/gpu/intel/ocl/ocl_gpu_engine.cpp:279,src/gpu/intel/ocl/ocl_gpu_engine.cpp:279
0:UNTESTED_FAILED __REPRO: --matmul --engine=gpu --dt=f8_e5m2:s4:f8_e5m2 --stag=abc --attr-scales=src:common:2+wei:per_ocic:f8_e4m3:32x1 --attr-zero-points=wei:per_ocic:s4:32x1 --attr-fpmath=bf16:true 7x6x32:7x32x64
tests:1 passed:0 skipped:0 mistrusted:0 unimplemented:0 invalid_arguments:0 failed:1 listed:0
total: 0.32s; fill: 0.00s (0%); compute_ref: 0.00s (0%); compare: 0.00s (0%);
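
For readers unfamiliar with the type involved: the build log shows that the weight scales are stored as f8_e4m3 and that the kernel needs a conversion from that type to a wider float. The sketch below illustrates what such a decode computes for a single byte. It assumes the OCP 8-bit floating-point E4M3 layout (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities, NaN encoded as all exponent and mantissa bits set). The function name `f8_e4m3_to_float` is hypothetical, and this is not the oneDNN `cvt_f8_e4m3_to_hf` implementation.

```python
import math


def f8_e4m3_to_float(byte: int) -> float:
    """Decode one E4M3 byte to a Python float (assumed OCP FP8 layout)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0..255")

    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF  # 4 exponent bits, bias 7
    mantissa = byte & 0x7         # 3 mantissa bits

    # E4M3 has no infinities; exponent and mantissa all ones encode NaN.
    if exponent == 0xF and mantissa == 0x7:
        return math.nan

    # Subnormal values: no implicit leading one, fixed exponent of -6.
    if exponent == 0:
        return sign * (mantissa / 8.0) * 2.0 ** -6

    # Normal values: implicit leading one.
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)
```

Under these assumptions, byte 0x38 decodes to 1.0 and byte 0x7E decodes to 448.0, the largest finite E4M3 value.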

@rjoursler rjoursler requested a review from a team as a code owner January 21, 2025 20:31
@github-actions github-actions bot added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Jan 21, 2025
@rjoursler
Contributor Author

make test
disable test_device_cpu
enable test_device_gpu

@rjoursler rjoursler merged commit 9fa46da into main Jan 22, 2025
4 of 5 checks passed
@rjoursler rjoursler deleted the rjoursle/fix_ref_matmul branch January 22, 2025 14:26