test_local_parallel_scan fails on pocl-cuda and intel-cpu #718
Comments
Reported it upstream yesterday: pocl/pocl#1157. I don't think it's our bug.
Ah. Nvm. pocl/pocl#1157 is about the pyopencl scan. #600 is a long-standing issue that @kaushikcfd promised he would fix at some point.
Turns out we've never run GPU CI with pocl-cuda on Loopy. We should probably add that...
Which version of the Intel CL runtime?
I'm leaning towards those possibly being distinct issues. I don't trust Intel CL 2022.15.12.0.01_rel on account of intel/llvm#7877. pocl-cuda we'd have to troubleshoot, but on account of pocl/pocl#1157 (which affects a scan), it might be best to do so on pocl-cuda 3.0.
The failure on pocl-cuda with n=16 can be reproduced locally. With intel-cpu it cannot be reproduced locally and is intermittent on CI. See https://github.com/inducer/loopy/actions/runs/3787208816/jobs/6438795872
Oclgrind, NVIDIA, pocl-pthread all work.
Wonder if this is a test issue or a compiler issue.