It's been a while and I'm still finding it hard to achieve some simple results with OpenCL, either OpenCL C, PyOpenCL or others (tried CUDA). I'd like to try OpenACC, but good luck finding it targeting OpenCL :)
It would help to have some general examples that also comment on memory, work items, work groups, queues and atomics if possible (apparently it's not possible to use a mutex or lock regions when there's shared global memory or similar). Even examples that don't make much sense for GPU computing would be useful: toy programs that are fun to experiment with and learn from.
For me specifically, others will of course have different interests:
Bitcoin brainwallets (sha256 and ripemd, there's some secp256k1 code out there too) and the famous double-SHA256 used to mine Bitcoin blocks might be interesting examples to help learn OpenCL a little more, and specifically to implement with PyOpenCL. Book examples are usually boring or obscure for the educated but uninitiated parallel processing learner. There are other fun examples that could receive a lot of attention: anything to do with data permutations, e.g. sorting; hashing like the above example but perhaps simpler to start with ("launch as many kernels as possible to hash N passwords"); searching; shuffling memory around where this might be advantageous and fast on GPUs; data translations like transforms (but maybe not obscure mathematical examples); and of course some good old numerical examples, as long as they are easy to follow and not intended for experts or academics doing research. Just for the fun and enjoyment of learning some techniques that might, with time and patience, speed things up significantly and provide insight into parallel and distributed computing.
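To make the "hash N passwords" idea concrete, here is a minimal CPU-only sketch in plain Python using the standard `hashlib` module. It is not OpenCL or PyOpenCL code; it is only a reference that a GPU kernel's output could be checked against. The function names `double_sha256`, `hash_candidates` and `find_match` are made up for illustration, and treating each candidate as an independent unit of work (loosely, one work item per password) is an assumption about how such a kernel might be organized.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Return SHA-256 applied twice to data (32 raw bytes)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def hash_candidates(candidates):
    """Hash every candidate password independently.

    Each iteration touches only its own input and output, which is
    what would make the loop body a natural fit for one work item
    per candidate on a GPU (an assumption, not a measured result).
    """
    return [double_sha256(c.encode("utf-8")) for c in candidates]


def find_match(candidates, target_digest):
    """Return the index of the candidate whose digest matches, or None."""
    for index, digest in enumerate(hash_candidates(candidates)):
        if digest == target_digest:
            return index
    return None


if __name__ == "__main__":
    passwords = ["hunter2", "correct horse", "satoshi"]
    target = double_sha256(b"satoshi")
    print(find_match(passwords, target))
```

Keeping a slow but obviously correct host version like this around makes it easier to tell whether a wrong result comes from the hashing logic or from the memory and indexing side of a kernel.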
Can we write some more docs and examples? I'm using OpenCL C, and I have some modest and, I think, easier-to-grasp examples; once some code is cleaned up I could donate examples that might be fun for people to try with PyOpenCL and help them get up to speed, because there really isn't much out there. Great material stopped being produced years ago, in my opinion, and CUDA seems to be all the rage.
Thank you, I am sorry if this sounded more like a frustrated rant.