Should this just mean calling @tturbo on the whole iteration space (as is done for KernelAbstractions now) or should it also/only be possible to use these threads within Tullio's recursive threads-then-blocks algorithm?
Is there a non-confusing interface for this? Since @tullio aims to be concise it's nice not to need 5 keyword options every time.
An additional consideration is that I haven't implemented anything like this in LoopVectorization yet, so Tullio's current implementation will get better performance beyond a certain size:
(Also, I made LV ramp thread use up more slowly since creating this plot, so I should probably rerun this benchmark to see how it looks now.)
I'll implement this eventually, but it'll be a while.
That's a nice graph. You can see that Tullio turns on threading too early (around 64 IIRC) on your machine -- the overhead of @spawn isn't paying for itself.
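To illustrate the point about the threshold, here is a rough sketch of threshold-gated recursive splitting using only Base. It is not Tullio's actual implementation: `blocked_sumexp` and its `threshold` keyword are hypothetical names for illustration, and the value 64 is just the figure quoted above.

```julia
using Base.Threads: @spawn

# Hypothetical helper (not from Tullio): sum exp(x[i]) over `range`,
# splitting the work across tasks only while a block is longer than `threshold`.
function blocked_sumexp(x::Vector{Float64}, range::UnitRange{Int}=1:length(x);
                        threshold::Int=64)
    @assert threshold >= 1
    if length(range) <= threshold
        # Small block: plain serial loop, so no task-creation overhead.
        s = 0.0
        for i in range
            s += exp(x[i])
        end
        return s
    end
    # Large block: halve it, run the left half in a new task and the
    # right half in the current one, then combine the partial sums.
    mid = (first(range) + last(range)) ÷ 2
    left = @spawn blocked_sumexp(x, first(range):mid; threshold=threshold)
    right = blocked_sumexp(x, (mid + 1):last(range); threshold=threshold)
    return fetch(left)::Float64 + right
end

x = rand(10_000)
serial = blocked_sumexp(x; threshold=length(x))  # never spawns
threaded = blocked_sumexp(x; threshold=1_000)    # spawns while blocks exceed 1_000
```

If the threshold is set too low, every small block pays the cost of `@spawn`, which is the effect visible in the graph.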
OK, so it sounds like the goal is to figure out how to use ThreadingUtilities or Polyester in place of @spawn.
One possible interface is like @tullio A[i] := exp(B[i]) threads=Polyester. There's already grad=Base / Dual / false. And if it's an orthogonal choice to whether to use @turbo then perhaps it shouldn't share a keyword.
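For comparison, this is roughly how the keyword options documented in Tullio's README look at the call site. The array names are arbitrary, and the last (commented-out) line is only the proposal from this thread, not an option that exists.

```julia
using Tullio

B = rand(100)

# Options documented in Tullio's README:
@tullio A1[i] := exp(B[i])                        # defaults
@tullio A2[i] := exp(B[i]) threads=false          # turn off multithreading
@tullio A3[i] := exp(B[i]) threads=64^3           # set the size threshold for dividing work
@tullio A4[i] := exp(B[i]) avx=false grad=false   # no LoopVectorization, no gradient

# Proposed above, NOT an existing option:
# @tullio A5[i] := exp(B[i]) threads=Polyester
```

Reusing `threads=` keeps the count of keywords down, at the cost of overloading one keyword with both a threshold and a backend choice.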
LoopVectorization has changed two things since its interaction with Tullio was thought out: @avx -> @turbo, and @avxt or @tturbo == @turbo thread=true.
The easy change would be to make the keyword here turbo=true etc.
I believe the threading uses https://github.com/JuliaSIMD/Polyester.jl, and has lower overhead to launch threads than Threads.@spawn. But if I understand right, using both together can cause problems, e.g. JuliaSIMD/LoopVectorization.jl#221 or JuliaSIMD/ThreadingUtilities.jl#25. To allow but not require use of this, the questions are the two listed at the top of this thread.
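As a minimal sketch of the two LoopVectorization macros named above, assuming LoopVectorization is installed: the function names `exp_turbo!` and `exp_tturbo!` are made up for illustration, and whether the threaded version is actually faster depends on the array size and the machine.

```julia
using LoopVectorization

# Single-threaded SIMD loop; @turbo is the newer name for @avx.
function exp_turbo!(y::Vector{Float64}, x::Vector{Float64})
    @turbo for i in eachindex(y, x)
        y[i] = exp(x[i])
    end
    return y
end

# Threaded variant; @tturbo is shorthand for `@turbo thread=true`.
function exp_tturbo!(y::Vector{Float64}, x::Vector{Float64})
    @tturbo for i in eachindex(y, x)
        y[i] = exp(x[i])
    end
    return y
end

x = rand(1_000)
y1 = exp_turbo!(similar(x), x)
y2 = exp_tturbo!(similar(x), x)
```

Calling `@tturbo` on the whole iteration space like this corresponds to the first option in the first question; using it inside Tullio's own splitting would need more care, given the linked issues.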