Use LoopVectorization.jl's threads, sometimes? #113

mcabbott · 2021-06-30T16:45:05Z

LoopVectorization has changed two things since its interaction with Tullio was thought out:

- a name change @avx -> @turbo, and
- a multi-threading macro @avxt or @tturbo == @turbo thread=true.
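For concreteness, here is a minimal sketch of the two macros on the same loop. The function names `exp_turbo!` and `exp_tturbo!` are made up for illustration; only `@turbo` and `@tturbo` come from LoopVectorization.

```julia
using LoopVectorization

# Single-threaded SIMD loop; `@turbo` is the current name of the old `@avx`.
function exp_turbo!(A, B)
    @turbo for i in eachindex(A, B)
        A[i] = exp(B[i])
    end
    return A
end

# Threaded variant; `@tturbo` is shorthand for `@turbo thread=true`.
function exp_tturbo!(A, B)
    @tturbo for i in eachindex(A, B)
        A[i] = exp(B[i])
    end
    return A
end

B = rand(1000)
A1 = exp_turbo!(similar(B), B)
A2 = exp_tturbo!(similar(B), B)
```

Both calls should fill the output with the same values; whether the threaded version is faster depends on the array size and the machine.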

The easy change would be to make the keyword here turbo=true etc.

I believe the threading uses https://github.com/JuliaSIMD/Polyester.jl, and has lower overhead to launch threads than Threads.@spawn. But if I understand right, using both together can cause problems, e.g. JuliaSIMD/LoopVectorization.jl#221 or JuliaSIMD/ThreadingUtilities.jl#25. To allow but not require use of this, the questions are:

- Should this just mean calling @tturbo on the whole iteration space (as is done for KernelAbstractions now) or should it also/only be possible to use these threads within Tullio's recursive threads-then-blocks algorithm?
- Is there a non-confusing interface for this? Since @tullio aims to be concise it's nice not to need 5 keyword options every time.

chriselrod · 2021-06-30T16:52:57Z

> Tullio's recursive threads-then-blocks algorithm?

An additional consideration is that I haven't implemented anything like this in LoopVectorization yet, so Tullio's current implementation will get better performance beyond a certain size:

(Also, I have made LV ramp up its thread usage more slowly since creating this plot, so I should probably rerun this benchmark to see how it looks now.)
I'll implement this eventually, but it'll be a while.

mcabbott · 2021-06-30T17:03:09Z

That's a nice graph. You can see that Tullio turns on threading too early (around 64 IIRC) on your machine -- the overhead of @spawn isn't paying for itself.

OK, so it sounds like the goal is to figure out how to use ThreadingUtilities or Polyester in place of @spawn.
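As a rough illustration of the two threading styles being compared, the sketch below runs the same loop once with `Threads.@spawn` over manually chunked indices and once with Polyester's `@batch`. The function names and the `nchunks` keyword are hypothetical, and this is not how Tullio divides work internally; it only shows what swapping one launcher for the other might look like, assuming ordinary 1-based vectors.

```julia
using Polyester

# Task-based version: split the index range into chunks, spawn one task per chunk.
function exp_spawn!(A, B; nchunks = Threads.nthreads())
    chunklen = max(1, cld(length(A), nchunks))
    chunks = Iterators.partition(eachindex(A, B), chunklen)
    tasks = map(chunks) do idx
        Threads.@spawn for i in idx
            A[i] = exp(B[i])
        end
    end
    foreach(wait, tasks)  # block until every chunk is done
    return A
end

# Polyester version: `@batch` divides the loop among its own lightweight threads.
function exp_batch!(A, B)
    @batch for i in 1:length(A)
        A[i] = exp(B[i])
    end
    return A
end

B = rand(10_000)
A_spawn = exp_spawn!(similar(B), B)
A_batch = exp_batch!(similar(B), B)
```

As noted earlier in the thread, mixing the two kinds of threads in one program can cause problems, so a real implementation would probably need to pick one per call.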

One possible interface would be something like @tullio A[i] := exp(B[i]) threads=Polyester. There's already grad=Base / Dual / false. And if this choice is orthogonal to whether to use @turbo, then perhaps it shouldn't share a keyword.
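To show where such an option would sit, here is a short sketch using keywords that Tullio already accepts (`threads`, `avx`, `grad`). The `threads=Polyester` form is only the proposal from this comment and is left commented out, since it is not an existing option.

```julia
using Tullio

B = rand(100)

# Defaults: Tullio decides about threading from the array size.
@tullio A[i] := exp(B[i])

# Existing keywords: disable threading, LoopVectorization, and gradient definitions.
@tullio C[i] := exp(B[i]) threads=false avx=false grad=false

# Proposed above, NOT an existing option:
# @tullio D[i] := exp(B[i]) threads=Polyester
```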
