gemm_tr, gemm_ad do not use matrixmultiply #717

Andlon · 2020-04-13T15:04:53Z

It seems that gemm_tr and gemm_ad currently do not use matrixmultiply for larger matrices. I noticed this while profiling after changing some of my performance-sensitive code. In fact, pre-computing let a_t = a.transpose() and calling gemm(1.0, &a_t, &b, 1.0) was significantly faster than calling gemm_tr directly.


sebcrozet · 2020-04-14T20:47:02Z

Hi!

That's right, they don't use matrixmultiply currently. I suppose we could make gemm_tr work with matrixmultiply by adjusting the row and column strides accordingly. The gemm_ad method, on the other hand, can't use matrixmultiply because matrixmultiply does not support complex numbers. The exception is f32 and f64 matrices, in which case gemm_ad is equivalent to gemm_tr.
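To illustrate the stride idea, here is a minimal sketch in plain Rust. It is not nalgebra's or matrixmultiply's code: MatView, col_major, t, at and matmul are hypothetical names made up for this example, and the naive triple loop only stands in for a stride-aware kernel (matrixmultiply's dgemm takes separate row and column strides for each operand). The point is that for a column-major buffer, which is the layout nalgebra uses, the transpose can be read in place by swapping the two strides, so no copy is needed.

```rust
/// Read-only strided view over a flat buffer (hypothetical helper type).
/// Element (i, j) lives at data[i * row_stride + j * col_stride].
#[derive(Clone, Copy)]
struct MatView<'a> {
    data: &'a [f64],
    rows: usize,
    cols: usize,
    row_stride: usize,
    col_stride: usize,
}

impl<'a> MatView<'a> {
    /// View a column-major buffer as a rows x cols matrix.
    fn col_major(data: &'a [f64], rows: usize, cols: usize) -> Self {
        assert_eq!(data.len(), rows * cols);
        MatView { data, rows, cols, row_stride: 1, col_stride: rows }
    }

    /// Transpose without copying: swap the dimensions and the strides.
    fn t(self) -> Self {
        MatView {
            data: self.data,
            rows: self.cols,
            cols: self.rows,
            row_stride: self.col_stride,
            col_stride: self.row_stride,
        }
    }

    /// Read element (i, j) through the strides.
    fn at(&self, i: usize, j: usize) -> f64 {
        self.data[i * self.row_stride + j * self.col_stride]
    }
}

/// Naive product a * b, returned as a column-major Vec.
/// A stand-in for a real stride-aware gemm kernel.
fn matmul(a: MatView<'_>, b: MatView<'_>) -> Vec<f64> {
    assert_eq!(a.cols, b.rows);
    let mut c = vec![0.0; a.rows * b.cols];
    for j in 0..b.cols {
        for i in 0..a.rows {
            let mut acc = 0.0;
            for k in 0..a.cols {
                acc += a.at(i, k) * b.at(k, j);
            }
            c[i + j * a.rows] = acc;
        }
    }
    c
}
```

With a and b both built by MatView::col_major, matmul(a.t(), b) computes the product of the transpose of a with b without materializing the transpose. This is the kind of product gemm_tr accumulates, minus the alpha and beta scaling.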

Andlon · 2020-04-16T08:29:53Z

Ah, I see. I had totally forgotten that matrixmultiply does not support complex numbers, and I also did not know that it doesn't natively support transposition. Thanks for explaining!

jbncode · 2022-07-10T19:41:40Z

I assume nothing has happened on this issue? I'd be willing to try modifying gemm_tr to use matrixmultiply if nobody else wants to work on it. The performance difference is substantial, and also quite surprising to someone who doesn't know about it. Perhaps a warning in the documentation for the tr_mul and gemm_tr methods would be appropriate until this is fixed?
