I am trying to use the code below to implement a sparse matrix multiplication, where `a` has shape [M=40, K=768], `b` has shape [K=768, N=50016], and `c` has shape [M=40, N=50016]. Almost 96% of the entries of `c` are zero, but these zeros are not arranged in regular blocks: with BLOCK=16, the block-level sparsity is only 13%, which results in poor performance with the current `triton.ops.blocksparse.matmul`. Are there any suggestions for how to implement the matrix multiplication in this kind of case? Thanks in advance!
```python
import triton

# layout: boolean block mask marking which BLOCK x BLOCK tiles of c are nonzero
op = triton.ops.blocksparse.matmul(layout, BLOCK=16, MODE='sdd', trans_a=TRANS_A, trans_b=TRANS_B, device="cuda")
c = op(a, b)
```
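Since the zero pattern of `c` is known ahead of time (that is what `layout` encodes), one option when the pattern is too irregular for block granularity is to compute only the nonzero output entries directly, i.e. an SDDMM at element granularity. Below is a minimal sketch in plain PyTorch, not Triton's API; the boolean `mask` of shape [M, N] marking the ~4% nonzero entries of `c` is an assumption here, standing in for whatever `layout` was derived from.

```python
import torch

M, K, N = 40, 768, 50016
a = torch.randn(M, K, device="cuda")
b = torch.randn(K, N, device="cuda")
mask = torch.rand(M, N, device="cuda") < 0.04  # placeholder for the known sparsity pattern of c

# Gather the (row, col) indices of the nonzero outputs and compute each
# entry as a single dot product: c[i, j] = dot(a[i, :], b[:, j]).
rows, cols = mask.nonzero(as_tuple=True)          # [nnz] index tensors
vals = (a[rows] * b[:, cols].t()).sum(dim=1)      # [nnz] output values
c = torch.zeros(M, N, device=a.device, dtype=a.dtype)
c[rows, cols] = vals
```

If the gathered [nnz, K] intermediates are too large, the same computation can be chunked over the nonzero indices; recent PyTorch releases also expose `torch.sparse.sampled_addmm`, which evaluates `mat1 @ mat2` only at the sparsity pattern of a sparse CSR input and may fit this case directly.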