This is more of a general question. I am trying to multiply the data in a memory buffer by multiple scalars: the data points [a, b, c, d, e] are all multiplied by scalar A, then by scalar B, and so on. However, I have not found a way to do this without calling the kernel twice (once for A and once for B). I know it is possible to broadcast the data points and do the multiplications in different compute tiles, but that does not work for the application I want to implement, because the number of scalars is higher than the number of compute tiles.
Please let me know whether this is possible and, if not, what the alternatives are for this kind of design.
Hi! This kind of question is easier to discuss around concrete code. As it stands, I would say you either have to call the kernel twice, or implement the computation twice within a larger kernel that takes multiple scalar values as arguments.
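To make the second option concrete, here is a minimal sketch in plain C++ of a single kernel that takes an array of scalars instead of one. The function name `scale_by_many`, the `int32_t` element type, and the output layout (one block of `n` results per scalar) are all assumptions for illustration. It uses no AIE intrinsics or vector API, and it assumes each scalar should produce its own output rather than being applied cumulatively.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical kernel (not from the original thread): multiplies every input
// element by each scalar in turn, in a single call.
// Assumed layout: out holds n_scalars * n elements, and block s
// (out[s * n] .. out[s * n + n - 1]) contains in[i] * scalars[s].
void scale_by_many(const int32_t *in, const int32_t *scalars, int32_t *out,
                   std::size_t n, std::size_t n_scalars) {
  for (std::size_t s = 0; s < n_scalars; ++s) {
    const int32_t k = scalars[s];  // load the scalar once per pass
    for (std::size_t i = 0; i < n; ++i) {
      out[s * n + i] = in[i] * k;
    }
  }
}
```

With this shape the input buffer is passed in once and reused for every scalar, so the number of scalars is no longer tied to the number of compute tiles. The cost is an output buffer that grows with the number of scalars, which may or may not fit in tile-local memory for a given design.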