In all of our GPU calculations, we never use "shared memory"; instead, we rely heavily on storing data in the (fast but limited) on-chip, thread-local GPU registers. When the registers fill up, data spills over to (very slow but more abundant) off-chip, thread-local GPU local memory. There is a middle ground, however: GPU "shared memory", which in my recent experience can make computations 3x faster! Of course, coding with shared memory can increase the complexity of the codebase and thereby decrease its long-term readability/maintainability.
This GPU tutorial provides an EXCELLENT overview (Tutorial 5 especially was super enlightening)!
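For a concrete sense of what using shared memory looks like, here is a minimal sketch of the classic tiling pattern: a 1D 3-point moving-average stencil where each block stages its slice of the input into `__shared__` memory once, so neighbouring threads reuse each loaded value instead of re-reading global memory (or spilling intermediates to slow thread-local memory). The kernel name, block size, and boundary handling are illustrative, not taken from our codebase.

```cuda
#include <cuda_runtime.h>

#define BLOCK 256  // illustrative block size

// 3-point moving average: out[i] = (in[i-1] + in[i] + in[i+1]) / 3
__global__ void stencil3(const float *in, float *out, int n)
{
    // Per-block tile plus one halo element on each side.
    __shared__ float tile[BLOCK + 2];

    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    int lid = threadIdx.x + 1;  // +1 offset leaves room for the left halo

    // Each thread loads one element of the tile from global memory.
    if (gid < n)
        tile[lid] = in[gid];

    // Edge threads additionally load the halo cells.
    if (threadIdx.x == 0)
        tile[0] = (gid > 0) ? in[gid - 1] : 0.0f;
    if (threadIdx.x == blockDim.x - 1)
        tile[lid + 1] = (gid + 1 < n) ? in[gid + 1] : 0.0f;

    __syncthreads();  // all shared-memory writes must finish before any reads

    // Three reads from fast shared memory instead of three from global memory.
    if (gid < n)
        out[gid] = (tile[lid - 1] + tile[lid] + tile[lid + 1]) / 3.0f;
}
```

Note the extra moving parts that come with the speedup: a fixed tile size, halo loads, and a mandatory `__syncthreads()` barrier (forgetting it is a classic race condition), which is exactly the readability/maintainability trade-off mentioned above.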