Add Ampere GEMM example using Cute and CUTLASS 3.x #1604

aacostadiaz · 2024-06-27T16:20:26Z

This pull request adds a GEMM example for the NVIDIA Ampere architecture using CuTe and CUTLASS 3.x. It demonstrates how to build a GEMM kernel from the CuTe components defined for SM80 together with the Collective MMA and Collective Epilogue APIs provided in CUTLASS 3.x.
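For readers new to the operation, here is a minimal CPU sketch of what a GEMM computes and what a tile shape is. It is not the code in this PR and uses no CUTLASS or CuTe APIs: `tiled_gemm` and its `TileM`/`TileN`/`TileK` template parameters are hypothetical names chosen for illustration. The split into an accumulation loop followed by a scaling step only loosely mirrors the mainloop/epilogue separation that the collective APIs suggest.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CUTLASS or CuTe).
// Computes D = alpha * (A * B) + beta * C for row-major matrices,
// where A is M x K, B is K x N, and C and D are M x N.
template <std::size_t TileM, std::size_t TileN, std::size_t TileK>
std::vector<float> tiled_gemm(std::size_t M, std::size_t N, std::size_t K,
                              float alpha,
                              const std::vector<float>& A,
                              const std::vector<float>& B,
                              float beta,
                              const std::vector<float>& C) {
  assert(A.size() == M * K && B.size() == K * N && C.size() == M * N);

  // "Mainloop": accumulate A * B one TileM x TileN x TileK block at a time.
  std::vector<float> acc(M * N, 0.0f);
  for (std::size_t m0 = 0; m0 < M; m0 += TileM) {
    for (std::size_t n0 = 0; n0 < N; n0 += TileN) {
      for (std::size_t k0 = 0; k0 < K; k0 += TileK) {
        // Clamp each tile to the matrix edge so any problem size works.
        const std::size_t m1 = std::min(m0 + TileM, M);
        const std::size_t n1 = std::min(n0 + TileN, N);
        const std::size_t k1 = std::min(k0 + TileK, K);
        for (std::size_t m = m0; m < m1; ++m) {
          for (std::size_t n = n0; n < n1; ++n) {
            for (std::size_t k = k0; k < k1; ++k) {
              acc[m * N + n] += A[m * K + k] * B[k * N + n];
            }
          }
        }
      }
    }
  }

  // "Epilogue": apply the linear combination to the accumulators.
  std::vector<float> D(M * N);
  for (std::size_t i = 0; i < M * N; ++i) {
    D[i] = alpha * acc[i] + beta * C[i];
  }
  return D;
}
```

Changing the tile parameters in this sketch changes only the loop order, not the mathematical result. On a GPU the analogous choice also affects performance, which is why a tunable configuration is useful.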

hwu36 · 2024-07-10T16:11:12Z

@thakkarV @ccecka

thakkarV · 2024-07-10T17:38:38Z

this looks like a copy paste of the version we already have in the test dir. may I ask why you want to have this in the examples dir? we already have example 59 which does this

aacostadiaz · 2024-07-11T10:09:09Z

> this looks like a copy paste of the version we already have in the test dir. may I ask why you want to have this in the examples dir? we already have example 59 which does this

Hi @thakkarV,

Thank you for your feedback.

This example is meant to complement the version in the test directory and provide more visibility for users who may not explore that directory.

It serves as an entry-level example, similar to example 14 for Ampere with CUTLASS 2.x and examples 48 and 49 for Hopper with CUTLASS 3.x. In addition, it lets users easily experiment with different GEMM configurations and tune the GEMM to their needs. I could not find an easy way to do that without implementing this example.

Example 59 is much more complicated than what I'm adding here. I don't think it targets the same users or use case.

I think it is a valuable contribution: we have found it useful ourselves, and it could be useful to others.

thakkarV · 2024-07-15T14:43:11Z

@aacostadiaz do you no longer want this merged?

aacostadiaz · 2024-07-15T17:03:02Z

Hi @thakkarV, yes. I still would like this to get merged. I accidentally closed it, sorry.

github-actions · 2024-08-14T17:04:53Z

This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.

github-actions · 2024-11-12T17:04:54Z

This PR has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates.

Add Ampere GEMM example in Cute and CUTLASS 3.x

205e4f3

aacostadiaz closed this Jul 15, 2024

aacostadiaz deleted the aacosta/cute-example branch July 15, 2024 14:40

aacostadiaz restored the aacosta/cute-example branch July 15, 2024 16:59

aacostadiaz reopened this Jul 15, 2024

github-actions bot added the inactive-30d label Aug 14, 2024

github-actions bot added the inactive-90d label Nov 12, 2024
