Skip to content

Conversation

danielvegamyhre
Copy link
Contributor

@danielvegamyhre danielvegamyhre commented Aug 26, 2025

Copy link

pytorch-bot bot commented Aug 26, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2879

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

✅ No Failures

As of commit fb1628c with merge base 15a6de6 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 26, 2025
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/60 branch from 36e6bc7 to ca146f1 Compare August 26, 2025 01:20
danielvegamyhre added a commit that referenced this pull request Aug 26, 2025
…or clarity

stack-info: PR: #2879, branch: danielvegamyhre/stack/60
Copy link
Contributor

@vkuzo vkuzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sg, but if you're using A_scales I'd say the corresponding name is something like A_data, not A_fp8.

@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/55 to main August 26, 2025 21:55
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/55 August 26, 2025 21:55
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/55 to main August 26, 2025 22:01
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/55 August 26, 2025 22:01
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/55 to main August 27, 2025 02:57
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/55 August 27, 2025 02:57
@danielvegamyhre
Copy link
Contributor Author

danielvegamyhre commented Aug 27, 2025

sg, but if you're using A_scales I'd say the corresponding name is something like A_data, not A_fp8.

Yeah, hmm, scales and data are both different fp8 formats so I can see how this still could be unclear. The thing is, we already have A input tensor, so A_data vs A both sound like the same thing, and it gets confusing. I considered changing "A" to "A_hp" but A_hp vs A_data is still confusing. I figured "A" vs "A_fp8" was the clearest distinction, but A_hp vs A_data could also work I guess.

@@ -291,13 +291,13 @@ def forward(
ctx.out_dtype = out_dtype
ctx.emulated = emulated

# A_mx shape: (M, K)
# A_fp8 shape: (M, K)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of A_f8, this isn't going to scale to other dtypes. A_data is IMO better and more representative of the MX format being composed of raw data and scale.

Copy link
Contributor

@vkuzo vkuzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stamping because prototype

@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/55 to main August 27, 2025 15:42
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/55 August 27, 2025 15:43
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/55 to main August 27, 2025 15:53
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/55 August 27, 2025 15:53
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/55 to main August 27, 2025 16:08
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/55 August 27, 2025 16:08
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/55 to main August 27, 2025 16:23
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/60 branch from ca146f1 to ac6f4f1 Compare August 27, 2025 16:23
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/55 August 27, 2025 16:23
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/55 to main August 27, 2025 16:28
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/55 August 27, 2025 16:29
@danielvegamyhre danielvegamyhre added the topic: not user facing Use this tag if you don't want this PR to show up in release notes label Aug 27, 2025
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/55 to main August 27, 2025 19:10
@danielvegamyhre danielvegamyhre changed the base branch from main to danielvegamyhre/stack/55 August 27, 2025 19:10
danielvegamyhre added a commit that referenced this pull request Aug 27, 2025
…or clarity

stack-info: PR: #2879, branch: danielvegamyhre/stack/60
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/60 branch from ac6f4f1 to 9f8ac6b Compare August 27, 2025 20:39
@danielvegamyhre danielvegamyhre changed the base branch from danielvegamyhre/stack/55 to main August 27, 2025 20:39
@danielvegamyhre danielvegamyhre force-pushed the danielvegamyhre/stack/60 branch from 9f8ac6b to c7e1840 Compare August 27, 2025 20:54
@danielvegamyhre danielvegamyhre changed the title [mxfp8 moe training] refactor all var names with suffix _mx to _fp8 for clarity [mxfp8 moe training] refactor all var names with suffix _mx to _data for clarity Aug 27, 2025
…for clarity

stack-info: PR: #2879, branch: danielvegamyhre/stack/60
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. mx topic: not user facing Use this tag if you don't want this PR to show up in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants