Fix issues with Llama HF->NeoX conversion #1345
instead of splitting by heads first for GQA
- Fixes RMSNorm implementation by adding epsilon to the variance instead of adding it directly to the RMS
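For readers following the RMSNorm item, here is a minimal pure-Python sketch of the two epsilon placements. It is not the code from this PR: the function names and the sample values are made up for illustration, and "variance" here means the mean of squares, as is usual for RMSNorm.

```python
import math

def rms_norm_eps_in_variance(x, weight, eps=1e-6):
    """Epsilon added to the mean of squares, before the square root."""
    variance = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(variance + eps)
    return [w * v * scale for v, w in zip(x, weight)]

def rms_norm_eps_on_rms(x, weight, eps=1e-6):
    """Epsilon added to the RMS itself, after the square root."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    scale = 1.0 / (rms + eps)
    return [w * v * scale for v, w in zip(x, weight)]

# Hypothetical small-magnitude activations, where the placement matters most.
x = [1e-3, -2e-3, 3e-3]
weight = [1.0, 1.0, 1.0]
eps = 1e-5

in_variance = rms_norm_eps_in_variance(x, weight, eps)
on_rms = rms_norm_eps_on_rms(x, weight, eps)
max_gap = max(abs(a - b) for a, b in zip(in_variance, on_rms))
print(max_gap)
```

The two forms agree when `eps` is zero and diverge as the activations shrink toward the scale of `eps`, so converted weights can produce different outputs if the two frameworks disagree on the placement.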
Hi @aurelion-source
Hi @aflah02, thanks for pointing out the issue with the NeoX -> HF conversion. I've updated and tested the script to fix the problem.
Thanks @aurelion-source. One question: if I use the fused RMSNorm kernel, will it be compatible with the HF version, or is it equivalent to the NeoX version?
Yes, it should be compatible with HF. It uses Apex's fused RMSNorm implementation.
Resolves #1337 and #1342