Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove _attn_implementation in LlamaBidirectionalModel constructor #12364

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

oliverholworthy
Copy link

Important

The Update branch button must only be pressed in very rare occassions.
An outdated branch is never blocking the merge of a PR.
Please reach out to the automation team before pressing that button.

What does this PR do ?

Remove setting of _attn_implementation in the LlamaBidirectionalModel constructor.

The eager implementation is not as memory efficient as the sdpa implementation (which is selected by default) and encounters OOM errors when calibrating the model during quantization with long sequences (~3000 tokens) on a GPU with 48GB memory.

Removing this line enable the sdpa attention implementation to be used by default.
The choice of attention implementation can be changed by providing the attn_implementation parameter to the model when loaded with the from_pretrained method.

Collection: llm

Changelog

  • Remove setting of _attn_implementation in the LlamaBidirectionalModel constructor.

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

@akoumpa
Copy link
Member

akoumpa commented Feb 27, 2025

Hi @oliverholworthy , can you signoff your commit and push again?

To add your Signed-off-by line to every commit in this branch:

Ensure you have a local copy of your branch by [checking out the pull request locally via command line](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/checking-out-pull-requests-locally).
In your local branch, run: `git rebase HEAD~1 --signoff`
Force push your changes to overwrite the branch: `git push --force-with-lease origin oholworthy/embedding-attn-implementation`

Apologies for the trouble, but it's needed for CI. Thank you.

@akoumpa
Copy link
Member

akoumpa commented Feb 27, 2025

CC @suiyoubi

@oliverholworthy oliverholworthy force-pushed the oholworthy/embedding-attn-implementation branch from 3546df1 to 46420c9 Compare February 27, 2025 12:52
@oliverholworthy oliverholworthy force-pushed the oholworthy/embedding-attn-implementation branch from 46420c9 to 08830b4 Compare February 27, 2025 12:53
@oliverholworthy
Copy link
Author

@akoumpa I've added the signoff line to the commit and updated the branch

Copy link
Collaborator

@suiyoubi suiyoubi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating this!

@suiyoubi suiyoubi enabled auto-merge (squash) February 28, 2025 19:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants