
align xpu behavior w/ cuda #2551


Open · wants to merge 11 commits into main

Conversation

@yao-matrix (Contributor) commented May 21, 2025

  1. For lorafa and randlora: PEFT already requires torch >= 1.13, and torch 1.13 provides a device-agnostic torch.autocast, so switch to the device-agnostic API to also cover XPU (see the sketch after this list).
  2. Clean up the tests folder to use the device-agnostic clear-cache API. Before this PR, some test cases used the device-agnostic clear-cache API and some used torch.cuda.xx; after this PR, all use the device-agnostic API.
  3. Enable the gptqmodel multi-device test case on XPU, and enable the torchao test cases on XPU.
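
For illustration, a minimal sketch of the two patterns described above; the helper name clear_device_cache is hypothetical, not the exact helper used in PEFT's test utilities:

```python
import torch

# Pick a device type dynamically instead of hard-coding "cuda".
device_type = "xpu" if hasattr(torch, "xpu") and torch.xpu.is_available() else "cuda"

# 1. Device-agnostic autocast (available since torch 1.13) instead of torch.cuda.amp.autocast:
with torch.autocast(device_type=device_type, dtype=torch.bfloat16):
    pass  # forward/backward pass goes here

# 2. Device-agnostic cache clearing instead of calling torch.cuda.empty_cache() directly:
def clear_device_cache():
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    elif hasattr(torch, "xpu") and torch.xpu.is_available():
        torch.xpu.empty_cache()
```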

Signed-off-by: YAO Matrix <[email protected]>
@yao-matrix (Contributor, Author) commented:

@githubnemo, please help review. Thanks very much.

@yao-matrix (Contributor, Author) commented:

@IlyasMoutawwakil, could you please help review? Thanks.

@yao-matrix (Contributor, Author) commented:

@IlyasMoutawwakil, do you know who can help review and merge PRs for the peft repo? Thanks very much.

yao-matrix changed the title from "align xpu behavior w/ CUDA in lorafa" to "align xpu behavior w/ cuda" on May 28, 2025
@@ -78,7 +78,7 @@ def require_torch_multi_gpu(test_case):
     return test_case


-def require_multi_accelerator(test_case):
+def require_torch_multi_accelerator(test_case):
@yao-matrix (Contributor, Author) commented on the diff:
This decorator is actually about torch multi-accelerator support, since it uses torch_device; the rename reflects that and aligns the naming convention with require_torch_multi_gpu.
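
A hedged sketch of what such a decorator could look like (the exact body in PEFT's testing utilities may differ; the device-count logic below is an assumption):

```python
import unittest

import torch

def require_torch_multi_accelerator(test_case):
    """Skip the test unless more than one torch accelerator (CUDA or XPU) is present."""
    if torch.cuda.is_available():
        device_count = torch.cuda.device_count()
    elif hasattr(torch, "xpu") and torch.xpu.is_available():
        device_count = torch.xpu.device_count()
    else:
        device_count = 0
    return unittest.skipUnless(device_count > 1, "test requires multiple accelerators")(test_case)
```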

@IlyasMoutawwakil (Member) left a comment:

LGTM, just one nit.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@githubnemo (Collaborator) left a comment:

I agree with @IlyasMoutawwakil, it would be best to use a helper function to determine if bfloat16 is available.

LGTM otherwise :)
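
For reference, a helper along the lines the reviewers suggest might look like the sketch below (the function name is illustrative, and it assumes that XPU devices reachable through a recent PyTorch build support bfloat16):

```python
import torch

def is_bf16_available() -> bool:
    """Best-effort check for bfloat16 support on the active accelerator."""
    if torch.cuda.is_available():
        return torch.cuda.is_bf16_supported()
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        # Assumption: XPU devices supported by recent PyTorch releases handle bfloat16.
        return True
    return False
```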

Signed-off-by: Matrix YAO <[email protected]>
Signed-off-by: Matrix YAO <[email protected]>
@yao-matrix (Contributor, Author) commented May 29, 2025

@IlyasMoutawwakil @githubnemo I've updated the PR per your comments; please review. I've also checked the CI failure, and it seems unrelated to my changes. Thanks very much.
