Onnx slim transform #536
base: main
Conversation
Signed-off-by: Tanisha <[email protected]>
Performance comparison between the onnx-slim transformed model and the original model for GPT2.
Signed-off-by: Tanisha <[email protected]>
Please also test once with the full model https://huggingface.co/meta-llama/Llama-2-7b-chat-hf.
Please apply ruff check and format. @tchawada
I have applied ruff check and format.
Regards, Tanisha
Please resolve the lint warnings.
QEfficient/base/onnx_transforms.py
Outdated
@@ -37,6 +39,8 @@ class FP16ClipTransform(OnnxTransform):
    Clips the tensor values to be in FP16 range, but preserves -inf values.
    """

    print("FP16ClipTransform is applied")
I would suggest using a logger, rather than print, for any messages.
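A minimal sketch of that suggestion, assuming a standard `logging` setup (the class body here is illustrative, not the repository's actual code):

```python
import logging

# Module-level logger: messages go through the logging framework instead of
# stdout, so callers control verbosity and routing.
logger = logging.getLogger(__name__)


class FP16ClipTransform:
    """Clips tensor values to the FP16 range, preserving -inf values."""

    @classmethod
    def apply(cls, model):
        # logger.info replaces the bare print() flagged in review
        logger.info("FP16ClipTransform is applied")
        # ... clipping logic would go here ...
        return model, True
```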
QEfficient/base/onnx_transforms.py
Outdated
onnx_slim_transform = kwargs.get("enable_onnx_slim_transform", False)
temp_onnx_path = kwargs.get("temp_onnx_path", None)
if onnx_slim_transform:
    print("onnx slim transform done")
Remove the print statement.
    print("onnx slim transform done")
    transformed = True
    slimmed_model = onnxslim.slim(model)
    onnx.save(slimmed_model, temp_onnx_path)
Add type checking or validation: ensure temp_onnx_path is not None before saving.
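The requested check could look like the following sketch; `save_fn` stands in for `onnx.save`, and the helper name is hypothetical:

```python
import os


def save_slimmed_model(slimmed_model, temp_onnx_path, save_fn):
    """Validate temp_onnx_path before handing it to save_fn (e.g. onnx.save)."""
    if temp_onnx_path is None:
        raise ValueError(
            "temp_onnx_path must be provided when the onnx-slim transform is enabled"
        )
    if not isinstance(temp_onnx_path, (str, os.PathLike)):
        raise TypeError(
            f"temp_onnx_path must be a path, got {type(temp_onnx_path).__name__}"
        )
    save_fn(slimmed_model, temp_onnx_path)
```

Failing fast here turns a cryptic error from deep inside the ONNX serializer into an immediate, descriptive one.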
I will make the appropriate changes.
@quic-hemagnih commented on this pull request.
In QEfficient/base/onnx_transforms.py:
+ *,
+ onnx_base_dir: Optional[str] = None,
+ **kwargs,
+ ) -> Tuple[ModelProto, bool]:
+ """
+ :param enable_onnx_slim_transform: If True, applies onnx-slim transformations.
+ """
+ # print(kwargs)
+ transformed = False
+ onnx_slim_transform = kwargs.get("enable_onnx_slim_transform", False)
+ temp_onnx_path = kwargs.get("temp_onnx_path", None)
+ if onnx_slim_transform:
+ print("onnx slim transform done")
+ transformed = True
+ slimmed_model = onnxslim.slim(model)
+ onnx.save(slimmed_model, temp_onnx_path)
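Putting the review feedback together (a logger instead of print, and a guard on temp_onnx_path), the method might look like the following sketch. Names follow the diff above, but this is illustrative rather than the merged implementation; the onnx/onnxslim imports are deferred only so the structure reads without those packages installed:

```python
import logging
from typing import Optional, Tuple

logger = logging.getLogger(__name__)


class OnnxSlimTransform:
    @classmethod
    def apply(
        cls, model, *, onnx_base_dir: Optional[str] = None, **kwargs
    ) -> Tuple[object, bool]:
        """
        :param enable_onnx_slim_transform: If True, applies onnx-slim transformations.
        """
        transformed = False
        if not kwargs.get("enable_onnx_slim_transform", False):
            return model, transformed
        # Validate before doing any work, per review feedback.
        temp_onnx_path = kwargs.get("temp_onnx_path")
        if temp_onnx_path is None:
            raise ValueError("temp_onnx_path is required for the onnx-slim transform")
        import onnx       # deferred here for the sketch; real code imports at module level
        import onnxslim
        slimmed_model = onnxslim.slim(model)
        onnx.save(slimmed_model, temp_onnx_path)
        logger.info("onnx-slim transform applied")
        return slimmed_model, True
```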
Signed-off-by: Tanisha <[email protected]>
Instead of adding onnx_slim_transform to every AutoModel class, could we consider creating a transform configuration module that returns enabled/disabled transforms as a dict, and apply the transforms in the base class based on that config? This could be applicable to both PyTorch and ONNX transforms.
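That idea could be sketched as a small configuration module; every name below is hypothetical, not the repository's actual API:

```python
# Default enable/disable map; insertion order determines application order.
DEFAULT_TRANSFORM_CONFIG = {
    "FP16ClipTransform": True,
    "OnnxSlimTransform": False,  # opt-in
}


def get_transform_config(overrides=None):
    """Return a transform-name -> enabled mapping, applying user overrides."""
    config = dict(DEFAULT_TRANSFORM_CONFIG)
    if overrides:
        config.update(overrides)
    return config


def apply_transforms(model, registry, config):
    """Run each enabled transform from the registry over the model, in order."""
    for name, enabled in config.items():
        if enabled and name in registry:
            model = registry[name](model)
    return model
```

The base class could then call apply_transforms once, against either a PyTorch or an ONNX transform registry, instead of each AutoModel class wiring transforms in individually.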
Hi, I'm the author of onnxslim. Thanks for using it; onnxslim applies to every single ONNX model. Feel free to message me if you have any problems, and I look forward to more cooperation and integration with your projects.