Reduce elementwise extension size #1976
…function

Doing so reduces the binary size of elementwise operations extension

Before:

```
(dev_dpctl) opavlyk@mtl-world:~/repos/dpctl$ ls -l dpctl/tensor/_tensor_elementwise_impl.cpython-312-x86_64-linux-gnu.so
-rw-r--r-- 1 opavlyk opavlyk 38659896 Jan 19 20:58 dpctl/tensor/_tensor_elementwise_impl.cpython-312-x86_64-linux-gnu.so
```

After:

```
(dev_dpctl) opavlyk@mtl-world:~/repos/dpctl$ ls -l dpctl/tensor/_tensor_elementwise_impl.cpython-312-x86_64-linux-gnu.so
-rw-r--r-- 1 opavlyk opavlyk 37176600 Jan 21 06:36 dpctl/tensor/_tensor_elementwise_impl.cpython-312-x86_64-linux-gnu.so
```

Added static assertions to offset_utils to ensure that indexers are device copyable.
2a526d2 to 3f90e9b
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞

Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_490 ran successfully.
LGTM!
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_491 ran successfully.
This PR factors out the inline kernel submission that populates the padded vector in the specializations for binary operations on a matrix and a vector.
Doing so means fewer such kernels are generated, which decreases the binary size.
Before: 38659896 bytes
After: 37176600 bytes
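
The size reduction comes from how templates are instantiated, and a plain host C++ sketch may make that easier to see. It is not the dpctl code: `make_padded_copy` and `matrix_row_broadcast` are hypothetical names, the padding multiple of 8 and the wrap-around fill (`vec[i % n]`) are assumptions, and ordinary loops stand in for SYCL kernels. The point is that the padding helper depends only on the element type `T`, so it is instantiated once per type instead of once per (type, operation) pair.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, factored out of the per-operation code.
// It depends only on T, so one instantiation per element type is
// shared by every binary operation that needs a padded vector.
template <typename T>
std::vector<T> make_padded_copy(const std::vector<T> &vec,
                                std::size_t pad_multiple)
{
    assert(!vec.empty() && pad_multiple > 0);
    const std::size_t n = vec.size();
    // Round n up to the next multiple of pad_multiple.
    const std::size_t n_padded =
        ((n + pad_multiple - 1) / pad_multiple) * pad_multiple;
    std::vector<T> padded(n_padded);
    for (std::size_t i = 0; i < n_padded; ++i) {
        padded[i] = vec[i % n]; // assumed fill rule: wrap around
    }
    return padded;
}

// Depends on both T and BinaryOp, so it is instantiated per operation.
// Only the operation-specific loop lives here; the padding step is shared.
template <typename T, typename BinaryOp>
std::vector<T> matrix_row_broadcast(const std::vector<T> &mat,
                                    const std::vector<T> &row,
                                    BinaryOp op)
{
    const std::size_t n_cols = row.size();
    const std::vector<T> padded = make_padded_copy(row, 8);
    std::vector<T> out(mat.size());
    for (std::size_t i = 0; i < mat.size(); ++i) {
        out[i] = op(mat[i], padded[i % n_cols]);
    }
    return out;
}
```

If the padding loop were written inline inside `matrix_row_broadcast`, every distinct `BinaryOp` would carry its own copy of it, which is the duplication this PR describes removing.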
This PR also adds a `static_assert` in `offset_utils.hpp` to verify that indexers are device copyable. It also sneaks in a change that defines a local type alias for the functor submitted in `cgh.parallel_for`, which simplifies the invocation.
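
Both changes can be illustrated in host-only C++, under stated assumptions. `StridedIndexer`, `CopyFunctor`, `host_parallel_for`, and `strided_copy` are hypothetical names, not taken from `offset_utils.hpp`. `std::is_trivially_copyable_v` stands in for SYCL's `sycl::is_device_copyable_v`, and a plain loop stands in for `cgh.parallel_for`, so the sketch compiles without a SYCL toolchain.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical indexer: maps a flat index to a strided offset.
struct StridedIndexer
{
    std::ptrdiff_t offset;
    std::ptrdiff_t stride;
    std::ptrdiff_t operator()(std::size_t i) const
    {
        return offset + static_cast<std::ptrdiff_t>(i) * stride;
    }
};

// Compile-time check placed next to the type definition.
// Stand-in trait: real SYCL code would test sycl::is_device_copyable_v.
static_assert(std::is_trivially_copyable_v<StridedIndexer>,
              "indexer must be copyable to the device");

// Hypothetical kernel functor that carries an indexer by value.
template <typename T, typename IndexerT> struct CopyFunctor
{
    static_assert(std::is_trivially_copyable_v<IndexerT>,
                  "indexer must be copyable to the device");
    const T *src;
    T *dst;
    IndexerT indexer;
    void operator()(std::size_t i) const { dst[i] = src[indexer(i)]; }
};

// Host stand-in for cgh.parallel_for: calls the kernel for each index.
template <typename KernelT>
void host_parallel_for(std::size_t n, const KernelT &kernel)
{
    for (std::size_t i = 0; i < n; ++i) {
        kernel(i);
    }
}

template <typename T, typename IndexerT>
void strided_copy(const T *src, T *dst, std::size_t n,
                  const IndexerT &indexer)
{
    assert(n == 0 || (src != nullptr && dst != nullptr));
    // Local alias for the functor type keeps the submission line short.
    using Impl = CopyFunctor<T, IndexerT>;
    host_parallel_for(n, Impl{src, dst, indexer});
}
```

With the assertion in place, an indexer that holds a member which is not trivially copyable (for example a `std::vector`) fails at compile time, rather than surfacing later as a kernel submission error.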