PyTorch as the core compute layer #18
Comments
IIUC, another motivation for the suggestion is that the transfer could be zero-copy regardless of which device the data is on, whereas using NumPy requires a transfer to CPU and back when the original data lives elsewhere? In some cases this would work and be useful. I'm not sure if there would be an advantage in the

But maybe I don't understand the proposal - is it broader than cases like scipy/scipy#20772, which mostly tries to dispatch to existing functions in special libraries when they have a direct equivalent? What about cases like adding array API support to
ISTM a simple version would be to build a
This is what I was thinking too, and how I'd envisioned doing things for
To come at this from a slightly different angle: if we think about the scope of the array libraries, guided by what is included in the array API standard right now, it is likely the case that some APIs will not be standardised, and deliberately so. For example, many SciPy functions, like those in

The fact that modules like

In any case, I think that where possible the focus should be firstly on consuming the standard API, and only extending support / switching from NumPy to another library when compiled code forces our hand and the potential gain is judged worth it.
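To make "consuming the standard API" concrete, here is a minimal sketch (assuming the array-api-compat helper package; the function and its body are illustrative, not SciPy code):

```python
# Array-agnostic function that only consumes the standard API: the same
# code runs on NumPy, CuPy, or PyTorch inputs without per-library branches.
from array_api_compat import array_namespace

def softmax(x):
    xp = array_namespace(x)    # namespace of whatever array the caller passed
    e = xp.exp(x - xp.max(x))  # only standardised functions are used
    return e / xp.sum(e)
```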
If so, that seems pretty drastic, but maybe that is just a knee-jerk reaction from me. There does exist sentiment among some NumPy users that NumPy is fine, and they do not want to invest any time in learning to use another array library (or indeed, learning to develop for another array library, cf. scipy/scipy#18286 (comment)). Even if the SciPy API remained the same but made PyTorch a required runtime dependency, I think we would get some negative backlash. There is also the point that the "PyTorch cpu wheel is ~ 183 MB, which is much bigger than NumPy".
This does sound like we are discussing moving away from the standard altogether, which I don't think makes sense. I think the wider coverage is the defining point of the push: while PyTorch may satisfy a lot of NumPy users' wishes right now (GPU support, the other nice things you have already mentioned), it may not in the future (as those wishes develop/change). The "all future array libraries that adopt the standard" is really the star of the show IMO. As above, moving from NumPy to PyTorch would be a separate discussion (default fallback implementation vs. dispatching to known xp-native modules), but from my perspective that seems like a bridge to cross at a later point.
I should know but I don't: could you use this to solve the problem of "I want my (array consuming) library code to work with user inputs that are cupy, numpy, pytorch, etc. without having to write code that contains lots of

I think it is possible. You'd have a call at the start of your library function that performs the dlpack-based transfer, then exclusively use
I think so, yeah. At the cost of PyTorch becoming a required runtime dependency, contributors having to use the PyTorch API, and becoming reliant on PyTorch to implement support for new devices, etc. The problem it doesn't solve is "I want to keep everything native to my array library of choice (because of some unique feature), wherever only basic/fundamental functions are needed". I think the standard is needed for that.
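For concreteness, a minimal sketch of that boundary-conversion pattern (hedged: my_library_function and the torch.special call are illustrative, and it assumes both libraries support DLPack on the data's device):

```python
import torch
from array_api_compat import array_namespace

def my_library_function(x):
    xp = array_namespace(x)     # remember the caller's array library
    t = torch.from_dlpack(x)    # zero-copy transfer in via DLPack
    out = torch.special.i1(t)   # all compute stays in PyTorch
    return xp.from_dlpack(out)  # zero-copy transfer back to the caller's library
```

Since from_dlpack is part of the standard namespace, the round trip works for any compliant caller whose library can consume data on the tensor's device.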
During Array API adoption and from conversations with SciPy devs, I've seen attempts to dispatch to underlying libraries for compute. For example: scipy/scipy#20772. The motivation for this special casing is that the Array API standard does not contain all the required APIs. Logistically, I think the standard will always lag behind what libraries offer, and there will be some APIs that may never be standardized.
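For reference, that special casing looks roughly like this (a hedged sketch of the pattern, not the actual scipy/scipy#20772 code):

```python
import numpy as np
import scipy.special

def i0(x):
    # Dispatch to PyTorch's native kernel when a tensor comes in...
    if type(x).__module__.partition(".")[0] == "torch":
        import torch
        return torch.special.i0(x)
    # ...otherwise fall back to SciPy's compiled implementation.
    return scipy.special.i0(np.asarray(x))
```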
An alternative proposal is to use dlpack to do a zero-copy transfer to PyTorch, use the PyTorch API for compute, and use dlpack to transfer back to the original array container. Here are the pros and cons:

Pros

Cons
Currently, I am -0 on such a move.