Support for DOK slicing and advanced indexing #298
Hi! First of all, thanks for the offer to contribute! It means a lot. Second, your code looks pretty much perfect, except for one thing: I'd like parts of it to be ported to Numba, along with some of the other indexing logic. If you're willing to accept that burden, it'd be great to have your contribution in.
While working on the indexing code I also experimented with speeding it up with Numba, but I decided not to use Numba for now and to let it mature for some time. In particular, the Numba typed list and dict objects, as well as the slice object, are getting development work at the moment, so it may be much less work in the near future.
Fair enough. Were the experiments successful, or are you not using them for some other reason? I'm also willing to accept your code and speed it up in the future.
I did my experiments to evaluate whether all of the functionality could be written in nopython mode, and the conclusion is yes. But as I wrote the code, it turned out that the Numba version would result in a much bigger code base and therefore much more work. For example, I currently use the slice.indices method and the itertools.product function, and replacing the use of itertools.product alone would not be easy. And as you can imagine, at the start I had no idea how complex the whole indexing logic is.
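For context, here is a minimal sketch (not the gist's actual code) of the role `slice.indices` and `itertools.product` play in this kind of indexing logic; the `slice_to_dense` helper and the plain-dict storage are assumptions made purely for illustration:

```python
# Illustrative only: shows how slice.indices and itertools.product can be
# combined to materialize a sliced view of DOK-style storage as a dense array.
# "slice_to_dense" and the plain-dict storage are assumptions, not the gist's
# actual implementation.
from itertools import product

import numpy as np


def slice_to_dense(data, shape, index):
    """Materialize data[index] as a dense ndarray, for a full tuple of slices."""
    # slice.indices(length) resolves start/stop/step (including negative or
    # missing bounds) against the axis length.
    axis_ranges = [range(*s.indices(n)) for s, n in zip(index, shape)]
    out = np.zeros([len(r) for r in axis_ranges])
    # itertools.product enumerates every output coordinate selected by the slices.
    for target in product(*(range(len(r)) for r in axis_ranges)):
        source = tuple(r[i] for r, i in zip(axis_ranges, target))
        out[target] = data.get(source, 0.0)
    return out


# Toy DOK-style storage: {coordinate tuple: value}
data = {(1, 0): 1.0, (3, 3): 5.0, (4, 1): -2.0}
print(slice_to_dense(data, (5, 4), (slice(1, 5, 2), slice(None))))
# Selects rows 1 and 3, all columns -> 2x4 array with 1.0 at [0, 0] and 5.0 at [1, 3]
```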
In that case, feel free to make a PR. |
Is this by chance related to this issue found over at xarray? Or should I post a separate issue?
@El-minadero It seems the issue you're looking for is #114. |
@hameerabbasi I'd like to pick up this issue again and, if possible, turn @stschroe's gist into a PR. Fancy indexing for item access and assignment on DOK arrays would be very useful for us over at napari to support sparse labels layers (see the issue on the napari repo here). Is this something you would still be open to? If so, I think we can get a PR in very soon.
Of course. If it doesn’t have a wontfix label, we’re open to PRs. |
Since DOK arrays are so versatile, I use them frequently in my projects. I needed support for slicing and advanced indexing and implemented it myself; see the gist.
The implementation supports slicing, advanced (fancy) indexing and, since I need it most of the time, "array indexing". The latter is a fast variant of advanced indexing: only vectors (1-d sequences) are accepted as indices, the getitem method always returns dense ndarrays (in contrast to normal advanced indexing of sparse arrays, which creates new sparse array copies), and the setitem method accepts only vectors and scalars as values. This allows fast calculations with ndarray methods without the expensive creation of temporary sparse arrays between operations. It is of course only useful for small vectors that fit easily into memory.
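To make the described behaviour concrete, here is a minimal, self-contained sketch of the vector-only "array indexing" semantics; the `TinyDOK` class and its details are purely illustrative assumptions, not the gist's code or pydata/sparse's API:

```python
# Minimal sketch of vector ("array") indexing on DOK-style storage.
# TinyDOK is an assumption for illustration, not an existing class.
import numpy as np


class TinyDOK:
    """Dict-of-keys storage supporting only vector indexing."""

    def __init__(self, shape, fill_value=0.0):
        self.shape = shape
        self.fill_value = fill_value
        self.data = {}

    def __getitem__(self, index):
        # index is a tuple of 1-d integer vectors, one per axis.
        # Return a dense ndarray rather than a new sparse array.
        coords = zip(*(np.asarray(ix) for ix in index))
        return np.array([self.data.get(c, self.fill_value) for c in coords])

    def __setitem__(self, index, value):
        # Accept only a scalar or a vector of matching length as the value.
        coords = list(zip(*(np.asarray(ix) for ix in index)))
        values = np.broadcast_to(value, (len(coords),))
        for c, v in zip(coords, values):
            self.data[tuple(int(i) for i in c)] = float(v)


dok = TinyDOK((1000, 1000))
rows = np.array([1, 5, 7])
cols = np.array([2, 3, 9])
dok[rows, cols] = np.array([1.0, 2.0, 3.0])   # vector assignment
print(dok[rows, cols].sum())                  # dense ndarray -> fast reduction, no temporary sparse array
```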
If there is interest in adding this to pydata/sparse, I could help. But since I have never contributed to an open-source project, I have no idea whether I can handle the necessary amount of work.