Matrix input file format questions #6

Open
ChadBurdyshaw opened this issue Mar 18, 2024 · 1 comment

@ChadBurdyshaw

First of all, thank you for making your software available as open source. I've installed this library for a research client whose matrices are stored as an AnnData object in CSR format in an HDF5 file. From what I can tell, the functions in data_io.py read matrices in csv, mat (MATLAB?), npy, and npz formats.
Is there a way, or a plan, to read AnnData CSR matrices from HDF5? Any recommendations for converting to npy (csv would be too large)?
And for the currently supported formats, are the matrices read in as sparse CSR? Can the A matrix be preprocessed and split into multiple files for distribution, or does A have to be read from a single file and then distributed?

@ceodspspectrum
Member

Currently, pyDNMFk supports only dense arrays. For sparse and GPU-accelerated computing, I recommend our library T-ELF: https://github.com/lanl/T-ELF . Examples are listed at https://github.com/lanl/T-ELF/tree/main/examples/NMFk . You can load your data and perform NMFk with CSR arrays and GPUs, as demonstrated in one of the examples.
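
For anyone who still needs dense npy input, here is a minimal sketch of densifying a CSR matrix one row block at a time, so that only one block is held in dense form at once. It is not part of pyDNMFk or T-ELF. The function names `csr_row_block_to_dense` and `iter_dense_blocks` are hypothetical, and the sketch assumes the three CSR arrays have already been read from the file. In the AnnData HDF5 layout these are usually the `data`, `indices`, and `indptr` datasets under the `X` group, with the matrix shape stored as an attribute, but that should be checked against the actual file.

```python
import numpy as np


def csr_row_block_to_dense(data, indices, indptr, n_cols, row_start, row_stop):
    """Densify rows [row_start, row_stop) of a CSR matrix given its three arrays."""
    n_block_rows = row_stop - row_start
    block = np.zeros((n_block_rows, n_cols), dtype=data.dtype)
    # Slice of the stored entries that belong to this row block.
    lo, hi = indptr[row_start], indptr[row_stop]
    # Number of stored entries in each row of the block.
    counts = np.diff(indptr[row_start:row_stop + 1])
    # Block-relative row index of every stored entry.
    rows = np.repeat(np.arange(n_block_rows), counts)
    # np.add.at sums duplicate (row, col) entries instead of overwriting them.
    np.add.at(block, (rows, indices[lo:hi]), data[lo:hi])
    return block


def iter_dense_blocks(data, indices, indptr, shape, rows_per_block):
    """Yield (first_row, dense_block) pairs covering the whole matrix."""
    n_rows, n_cols = shape
    for start in range(0, n_rows, rows_per_block):
        stop = min(start + rows_per_block, n_rows)
        yield start, csr_row_block_to_dense(
            data, indices, indptr, n_cols, start, stop
        )


# Tiny stand-in for arrays read from a file (assumed layout, see above):
# [[1, 0, 2],
#  [0, 0, 3],
#  [4, 5, 0]]
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
indices = np.array([0, 2, 2, 0, 1])
indptr = np.array([0, 2, 3, 5])

blocks = list(iter_dense_blocks(data, indices, indptr, (3, 3), rows_per_block=2))
# Each block could then be written out, e.g. np.save(f"A_rows_{start}.npy", block),
# where the file name pattern is only an illustration.
```

Each dense block takes `rows * n_cols * itemsize` bytes, so `rows_per_block` should be chosen to fit in memory. Whether the resulting per-block files can be consumed directly depends on the reader; that is not something this thread confirms.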
