concat_on_disk fails to write alternative axis mapping and uns #1854

Open
2 of 3 tasks
milos7250 opened this issue Feb 12, 2025 · 0 comments · May be fixed by #1855

milos7250 commented Feb 12, 2025

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of anndata.
  • (optional) I have confirmed this bug exists on the master branch of anndata.

Report

Concatenating AnnData files that contain a mapping on the alternative axis (e.g. concatenating along obs when the files have varm) fails with the IORegistryError shown in the traceback below. In addition, the uns_merge argument is ignored entirely.

Code:

import anndata as ad

ad.experimental.concat_on_disk(
    in_files={"1": "sample1.h5ad", "2": "sample2.h5ad"},
    out_file="output.h5ad",
    max_loaded_elems=int(1e9),
    axis=0,
    join="outer",
    merge="unique",
    uns_merge="unique",
    index_unique="-",
)
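
For reference, `uns_merge="unique"` in the call above is expected to keep only those `uns` entries that have a single possible value across the inputs. The following is a minimal pure-Python sketch of that expectation on plain dictionaries. `merge_unique_sketch` is a hypothetical helper written for illustration only; it is not anndata's implementation, and it compares values with `==`, so array-valued entries would need a different equality check.

```python
from collections.abc import Mapping
from typing import Any


def merge_unique_sketch(dicts: list[Mapping[str, Any]]) -> dict[str, Any]:
    """Illustrative "unique" merge: keep a key only if every input that
    defines it agrees on its value. Nested mappings are merged recursively.

    Hypothetical helper for illustration, not part of anndata.
    """
    # Union of keys, in order of first appearance.
    keys: list[str] = []
    for d in dicts:
        for k in d:
            if k not in keys:
                keys.append(k)

    merged: dict[str, Any] = {}
    for k in keys:
        values = [d[k] for d in dicts if k in d]
        if all(isinstance(v, Mapping) for v in values):
            # Recurse into nested mappings; drop them if nothing survives.
            nested = merge_unique_sketch(values)
            if nested:
                merged[k] = nested
        elif all(v == values[0] for v in values[1:]):
            # A single possible value: keep it.
            merged[k] = values[0]
        # Otherwise the inputs disagree, so the key is dropped.
    return merged


# Stand-ins for the `uns` of sample1.h5ad and sample2.h5ad (made-up values).
uns_1 = {"species": "human", "batch": "A", "params": {"n": 10, "seed": 1}, "only_in_1": 1}
uns_2 = {"species": "human", "batch": "B", "params": {"n": 10, "seed": 2}}

expected_uns = merge_unique_sketch([uns_1, uns_2])
```

With these made-up inputs, `species` and `params["n"]` agree and are kept, `batch` and `params["seed"]` conflict and are dropped, and `only_in_1` is kept because it has only one candidate value. In the reported bug, the output file contains no `uns` at all.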

Traceback:

Traceback (most recent call last):
  File "/mnt/shared/scratch/mmicik/rna-scripts/merge_adatas.py", line 68, in <module>
    ad.experimental.concat_on_disk(
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/experimental/merge.py", line 630, in concat_on_disk
    _write_alt_mapping(groups, output_group, alt_axis_name, alt_index, merge)
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/experimental/merge.py", line 381, in _write_alt_mapping
    write_elem(output_group, alt_axis_name, alt_mapping)
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 487, in write_elem
    Writer(_REGISTRY).write_elem(store, k, elem, dataset_kwargs=dataset_kwargs)
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/utils.py", line 249, in func_wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 354, in write_elem
    return write_func(store, k, elem, dataset_kwargs=dataset_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 71, in wrapper
    result = func(g, k, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/methods.py", line 353, in write_mapping
    _writer.write_elem(g, sub_k, sub_v, dataset_kwargs=dataset_kwargs)
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/utils.py", line 249, in func_wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 351, in write_elem
    write_func = self.find_write_func(dest_type, elem, modifiers)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 318, in find_write_func
    return self.registry.get_write(dest_type, type(elem), modifiers, writer=self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/apps/users/mmicik/conda/envs/rna-python/lib/python3.12/site-packages/anndata/_io/specs/registry.py", line 134, in get_write
    raise IORegistryError._from_write_parts(dest_type, src_type, modifiers)
anndata._io.specs.registry.IORegistryError: No method registered for writing <class 'pandas.core.series.Series'> into <class 'h5py._hl.group.Group'>
Error raised while writing key 'gene_id' of <class 'h5py._hl.group.Group'> to /var

Looking at the source code, I noticed that the _write_alt_mapping function tries to write the mapping data (which belongs in the varm group) into var. After changing this part, concatenation completes, but uns is still missing from the result. It turns out that the code meant to merge uns is missing as well.
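
To make the naming concrete: when concatenating along axis 0 (obs), the alternative axis is var, whose annotation dataframe is stored in the `var` group and whose mapping is stored in `varm`. The sketch below only illustrates that naming rule. `alt_axis_groups` is a hypothetical helper, not a function from anndata and not the patch in #1855.

```python
def alt_axis_groups(axis: int) -> tuple[str, str]:
    """Return (annotation group, mapping group) for the axis that is NOT
    being concatenated.

    Hypothetical helper for illustration, not part of anndata.
    """
    if axis not in (0, 1):
        raise ValueError(f"axis must be 0 or 1, got {axis!r}")
    # axis=0 concatenates along obs, so the alternative axis is var (and vice versa).
    alt_axis_name = ("obs", "var")[1 - axis]
    # The mapping group is the axis name with an "m" suffix: obsm / varm.
    return alt_axis_name, f"{alt_axis_name}m"


annotation_group, mapping_group = alt_axis_groups(0)
```

For `axis=0`, the mapping should therefore land in `varm`; the traceback above shows it being written under `/var` instead.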

I've had a go at fixing both issues; see my commit. It works without problems on my data. I was hesitant to open a PR because I have not written proper tests for the change, but here it is: #1855.

Versions

| Package      | Version |
| ------------ | ------- |
| anndata      | 0.11.3  |
| numpy        | 2.1.3   |
| pandas       | 2.2.3   |
| flatten_json | 0.1.14  |
| tqdm         | 4.67.1  |

| Dependency         | Version     |
| ------------------ | ----------- |
| natsort            | 8.4.0       |
| python-dateutil    | 2.9.0.post0 |
| charset-normalizer | 3.4.1       |
| Deprecated         | 1.2.18      |
| wrapt              | 1.17.2      |
| Cython             | 3.0.12      |
| asciitree          | 0.3.3       |
| pytz               | 2024.1      |
| session-info2      | 0.1.2       |
| zarr               | 2.18.4      |
| h5py               | 3.12.1      |
| setuptools         | 75.8.0      |
| six                | 1.17.0      |
| msgpack            | 1.1.0       |
| numcodecs          | 0.15.1      |
| packaging          | 24.2        |
| scipy              | 1.15.1      |

| Component | Info                                                                            |
| --------- | ------------------------------------------------------------------------------- |
| Python    | 3.12.8 \| packaged by conda-forge \| (main, Dec  5 2024, 14:24:40) [GCC 13.3.0] |
| OS        | Linux-6.1.0-31-amd64-x86_64-with-glibc2.36                                      |
| Updated   | 2025-02-12 17:43                                                                |

milos7250 linked a pull request on Feb 12, 2025 that will close this issue.