-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: Recommended Chunk Shape doesn't take into account compound dtypes #1122
Comments
Update: |
@rly, when you have a chance could you provide some guidance on this issue? How can I get a builder from an in-memory nwbfile? Or, if that is too difficult, how can I get access to the schema for a given neurodata object? |
@pauladkisson I do not totally follow what happened here. If I understand correctly, An in-memory (not yet written) NWB file has no builders until you call import pynwb
manager = pynwb.get_manager()
manager.type_map.namespace_catalog.get_spec("core", "PlaneSegmentation").get_dataset("pixel_mask") |
Well, it isn't # define the data shape
if 'shape' in io_settings:
data_shape = io_settings.pop('shape')
elif hasattr(data, 'shape'):
data_shape = data.shape
elif isinstance(dtype, np.dtype) and len(dtype) > 1: # check if compound dtype
data_shape = (len(data),)
else:
data_shape = get_data_shape(data) And I'm not sure how exactly, but some recent changes uncovered this block of code so that it runs (
I don't think so. |
Another approach that may be more robust and which does not require knowledge of the spec, but does require knowledge of the pynwb object attributes: import pynwb
from pynwb.testing.mock.ophys import mock_PlaneSegmentation
manager = pynwb.get_manager()
ps = mock_PlaneSegmentation()
manager.type_map.get_map(ps).get_attr_spec("pixel_mask") |
After our meeting, we agreed that the way forward should be to just build the nwbfile, grab the dtype from the builder, and then figure out a mapping between the builder and the neurodata_objects that they correspond to. manager = pynwb.get_manager()
builder = manager.build(nwbfile)
# figure out how to map to the neurodata_object in the builder
dtype = neurodata_object_in_builder.dtype |
What happened?
Dev tests are failing: https://github.com/catalystneuro/neuroconv/actions/runs/11567195585/job/32197117750
Tracked this down to an update with hdmf that uncovered these lines: https://github.com/hdmf-dev/hdmf/blob/dev/src/hdmf/backends/hdf5/h5tools.py#L1427-L1435
And our chunking recommendation based on the data shape here: https://github.com/catalystneuro/neuroconv/blob/main/src/neuroconv/tools/nwb_helpers/_configuration_models/_base_dataset_io.py#L263
Notice how in hdmf, if the data has a compound dtype, the shape is
(len(data),)
, but in neuroconv the shape is alwaysget_data_shape(data)
.This throws an error when they mismatch in the case of a Caiman pixel_mask, which is a compound dtype in NWB.
The initial solution that I came up with was to load the NWB schema to figure out if a dataset is compound or not, but I was having some trouble finding the right code to load in the schema...
Steps to Reproduce
Traceback
No response
Operating System
Linux
Python Executable
Conda
Python Version
3.9
Package Versions
No response
Code of Conduct
The text was updated successfully, but these errors were encountered: