
[Draft] Ecephys integration of backend configuration #578

Closed. Wants to merge 23 commits.

Commits (23):
96bb078 - first pass over ecephys integration (CodyCBakerPhD, Sep 25, 2023)
19f8a9b - [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Sep 25, 2023)
663086f - restore backcompatability (Sep 30, 2023)
35e066b - [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Sep 30, 2023)
0058e35 - Update CHANGELOG.md (CodyCBakerPhD, Sep 30, 2023)
26cb2d6 - Merge branch 'main' into integrate_with_interfaces_and_converter (CodyCBakerPhD, Jan 2, 2024)
aebcf1a - [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Jan 2, 2024)
bc9d58b - debug by adding object type router with call sign adjustments; added … (CodyCBakerPhD, Jan 2, 2024)
4277785 - [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Jan 2, 2024)
b2eb07b - Merge branch 'main' into set_data_io_debug (CodyCBakerPhD, Jan 2, 2024)
6c170de - Merge branch 'main' into integrate_with_interfaces_and_converter (CodyCBakerPhD, Jan 2, 2024)
d9fe5d0 - fix requirement URL (CodyCBakerPhD, Jan 2, 2024)
8e26844 - Merge branch 'set_data_io_debug' of https://github.com/catalystneuro/… (CodyCBakerPhD, Jan 2, 2024)
b5aa9a2 - add changelog (CodyCBakerPhD, Jan 2, 2024)
570ae59 - fixing bad conflicts and imports (CodyCBakerPhD, Jan 2, 2024)
8a1c732 - fix bad imports and resolve bad merge conflict (CodyCBakerPhD, Jan 2, 2024)
d75044e - [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Jan 2, 2024)
ab7a7fa - Merge branch 'set_data_io_debug' into integrate_with_interfaces_and_c… (CodyCBakerPhD, Jan 2, 2024)
662b5cb - don't need iterator parametrizations (CodyCBakerPhD, Jan 2, 2024)
4a07c5a - don't need iterator options (CodyCBakerPhD, Jan 2, 2024)
37a6dac - [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Jan 2, 2024)
1f528c7 - don't need buffer arguments (CodyCBakerPhD, Jan 2, 2024)
161d314 - Merge branch 'set_data_io_debug' into integrate_with_interfaces_and_c… (CodyCBakerPhD, Jan 2, 2024)
Files changed:
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -9,12 +9,14 @@
* Changed the `Suite2pSegmentationInterface` to support multiple plane segmentation outputs. The interface now has a `plane_name` and `channel_name` arguments to determine which plane output and channel trace add to the NWBFile. [PR #601](https://github.com/catalystneuro/neuroconv/pull/601)
* Added tool function `configure_datasets` for configuring all datasets of an in-memory `NWBFile` to be backend specific. [PR #571](https://github.com/catalystneuro/neuroconv/pull/571)
* Added `LightningPoseConverter` to add pose estimation data and the original and the optional labeled video added as ImageSeries to NWB. [PR #633](https://github.com/catalystneuro/neuroconv/pull/633)
* Integrated backend configuration with the interfaces, converters, and write tools for ecephys. [PR #578](https://github.com/catalystneuro/neuroconv/pull/578)

### Improvements
* `nwbinspector` has been removed as a minimal dependency. It becomes an extra (optional) dependency with `neuroconv[dandi]`. [PR #672](https://github.com/catalystneuro/neuroconv/pull/672)
* Added a `from_nwbfile` class method constructor to all `BackendConfiguration` models. [PR #673](https://github.com/catalystneuro/neuroconv/pull/673)
* Added compression to `FicTracDataInterface`. [PR #678](https://github.com/catalystneuro/neuroconv/pull/678)
* Exposed `block_index` to all OpenEphys interfaces. [PR #695](https://github.com/catalystneuro/neuroconv/pull/695)
* Added support for `DynamicTable` columns in the `configure_backend` tool function. [PR #700](https://github.com/catalystneuro/neuroconv/pull/700)



2 changes: 1 addition & 1 deletion requirements-minimal.txt
@@ -3,7 +3,7 @@ jsonschema>=3.2.0
PyYAML>=5.4
scipy>=1.4.1
h5py>=3.9.0
hdmf>=3.11.0
hdmf @ git+https://github.com/hdmf-dev/hdmf.git@dev
hdmf_zarr>=0.4.0
pynwb>=2.3.2;python_version>='3.8'
pydantic>=1.10.13,<2.0.0
47 changes: 45 additions & 2 deletions src/neuroconv/basedatainterface.py
@@ -2,11 +2,18 @@
import warnings
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List, Optional
from typing import List, Literal, Optional, Union

from pynwb import NWBFile

from .tools.nwb_helpers import make_nwbfile_from_metadata, make_or_load_nwbfile
from .tools.nwb_helpers import (
HDF5BackendConfiguration,
ZarrBackendConfiguration,
configure_backend,
get_default_backend_configuration,
make_nwbfile_from_metadata,
make_or_load_nwbfile,
)
from .utils import get_schema_from_method_signature, load_dict_from_file
from .utils.dict import DeepDict

@@ -51,12 +58,36 @@ def create_nwbfile(self, metadata=None, **conversion_options) -> NWBFile:
def add_to_nwbfile(self, nwbfile: NWBFile, **conversion_options) -> None:
raise NotImplementedError()

def get_default_backend_configuration(
self, backend: Literal["hdf5", "zarr"] = "hdf5", metadata: Optional[dict] = None, **conversion_options
) -> Union[HDF5BackendConfiguration, ZarrBackendConfiguration]:
"""
Fill and return a default backend configuration to serve as a starting point for further customization.

Parameters
----------
backend : "hdf5" or "zarr", default: "hdf5"
The type of backend used to create the file.
metadata : dict, optional
Metadata dictionary with information used to create the NWBFile when one does not exist or overwrite=True.
conversion_options : dict, optional
Similar to source_data, a dictionary containing keywords for each interface for which non-default
conversion specification is requested.
"""
if metadata is None:
metadata = self.get_metadata()

with make_or_load_nwbfile(metadata=metadata, verbose=self.verbose) as nwbfile:
self.add_to_nwbfile(nwbfile=nwbfile, metadata=metadata, **conversion_options)
return get_default_backend_configuration(nwbfile=nwbfile, backend=backend)

def run_conversion(
self,
nwbfile_path: Optional[str] = None,
nwbfile: Optional[NWBFile] = None,
metadata: Optional[dict] = None,
overwrite: bool = False,
backend: Union[Literal["hdf5", "zarr"], HDF5BackendConfiguration, ZarrBackendConfiguration] = "hdf5",
**conversion_options,
):
"""
@@ -74,6 +105,11 @@ def run_conversion(
overwrite : bool, default: False
Whether to overwrite the NWBFile if one exists at the nwbfile_path.
The default is False (append mode).
backend : "hdf5", "zarr", a HDF5BackendConfiguration, or a ZarrBackendConfiguration, default: "hdf5"
If "hdf5" or "zarr", this type of backend will be used to create the file,
with all datasets using the default values.
To customize, call the `.get_default_backend_configuration(...)` method, modify the returned
BackendConfiguration object, and pass that instead.
"""
if nwbfile_path is None:
warnings.warn(
@@ -84,6 +120,12 @@

if metadata is None:
metadata = self.get_metadata()
if isinstance(backend, str):
backend_configuration = self.get_default_backend_configuration(
backend=backend, metadata=metadata, conversion_options=conversion_options
)
else:
backend_configuration = backend

with make_or_load_nwbfile(
nwbfile_path=nwbfile_path,
Expand All @@ -93,3 +135,4 @@ def run_conversion(
verbose=getattr(self, "verbose", False),
) as nwbfile_out:
self.add_to_nwbfile(nwbfile_out, metadata=metadata, **conversion_options)
configure_backend(nwbfile=nwbfile_out, backend_configuration=backend_configuration)
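
For reference, a minimal sketch of how the new `backend` argument on a single interface is meant to be used. The interface choice and file paths are placeholders, and the `dataset_configurations` / `compression_method` attributes are assumed from the backend-configuration models (introduced in PR #571), not shown in this diff:

```python
from datetime import datetime

from neuroconv.datainterfaces import SpikeGLXRecordingInterface  # placeholder interface choice

interface = SpikeGLXRecordingInterface(file_path="session_g0_t0.imec0.ap.bin")  # placeholder path
metadata = interface.get_metadata()
metadata["NWBFile"].update(session_start_time=datetime(2024, 1, 2, 12, 0, 0))

# Simplest path: pick a backend by name and accept the default chunking/compression for every dataset.
interface.run_conversion(nwbfile_path="session.nwb", metadata=metadata, backend="hdf5")

# Customized path: fetch the default configuration, edit it, then pass the object back in.
backend_configuration = interface.get_default_backend_configuration(backend="hdf5", metadata=metadata)
for dataset_configuration in backend_configuration.dataset_configurations.values():  # assumed attribute
    dataset_configuration.compression_method = "gzip"  # assumed field name
interface.run_conversion(
    nwbfile_path="session.nwb",
    metadata=metadata,
    overwrite=True,
    backend=backend_configuration,
)
```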
(additional changed file; name not shown in this view)
@@ -26,10 +26,10 @@ def add_to_nwbfile(
starting_time: Optional[float] = None,
write_as: Literal["raw", "lfp", "processed"] = "lfp",
write_electrical_series: bool = True,
compression: Optional[str] = "gzip",
compression_opts: Optional[int] = None,
iterator_type: str = "v2",
iterator_opts: Optional[dict] = None,
compression: Optional[str] = None, # TODO: remove on or after March 1, 2024
compression_opts: Optional[int] = None, # TODO: remove on or after March 1, 2024
iterator_type: str = None, # TODO: remove on or after March 1, 2024
iterator_opts: Optional[dict] = None, # TODO: remove on or after March 1, 2024
):
return super().add_to_nwbfile(
nwbfile=nwbfile,
@@ -38,8 +38,5 @@
starting_time=starting_time,
write_as=write_as,
write_electrical_series=write_electrical_series,
compression=compression,
compression_opts=compression_opts,
iterator_type=iterator_type,
iterator_opts=iterator_opts,
compression=compression, # TODO: remove on or after March 1, 2024
)
(additional changed file; name not shown in this view)
@@ -303,10 +303,10 @@ def add_to_nwbfile(
starting_time: Optional[float] = None,
write_as: Literal["raw", "lfp", "processed"] = "raw",
write_electrical_series: bool = True,
compression: Optional[str] = "gzip",
compression_opts: Optional[int] = None,
iterator_type: str = "v2",
iterator_opts: Optional[dict] = None,
compression: Optional[str] = None, # TODO: remove on or after March 1, 2024
compression_opts: Optional[int] = None, # TODO: remove on or after March 1, 2024
iterator_type: str = None, # TODO: remove on or after March 1, 2024
iterator_opts: Optional[dict] = None, # TODO: remove on or after March 1, 2024
):
"""
Primary function for converting raw (unprocessed) RecordingExtractor data to the NWB standard.
@@ -329,35 +329,6 @@
write_electrical_series : bool, default: True
Electrical series are written in acquisition. If False, only device, electrode_groups,
and electrodes are written to NWB.
compression : {'gzip', 'lzf', None}
Type of compression to use.
Set to None to disable all compression.
compression_opts : int, default: 4
Only applies to compression="gzip". Controls the level of the GZIP.
iterator_type : {'v2', 'v1'}
The type of DataChunkIterator to use.
'v1' is the original DataChunkIterator of the hdmf data_utils.
'v2' is the locally developed RecordingExtractorDataChunkIterator, which offers full control over chunking.
iterator_opts : dict, optional
Dictionary of options for the RecordingExtractorDataChunkIterator (iterator_type='v2').
Valid options are
buffer_gb : float, default: 1.0
In units of GB. Recommended to be as much free RAM as available. Automatically calculates suitable
buffer shape.
buffer_shape : tuple, optional
Manual specification of buffer shape to return on each iteration.
Must be a multiple of chunk_shape along each axis.
Cannot be set if `buffer_gb` is specified.
chunk_mb : float. default: 1.0
Should be below 1 MB. Automatically calculates suitable chunk shape.
chunk_shape : tuple, optional
Manual specification of the internal chunk shape for the HDF5 dataset.
Cannot be set if `chunk_mb` is also specified.
display_progress : bool, default: False
Display a progress bar with iteration rate and estimated completion time.
progress_bar_options : dict, optional
Dictionary of keyword arguments to be passed directly to tqdm.
See https://github.com/tqdm/tqdm#parameters for options.
"""
from ...tools.spikeinterface import add_recording

@@ -374,8 +345,5 @@
write_as=write_as,
write_electrical_series=write_electrical_series,
es_key=self.es_key,
compression=compression,
compression_opts=compression_opts,
iterator_type=iterator_type,
iterator_opts=iterator_opts,
compression=None, # TODO: remove on or after March 1, 2024
)
59 changes: 54 additions & 5 deletions src/neuroconv/nwbconverter.py
@@ -1,13 +1,20 @@
import json
from collections import Counter
from pathlib import Path
from typing import Dict, List, Optional, Union
from typing import Dict, List, Literal, Optional, Union

from jsonschema import validate
from pynwb import NWBFile

from .basedatainterface import BaseDataInterface
from .tools.nwb_helpers import get_default_nwbfile_metadata, make_or_load_nwbfile
from .tools.nwb_helpers import (
HDF5BackendConfiguration,
ZarrBackendConfiguration,
configure_backend,
get_default_backend_configuration,
get_default_nwbfile_metadata,
make_or_load_nwbfile,
)
from .utils import (
dict_deep_update,
fill_defaults,
@@ -115,16 +122,48 @@ def add_to_nwbfile(self, nwbfile: NWBFile, metadata, conversion_options: Optiona
nwbfile=nwbfile, metadata=metadata, **conversion_options.get(interface_name, dict())
)

def get_default_backend_configuration(
self,
backend: Literal["hdf5", "zarr"] = "hdf5",
metadata: Optional[dict] = None,
conversion_options: Optional[dict] = None,
) -> Union[HDF5BackendConfiguration, ZarrBackendConfiguration]:
"""
Fill and return a default backend configuration to serve as a starting point for further customization.

Parameters
----------
backend : "hdf5" or "zarr", default: "hdf5"
The type of backend used to create the file.
metadata : dict, optional
Metadata dictionary with information used to create the NWBFile when one does not exist or overwrite=True.
conversion_options : dict, optional
Similar to source_data, a dictionary containing keywords for each interface for which non-default
conversion specification is requested.
"""
if metadata is None:
metadata = self.get_metadata()
self.validate_metadata(metadata=metadata)
self.validate_conversion_options(conversion_options=conversion_options)

self.temporally_align_data_interfaces() # Might not be entirely relevant for the backend, but keeping it anyway

with make_or_load_nwbfile(metadata=metadata, verbose=self.verbose) as nwbfile:
self.add_to_nwbfile(nwbfile=nwbfile, metadata=metadata, conversion_options=conversion_options)
return get_default_backend_configuration(nwbfile=nwbfile, backend=backend)

def run_conversion(
self,
nwbfile_path: Optional[str] = None,
nwbfile: Optional[NWBFile] = None,
metadata: Optional[dict] = None,
overwrite: bool = False,
conversion_options: Optional[dict] = None,
backend: Union[Literal["hdf5", "zarr"], HDF5BackendConfiguration, ZarrBackendConfiguration] = "hdf5",
) -> None:
"""
Run the NWB conversion over all the instantiated data interfaces.

Parameters
----------
nwbfile_path : FilePathType
@@ -140,12 +179,21 @@ def run_conversion(
conversion_options : dict, optional
Similar to source_data, a dictionary containing keywords for each interface for which non-default
conversion specification is requested.
backend : "hdf5", "zarr", a HDF5BackendConfiguration, or a ZarrBackendConfiguration, default: "hdf5"
If "hdf5" or "zarr", this type of backend will be used to create the file,
with all datasets using the default values.
To customize, call the `.get_default_backend_configuration(...)` method, modify the returned
BackendConfiguration object, and pass that instead.
"""
if metadata is None:
metadata = self.get_metadata()

if isinstance(backend, str):
backend_configuration = self.get_default_backend_configuration(
backend=backend, metadata=metadata, conversion_options=conversion_options
)
else:
backend_configuration = backend
self.validate_metadata(metadata=metadata)

self.validate_conversion_options(conversion_options=conversion_options)

self.temporally_align_data_interfaces()
@@ -157,7 +205,8 @@
overwrite=overwrite,
verbose=self.verbose,
) as nwbfile_out:
self.add_to_nwbfile(nwbfile_out, metadata, conversion_options)
self.add_to_nwbfile(nwbfile=nwbfile_out, metadata=metadata, conversion_options=conversion_options)
configure_backend(nwbfile=nwbfile_out, backend_configuration=backend_configuration)

def temporally_align_data_interfaces(self):
"""Override this method to implement custom alignment"""
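
At the converter level the flow mirrors the interface-level sketch above, with `conversion_options` forwarded when building the default configuration so the in-memory file used for sizing matches what will actually be written. The converter class, interface choices, source paths, and conversion options below are illustrative only:

```python
from neuroconv import NWBConverter
from neuroconv.datainterfaces import PhySortingInterface, SpikeGLXRecordingInterface  # illustrative choices


class ExampleEcephysConverter(NWBConverter):
    data_interface_classes = dict(
        Recording=SpikeGLXRecordingInterface,
        Sorting=PhySortingInterface,
    )


source_data = dict(
    Recording=dict(file_path="session_g0_t0.imec0.ap.bin"),  # placeholder paths
    Sorting=dict(folder_path="phy_output/"),
)
converter = ExampleEcephysConverter(source_data=source_data)
metadata = converter.get_metadata()
conversion_options = dict(Recording=dict(stub_test=True))  # illustrative per-interface option

# Build the default Zarr layout from the same conversion options, edit if desired, then write.
backend_configuration = converter.get_default_backend_configuration(
    backend="zarr", metadata=metadata, conversion_options=conversion_options
)
converter.run_conversion(
    nwbfile_path="session.nwb.zarr",
    metadata=metadata,
    overwrite=True,
    conversion_options=conversion_options,
    backend=backend_configuration,
)
```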
14 changes: 11 additions & 3 deletions src/neuroconv/tools/nwb_helpers/__init__.py
@@ -1,4 +1,8 @@
"""Collection of Pydantic models and helper functions for configuring dataset IO parameters for different backends."""
# Mark these imports as private to avoid polluting the namespace; only used in global BACKEND_NWB_IO mapping
from hdmf_zarr import NWBZarrIO as _NWBZarrIO
from pynwb import NWBHDF5IO as _NWBHDF5IO

from ._backend_configuration import get_default_backend_configuration
from ._configuration_models._base_backend import BackendConfiguration
from ._configuration_models._base_dataset_io import DatasetInfo, DatasetIOConfiguration
@@ -13,7 +17,7 @@
ZarrDatasetIOConfiguration,
)
from ._configure_backend import configure_backend
from ._dataset_configuration import get_default_dataset_io_configurations
from ._dataset_io_configuration import get_default_dataset_io_configurations
from ._metadata_and_file_helpers import (
add_device_from_metadata,
get_default_nwbfile_metadata,
@@ -24,13 +28,17 @@

BACKEND_CONFIGURATIONS = dict(hdf5=HDF5BackendConfiguration, zarr=ZarrBackendConfiguration)
DATASET_IO_CONFIGURATIONS = dict(hdf5=HDF5DatasetIOConfiguration, zarr=ZarrDatasetIOConfiguration)
BACKEND_NWB_IO = dict(hdf5=_NWBHDF5IO, zarr=_NWBZarrIO)

__all__ = [
"AVAILABLE_HDF5_COMPRESSION_METHODS",
"AVAILABLE_ZARR_COMPRESSION_METHODS",
"BACKEND_CONFIGURATIONS",
"DATASET_IO_CONFIGURATIONS",
"BACKEND_NWB_IO",
"get_default_backend_configuration",
"get_default_dataset_io_configurations",
"configure_backend",
"AVAILABLE_HDF5_COMPRESSION_METHODS",
"AVAILABLE_ZARR_COMPRESSION_METHODS",
"BackendConfiguration",
"DatasetIOConfiguration",
"get_default_dataset_io_configurations",
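
The new `BACKEND_NWB_IO` mapping gives downstream write helpers a single place to resolve the IO class for a backend string. A small sketch of the intended lookup; the standalone write step is an assumption about how a helper would consume the mapping:

```python
from datetime import datetime
from uuid import uuid4

from pynwb import NWBFile

from neuroconv.tools.nwb_helpers import BACKEND_NWB_IO

nwbfile = NWBFile(
    session_description="demo",
    identifier=str(uuid4()),
    session_start_time=datetime.now().astimezone(),
)

backend = "hdf5"  # or "zarr"
io_class = BACKEND_NWB_IO[backend]  # pynwb.NWBHDF5IO or hdmf_zarr.NWBZarrIO

with io_class(path="demo.nwb", mode="w") as io:
    io.write(nwbfile)
```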
(additional changed file; name not shown in this view)
@@ -6,7 +6,7 @@
from pynwb import NWBFile

from ._base_dataset_io import DatasetIOConfiguration
from .._dataset_configuration import get_default_dataset_io_configurations
from .._dataset_io_configuration import get_default_dataset_io_configurations


class BackendConfiguration(BaseModel):
15 changes: 13 additions & 2 deletions src/neuroconv/tools/nwb_helpers/_configure_backend.py
@@ -1,7 +1,8 @@
"""Collection of helper functions related to configuration of datasets dependent on backend."""
from typing import Union

from pynwb import NWBFile
from hdmf.common import Data
from pynwb import NWBFile, TimeSeries

from ._configuration_models._hdf5_backend import HDF5BackendConfiguration
from ._configuration_models._zarr_backend import ZarrBackendConfiguration
@@ -21,4 +22,14 @@ def configure_backend(

# TODO: update buffer shape in iterator, if present

nwbfile_objects[object_id].set_data_io(dataset_name=dataset_name, data_io_class=data_io_class, **data_io_kwargs)
if isinstance(nwbfile_objects[object_id], Data):
nwbfile_objects[object_id].set_data_io(data_io_class=data_io_class, data_io_kwargs=data_io_kwargs)
elif isinstance(nwbfile_objects[object_id], TimeSeries):
nwbfile_objects[object_id].set_data_io(
dataset_name=dataset_name, data_io_class=data_io_class, **data_io_kwargs
)
else: # Strictly speaking, it would be odd if a backend_configuration led to this, but might as well be safe
raise NotImplementedError(
f"Unsupported object type {type(nwbfile_objects[object_id])} for backend "
f"configuration of {nwbfile_objects[object_id].name}!"
)
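
The dispatch above covers the two object families the default configurations currently target: `Data` subclasses (e.g. `DynamicTable` columns, per PR #700) and `TimeSeries` datasets (`data` / `timestamps`). A minimal sketch of driving `configure_backend` directly, outside of `run_conversion`:

```python
from datetime import datetime
from uuid import uuid4

import numpy as np
from pynwb import NWBFile, NWBHDF5IO, TimeSeries

from neuroconv.tools.nwb_helpers import configure_backend, get_default_backend_configuration

nwbfile = NWBFile(
    session_description="demo",
    identifier=str(uuid4()),
    session_start_time=datetime.now().astimezone(),
)
nwbfile.add_acquisition(
    TimeSeries(
        name="raw_signal",
        data=np.random.randn(10_000, 4),
        unit="volts",
        rate=30_000.0,
        starting_time=0.0,
    )
)

backend_configuration = get_default_backend_configuration(nwbfile=nwbfile, backend="hdf5")
# ...optionally edit per-dataset chunking/compression here before applying it...
configure_backend(nwbfile=nwbfile, backend_configuration=backend_configuration)

with NWBHDF5IO(path="demo.nwb", mode="w") as io:
    io.write(nwbfile)  # datasets are written using the IO settings applied above
```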