Cellfinder migration (#101)
* Write blog post about cellfinder migration

* Write full changelog

* Update dev docs

* Update the CLI tool usage instructions

* Create brainglobe-workflows front-facing page

* Pass over wording after a weekend

* Add note about what bg-workflows is

* Apply Adam's code suggestions

* Apply suggestions from code review

Co-authored-by: Adam Tyson <[email protected]>

* Add stub lookup page for individual repositories with dev info

---------

Co-authored-by: Adam Tyson <[email protected]>
willGraham01 and adamltyson authored Nov 24, 2023
1 parent 110a57c commit ff79e55
Showing 18 changed files with 298 additions and 184 deletions.
35 changes: 35 additions & 0 deletions docs/source/blog/version1/cellfinder_migration_live.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
---
blogpost: true
date: Nov 1, 2023
author: Will Graham
location: London, England
category: BrainGlobe-v1
language: English
---

# `cellfinder` has moved: version 1 of `brainglobe-workflows` released

Continuing the [restructuring of BrainGlobe](./version_1_announcement.md), the `cellfinder` command-line tool has moved to a new home, `brainglobe-workflows`.
Please note that we will no longer be providing Docker images for `cellfinder`'s command-line functionality either - if you were previously using the Docker image, please see the advice in the [full changelog](#full-changelog).

Our vision for this new `brainglobe-workflows` package is to provide one package that bundles together several data-analysis pipelines that are run frequently in neuroscience.
By providing a package which knits together the relevant BrainGlobe tools into a single command or Python function, we can reduce the amount of manual work that users have to do to set up and run their own analyses.
It will no longer be necessary to manually install or import three different BrainGlobe tools to do your analysis - if it's provided by the `workflows` package, we will take care of this setup behind the scenes so you can focus on obtaining the results.

For now, `cellfinder` will retain its name, but it will soon be renamed to bring it in line with the convention for additional workflows we will be providing.

From a developer and sustainability perspective, it also provides us with a natural series of workflows to benchmark the BrainGlobe tools against - to help speed up run times, catch any bugs, and check for compatibility breakages before we push future updates.
It also frees up the name "cellfinder" for the backend code that is currently stored in `cellfinder-core` and `cellfinder-napari`.
Much like the changes to [`brainreg`](./brainreg_update_live.md), we are looking to combine `cellfinder-core` and `cellfinder-napari` into a single package, then bundle it with the `BrainGlobe` version 1 release.
Migrating the `cellfinder` command-line tool to `brainglobe-workflows` is the first step to ensure that the analysis workflow it provides remains available to users, in a manner which can still be updated and receive bug reports.

## What do I need to do?

If you were previously using the `cellfinder` command-line tool, you don't need to do anything right now if you want to wait for the full release of BrainGlobe version 1, which will take care of these dependencies for you.
Be aware, however, that the `cellfinder` tool that you have installed will no longer be receiving updates.

If you would like to update to the `cellfinder` command-line tool provided by `brainglobe-workflows`, we recommend you take a look at the instructions in the [full changelog](#full-changelog).

## Full changelog

You can find the [full changelog on the releases page](../../community/releases/v1/cellfinder-migration.md).
2 changes: 0 additions & 2 deletions docs/source/community/developers/conventions.md
@@ -14,5 +14,3 @@ determine the **minimum** set of supported package versions:

In addition to this, the last 24 months of other dependencies should also be
supported.


24 changes: 11 additions & 13 deletions docs/source/community/developers/index.md
@@ -12,13 +12,14 @@ the [BrainGlobe Zulip chat](https://brainglobe.zulipchat.com/).
If, for any reason, you'd rather not reach out in public, feel free to send a direct message on Zulip
to [Adam Tyson](https://github.com/adamltyson), one of the core developers.

Some of our tools have additional information about how data files are organised, where user caches are placed, and similar.
You can view these repositories and the relevant information by heading to the [specific repository developer docs page](./specific_repos.md).

## To contribute a new atlas

To add a new BrainGlobe atlas, please see the guide [here](/documentation/bg-atlasapi/adding-a-new-atlas).

## To contribute code

### Creating a development environment
## Creating a development environment

It is recommended to use `conda` to install a development environment for
BrainGlobe projects. Once you have `conda` installed, the following commands
@@ -76,29 +77,26 @@ For these reasons (and others) every part of all software must be documented as
and all new features must be fully documented.

### Editing the documentation

The documentation is hosted using [GitHub Pages](https://pages.github.com/), and the source can be found at
[GitHub](https://github.com/brainglobe/brainglobe.github.io). Most content is found under `docs/source`, where the
structure mostly mirrors the rendered website. To edit a page, please:
* Fork the repository
* Make edits to the relevant pages
* Create a pull request outlining the changes made

- Fork the repository
- Make edits to the relevant pages
- Create a pull request outlining the changes made

If you aren't sure where the changes should be made, please
[get in touch](https://brainglobe.info/contact.html#contributing).

## Further information

:::{toctree}
:maxdepth: 1
tooling
conventions
testing
new_releases
specific_repos
Code of conduct <https://github.com/brainglobe/.github/blob/main/CODE_OF_CONDUCT.md>
:::

## Specific repository information
:::{toctree}
:maxdepth: 1
repositories/cellfinder-core/index
repositories/cellfinder/index
:::
@@ -0,0 +1,15 @@
# brainglobe-workflows

`brainglobe-workflows` is both a package containing data analysis pipelines that utilise BrainGlobe tools, and a benchmarking suite for those tools.

Users are only expected to interact with the command-line entry points (or equivalent backend functions) that the package provides.
A successful package install does not ship the benchmarking suite by default - indeed, a full clone of the repository is required to allow local running of the benchmarks.

Developers can clone the repository and run a `dev` install (`pip install .[dev]`) to install the developer requirements, in particular [airspeed velocity (`asv`)](https://asv.readthedocs.io/en/v0.6.1/).
This allows the benchmark workflows to be run locally; however, if you don't have a suitably performant machine, they will likely take a long time to run!

## `cellfinder` file paths

All file paths should be defined in `brainglobe_workflows.cellfinder.tools.prep.Paths`.
Any intermediate file paths (i.e., those which are not of interest to the typical end-user) should be prefixed with `tmp__`.
These should then be cleaned up as soon as possible after generation.
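
As a rough illustration of this convention, the sketch below uses a hypothetical `ExamplePaths` class (it is not the real `Paths` class, and the attribute and file names are invented). It assumes the `tmp__` prefix is applied to the attribute names, which makes the intermediate paths easy to collect and delete.

```python
import shutil
import tempfile
from pathlib import Path


class ExamplePaths:
    """Hypothetical stand-in for a paths container; not the real class."""

    def __init__(self, output_dir):
        self.output_dir = Path(output_dir)
        # Of interest to the end-user, so no prefix (name is invented).
        self.detected_points = self.output_dir / "points.xml"
        # Intermediate output, so it carries the tmp__ prefix.
        self.tmp__cubes_dir = self.output_dir / "tmp__cubes"

    def intermediate_paths(self):
        """Return every path whose attribute name starts with tmp__."""
        return [
            path
            for name, path in vars(self).items()
            if name.startswith("tmp__")
        ]


def clean_up(paths):
    """Delete any intermediate files or directories that exist on disk."""
    for path in paths.intermediate_paths():
        if path.is_dir():
            shutil.rmtree(path)
        elif path.exists():
            path.unlink()


with tempfile.TemporaryDirectory() as output_dir:
    paths = ExamplePaths(output_dir)
    paths.tmp__cubes_dir.mkdir()  # pretend a step produced some output
    clean_up(paths)  # remove it as soon as it is no longer needed
    cubes_removed = not paths.tmp__cubes_dir.exists()
```

Keeping every path in one place means a clean-up step does not need to know which workflow stage created which file.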

This file was deleted.

@@ -1,5 +1,58 @@
# cellfinder-core

:::{toctree}
cell-detection
:::
## Cell detection

### Methodology

Cell detection in cellfinder has three stages:

1. 2D filter each image plane independently.
2. 3D filter small batches of planes.
3. Merge detected cell candidate voxels into structures.

#### 2D filtering

Code can be found in `cellfinder_core/detect/filters/plane`.
Each plane of data is filtered independently, and in parallel across a number of processes.

This part of processing performs two tasks:

1. Applies a filter to enhance peaks in the data (`cellfinder_core/detect/filters/plane/classical_filter.py`).
   This consists of (in order):
   1. a median filter (`scipy.signal.medfilt2d`)
   2. a gaussian filter (`scipy.ndimage.gaussian_filter`)
   3. a laplacian filter (`scipy.ndimage.laplace`)
   4. inverting the data
   5. normalising to [0, 1]
   6. scaling to *clipping_value*.

   Because applying several of the filters is more time efficient when done on floating point data types, each plane is cast to `float64` in this step.

2. Works out which areas of the plane are inside or outside of the brain. To do this, the plane is divided into square tiles that have edge length `2 * soma_diameter`. The lower corner tile is assumed to be outside the brain, and any tiles that have a mean intensity less than `1 + mean + (2 * stddev)` of the corner tile are marked as being outside the brain. This speeds up processing in later steps by automatically skipping over tiles marked as outside the brain in this step.
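
The filter chain in the first task can be sketched roughly as follows. This is a minimal illustration rather than the code in `classical_filter.py`: the function name and the `gaussian_sigma` and `median_kernel_size` parameters (and their defaults) are assumptions made for the example.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, laplace
from scipy.signal import medfilt2d


def enhance_peaks(plane, clipping_value, gaussian_sigma=2.5, median_kernel_size=3):
    """Apply the six steps listed above to a single 2D plane."""
    # Cast once up front: the filters run on floating point data.
    filtered = plane.astype(np.float64)
    filtered = medfilt2d(filtered, kernel_size=median_kernel_size)
    filtered = gaussian_filter(filtered, gaussian_sigma)
    filtered = laplace(filtered)
    # Bright blobs give a negative Laplacian, so invert to make them peaks.
    filtered = -filtered
    # Normalise to [0, 1], guarding against a completely flat plane.
    filtered -= filtered.min()
    peak = filtered.max()
    if peak > 0:
        filtered /= peak
    # Scale to the clipping value.
    return filtered * clipping_value


# Example: a small bright square on a dark background.
plane = np.zeros((32, 32), dtype=np.uint16)
plane[14:19, 14:19] = 1000
enhanced = enhance_peaks(plane, clipping_value=500.0)
```

The copy made by `astype` is the `float64` copy of the plane that dominates memory use in this stage.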

Memory usage during 2D filtering, for each plane, is the following:

- The plane itself is read into memory.
- During filtering, a copy of the plane is made and cast to `float64`.
- A small `uint8` mask is created to mark areas of the plane that are inside/outside of the brain.
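
The tile mask can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the cellfinder implementation: the function name is hypothetical, the "corner" tile is taken to be the tile at index `[0, 0]`, and rows or columns that do not fill a whole tile are simply cropped.

```python
import numpy as np


def inside_brain_tiles(plane, soma_diameter):
    """Return a uint8 mask with one entry per tile (1 = inside the brain)."""
    tile = 2 * soma_diameter
    n_rows = plane.shape[0] // tile
    n_cols = plane.shape[1] // tile
    # Crop to a whole number of tiles, then view the plane as a grid of tiles.
    cropped = plane[: n_rows * tile, : n_cols * tile].astype(np.float64)
    tiles = cropped.reshape(n_rows, tile, n_cols, tile)
    tile_means = tiles.mean(axis=(1, 3))
    # The corner tile is assumed to contain only background.
    corner = tiles[0, :, 0, :]
    threshold = 1 + corner.mean() + 2 * corner.std()
    # Tiles with a mean below the threshold are outside the brain.
    return (tile_means >= threshold).astype(np.uint8)


plane = np.zeros((64, 64), dtype=np.uint16)
plane[32:48, 32:48] = 100  # one bright tile-sized region
mask = inside_brain_tiles(plane, soma_diameter=8)
```

Because the mask has one entry per tile rather than per pixel, it is much smaller than the plane itself.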

#### 3D filtering

Code can be found in `cellfinder_core/detect/filters/volume/ball_filter.py`.
Both this step and the structure detection step take place in the main `Python` process, with no parallelism. As the planes are processed in the 2D filtering step, they are passed to this step. When `ball_z_size` planes have been handed over, 3D filtering begins.

The 3D filter stores a 3D array that has depth `ball_z_size`, and contains `ball_z_size` number of planes. This is a small 3D slice of the original data. A spherical kernel runs across the x, y dimensions, and where enough intensity overlaps with the spherical kernel the voxel at the centre of the kernel is marked as being part of a cell. The output of this step is the central plane of the array, with marked cells.
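
To make the idea concrete, here is a simplified sketch of a ball filter applied to a small stack. It is not the code in `ball_filter.py`: the function names and the `overlap_threshold` parameter are assumptions, and `scipy.ndimage.correlate` is used for brevity even though it computes more than the central plane.

```python
import numpy as np
from scipy.ndimage import correlate


def ball_kernel(xy_size, z_size):
    """Binary ellipsoid that spans the full depth of the stack."""
    z, y, x = np.ogrid[:z_size, :xy_size, :xy_size]
    centre_z = (z_size - 1) / 2
    centre_xy = (xy_size - 1) / 2
    distance = (
        ((z - centre_z) / (z_size / 2)) ** 2
        + ((y - centre_xy) / (xy_size / 2)) ** 2
        + ((x - centre_xy) / (xy_size / 2)) ** 2
    )
    return (distance <= 1).astype(np.float64)


def mark_central_plane(stack, kernel, overlap_threshold):
    """Mark voxels of the central plane where the kernel overlaps enough intensity."""
    # Sum the intensity under the kernel at every position.
    overlap = correlate(stack.astype(np.float64), kernel, mode="constant", cval=0.0)
    # Only the central plane sees the whole kernel in z, so only it is returned.
    middle = stack.shape[0] // 2
    return overlap[middle] >= overlap_threshold


ball_z_size = 5
stack = np.zeros((ball_z_size, 21, 21))
stack[:, 8:13, 8:13] = 1.0  # a bright column through the stack
kernel = ball_kernel(xy_size=5, z_size=ball_z_size)
marked = mark_central_plane(stack, kernel, overlap_threshold=0.6 * kernel.sum())
```

Expressing the threshold as a fraction of `kernel.sum()` is a choice made for this example only.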

Memory usage information during 3D filtering:

- `ball_z_size` planes at a time are stored.
- Twice this amount of memory is required to roll the array each time a new array is fed to the 3D filter stage.
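
The factor of two can be illustrated with `numpy.roll`, which returns a new array rather than shifting data in place. The sketch below assumes the buffer is advanced with `numpy.roll`; the `push_plane` helper is hypothetical.

```python
import numpy as np


def push_plane(volume, plane):
    """Drop the oldest plane and append a new one at the end of the buffer."""
    # np.roll allocates a second array of the same size as `volume`,
    # so both copies exist in memory until the old one is released.
    rolled = np.roll(volume, -1, axis=0)
    rolled[-1] = plane
    return rolled


ball_z_size = 3
volume = np.zeros((ball_z_size, 4, 4))
for value in (1.0, 2.0, 3.0, 4.0):
    volume = push_plane(volume, np.full((4, 4), value))
```

After the loop, the buffer holds the three most recent planes in order.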

#### Structure detection

Code can be found in `cellfinder_core/detect/filters/volume/structure_detection.py`.
This step takes the planes output from 3D filtering with marked cell voxels, and detects collections of voxels that are adjacent.

Memory usage information during structure detection:

- Two planes are cast to `uint64` and are stored at the same time.
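
Conceptually, this step is connected-component labelling. The sketch below uses `scipy.ndimage.label` as a stand-in to show the idea on a whole volume at once; it is not cellfinder's own implementation, which works through the planes as they arrive, and the function name is hypothetical.

```python
import numpy as np
from scipy.ndimage import center_of_mass, label


def detect_structures(marked):
    """Group adjacent marked voxels and return the centre of each group."""
    marked = marked.astype(np.uint8)
    # The default structuring element joins voxels that share a face.
    labels, n_structures = label(marked)
    centres = center_of_mass(marked, labels, np.arange(1, n_structures + 1))
    return n_structures, centres


marked = np.zeros((3, 10, 10), dtype=bool)
marked[1, 2, 2] = True
marked[1, 2, 3] = True  # adjacent to the voxel above: same structure
marked[2, 7, 7] = True  # isolated voxel: a second structure
n_structures, centres = detect_structures(marked)
```

Each centre is given in `(z, y, x)` order, matching the axis order of the input array.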

This file was deleted.

13 changes: 13 additions & 0 deletions docs/source/community/developers/specific_repos.md
@@ -0,0 +1,13 @@
# Specific Repository Developer Documentation

To complement BrainGlobe's [general developer guidelines](./index.md), some of our tools and repositories contain additional developer information that extends the general guidelines.
Typically this information concerns the directory to store user caches, conventions for naming or storing data in the repository for testing purposes, and similar.
You can check the repositories with additional developer information by following the links below.

:::{toctree}
:maxdepth: 1
:caption: BrainGlobe tools with additional developer information
:glob:

repositories/**
:::
72 changes: 72 additions & 0 deletions docs/source/community/releases/v1/cellfinder-migration.md
@@ -0,0 +1,72 @@
# Version 1: Cellfinder migration and `brainglobe-workflows`

## Disambiguation

Due to some historical naming decisions, there are some unfortunate clashes of language between packages, workflows, and tools.
We disambiguate these terms before proceeding with the changelog:

- "workflow" refers to an analysis pipeline - a sequence of data analysis steps to process data and produce an output, using a combination of BrainGlobe tools.
- `brainglobe-workflows` refers to the new package released with this update. [Source code on GitHub](https://github.com/brainglobe/brainglobe-workflows).
- cellfinder (package) refers to the [package named cellfinder on PyPI](https://pypi.org/project/cellfinder/0.8.0/). Specifically, to any versions `<1.0.0` of this package.
- cellfinder (repository) refers to the [source code on GitHub](https://github.com/brainglobe/cellfinder).
- `cellfinder` (CLI) refers to the `cellfinder` command-line tool. This is provided by both cellfinder (package) and `brainglobe-workflows`.
- cellfinder (Docker image) refers to the [Docker image](https://hub.docker.com/r/adamltyson/cellfinder) that allows users to mount their data and run `cellfinder` (CLI).

## Python version support

With this update, we are dropping `cellfinder` (CLI) support for Python 3.8.
Whilst the workflow should still run on Python 3.8 in the immediate future, the oldest supported version is now Python 3.9.

## cellfinder (package)

This package should now be considered deprecated to users.
Consequently, the `cellfinder` (CLI) tool provided by this package will no longer receive updates.
If you want to keep the `cellfinder` (CLI) up to date, you will need to [install `brainglobe-workflows`](#updating-to-the-new-cellfinder-cli-tool) and use the `cellfinder` (CLI) provided there.

### Future-warning

For development purposes, cellfinder (package) **will later become** the new name for the combined `cellfinder-core` and `cellfinder-napari` package.
This release of the package will be tagged with version `>=1.0.0`, to indicate that a breaking change has occurred, and there will be a corresponding blog post published.

If you have not moved to using `cellfinder` (CLI) provided by `brainglobe-workflows`, this change will break your analysis pipelines if you try to update.
You will need to install either `brainglobe-workflows` or BrainGlobe version 1, or prevent your package manager from attempting to update cellfinder (package).
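
If you are unsure which side of this change an environment is on, a small check along the following lines may help. It is only a sketch: the helper names are hypothetical, and it assumes the version string starts with a plain integer major version.

```python
from importlib import metadata


def is_legacy_cellfinder(version):
    """True for versions <1.0.0, i.e. the package that ships the old CLI."""
    major = int(version.split(".")[0])
    return major < 1


def installed_cellfinder_is_legacy():
    """Return True/False for the installed package, or None if it is absent."""
    try:
        return is_legacy_cellfinder(metadata.version("cellfinder"))
    except metadata.PackageNotFoundError:
        return None


status = installed_cellfinder_is_legacy()
```

A result of `True` means the environment still contains the deprecated package and would be affected by an update to `>=1.0.0`.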

## cellfinder (repository)

For development purposes, the cellfinder repository will be recycled to include the backend code that is currently stored in `cellfinder-core` and `cellfinder-napari`.
Similarly to `brainreg`, we will be combining the functionality of the backend code and visualisation tool into a single package.

## cellfinder (Docker image)

We will no longer be providing Docker images for `cellfinder` (CLI).
We recommend that you install `brainglobe-workflows`, or BrainGlobe version 1 when it is released, into a clean virtual environment as an alternative.

## brainglobe-workflows

This package now provides the `cellfinder` (CLI) tool, and is the recommended way to run the analysis pipeline.
It can be installed via pip - [see the instructions below](#updating-to-the-new-cellfinder-cli-tool).

This package will continue to grow to include additional analysis pipelines and workflows for data analysis in neuroscience.

## Updating to the new `cellfinder` (CLI) tool

In order to update to the new `cellfinder` (CLI) tool provided by `brainglobe-workflows`, follow the steps below:

1. Uninstall cellfinder (package) from your Python environment. This should be a case of running `pip uninstall cellfinder` on the command line, with the environment activated.
1. (Optional) verify that the `cellfinder` (CLI) tool has been removed. If the output of `which cellfinder` is nothing, then the old tool has been successfully removed.
1. Install `brainglobe-workflows` into your environment. Again, we recommend installing via pip: `pip install brainglobe-workflows`.
1. (Optional) verify that the (new) `cellfinder` (CLI) tool has been installed. The output of `which cellfinder` should display a path to your activated environment, which ends inside the `brainglobe-workflows` package.

If you are making a clean install into a fresh environment, there is no need to run the first two (uninstall) steps.
Simply `pip install brainglobe-workflows` into your new environment.
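
The optional verification steps can also be run from Python, which is handy on platforms without `which`. The snippet below is a sketch using only the standard library; the function name is hypothetical.

```python
import shutil
from importlib import metadata


def describe_cellfinder_install():
    """Report where the CLI lives and which related packages are installed."""
    # Equivalent to `which cellfinder`: None means no CLI is on the PATH.
    cli_path = shutil.which("cellfinder")
    versions = {}
    for distribution in ("cellfinder", "brainglobe-workflows"):
        try:
            versions[distribution] = metadata.version(distribution)
        except metadata.PackageNotFoundError:
            versions[distribution] = None
    return cli_path, versions


cli_path, versions = describe_cellfinder_install()
print(f"cellfinder CLI: {cli_path}")
for distribution, version in versions.items():
    print(f"{distribution}: {version or 'not installed'}")
```

After a successful update you would expect a path for the CLI, a version for `brainglobe-workflows`, and `not installed` for the old cellfinder (package).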

As mentioned in the main blog post, `cellfinder` (CLI) will also be getting a new name in the near future, as additional workflow tools are added.
Its name isn't changing right now, but keep an eye on this space.

### Delaying updating

We **strongly recommend** you move to using `brainglobe-workflows` if you wish to continue using the `cellfinder` (CLI) tool.

If you really want to keep using the old cellfinder (package), you will need to prevent further updates to it.
The `cellfinder` (CLI) provided by cellfinder (package) will continue to work so long as you do not update. However, you should consider any versions of `cellfinder` (CLI) provided by cellfinder (package) unmaintained.
You will eventually run into the [name conflicts](#cellfinder-repository) listed above, as BrainGlobe version 1 starts to roll out.
4 changes: 4 additions & 0 deletions docs/source/community/releases/v1/index.md
@@ -9,6 +9,9 @@ You can follow the links provided for more information; including a listing of r
|--------|:-:|
brainreg and brainreg-napari have been merged into a single package | [Further info](brainreg.md#brainreg-and-brainreg-napari) |
brainreg-segment has been renamed to brainglobe-segmentation | [Further info](brainreg.md#brainreg-segment) |
The `cellfinder` command-line-interface has been moved into `brainglobe-workflows` | [Further info](cellfinder-migration.md) |
The cellfinder package is deprecated - it will later be recycled to merge some backend functionality | [Further info](cellfinder-migration.md#cellfinder-repository) |
The cellfinder Docker image is discontinued | [Further info](cellfinder-migration.md#cellfinder-docker-image) |

## Complete index

@@ -17,4 +20,5 @@ brainreg-segment has been renamed to brainglobe-segmentation | [Further info](br
:glob:
brainreg
cellfinder-migration
```