Skip to content

Enable patchwise training and prediction #135

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 169 commits into
base: main
Choose a base branch
from

Conversation

davidwilby
Copy link
Collaborator

Hey @tom-andersson - at long last, the long-awaited patchwise training and prediction feature that @nilsleh and @MartinSJRogers have been working on.

This PR adds patching capabilities to DeepSensor during training and inference.

Training

Optional args patching_strategy, patch_size, stride and num_samples_per_date are added to TaskLoader.__call__.

There are two available patching strategies: random_window and sliding_window. The random_window option randomly selects points in the x1 and x2 extent as the centroid of the patch. The number of patches is defined by the num_samples_per_date argument. The sliding_window function starts in the top left of the dataset and convolves from left to right and top to bottom over the data using the user-defined patch_size and stride.

TaskLoader.__call__ now contains additional conditional logic depending upon the patching strategy selected. If no patching strategy is selected, task_generator() runs exactly as before. If random_window (sliding_window) is selected the bounding boxes for the patches are generated using the sample_random_window() (sample_sliding_window()) methods. The bounding boxes are appended to the list bboxes, and passed to task_generator().

Within task_generator() after the sampling strategies are applied, the data is spatially sliced using each bbox in bboxes using the self.spatial_slice_variable() function.

When using a patching strategy, TaskLoader produces a list of tasks per date, rather than an individual task per date. A small change has been made to Task's summarise_str method to avoid an error when printing patched Tasks and to output more meaningful information.

Inference

To run patchwise predictions, a new method has been created in model.py called predict_patch(). This method iterates through and applies the pre-exisiting predict() method to each patched task. The predict() method has not been changed. Within each iteration, prior to running predict() for each patch, the bounding box of each patch is unnormalized, so the X_t of each patch can be passed to the predict() function. The patchwise predictions are stored in the list preds for subsequent stitching.

It is only possible to use the sliding_window patching function during inference, and the stride and patch size are defined when the user generates the test tasks within the task_loader() call. The data_processor must also be passed to predict_patch() method to enable unnormalisation of the coordinates of the bboxes in model.py.

Once the list of patchwise predictions are generated, stitch_clipped_predictions() is used to form a prediction at the original X_t extent. Currently, functionality is provided to subset or clip each patchwise prediction so there is no overlap between adjacent patches and then merge the patches using xr.combine_by_coords(). The modular nature of the code means there is scope for additional stitching strategies to be added after this PR, for example applying a weighting function to overlapping predictions. To ensure the patches are clipped by the correct amount, get_patch_overlap() calculates the overlap between adjacent patches. stitch_clipped_predictions() also contains code to handle patches at the edge or bottom of the dataset, where the overlap may be different.

The output from predict_patch() is the identical DeepSensor object produced in model.predict(), hence DeepSensor’s plotting functionality can subsequently be used in the same way.

Documentation and Testing

New notebook(s) are added illustrating the usage of both patchwise training and prediction.

New tests are added to verify the new behaviour.

Limitations

  • Patchwise prediction does not currently support predicting at more than one timestamp - calling predict_patch with more than one date raises a NotImplementedError.
  • predict_patch is a new, distinct function due to all the pre-processing it needs to do, the patchwise behaviour may be better served as an option in predict - let me know what you think.
  • Patched tasks don't exactly follow the proportions from patch_size, e.g. for a 'square' patch patch_size=(0.5,0.5) the exact dimensions won't be exactly square, this is accounted for in stitching of patches, but is slightly inelegant at the moment so we may want to come back and find a more refined solution in the future.
  • In test_model.test_patchwise_prediction I've temporarily commented-out the asserts checking for correct prediction shape, these fail with test datasets for now, but with real datasets the shapes are correct, see the patchwise_training_and_prediction.ipynb notebook.

@tom-andersson
Copy link
Collaborator

Hi @davidwilby, thanks for addressing my comments. Is it possible to move the added functionality into subclasses? See my comment here:

For readability and maintenance, could you move the additions to TaskLoader and DeepSensor model into subclasses to improve encapsulation of the patching functionality? I'm realising this PR makes the standard API more complicated, which may be offputting for users who don't require patching, and also makes the code for those classes trickier to parse. How about a PatchTaskLoader and DeepSensorPatchwiseModel? With no other changes to the class hierarchy I think we'd need ConvNP to subclass DeepSensorPatchwiseModel instead of the standard DeepSensorModel, though there may be a more elegant solution where the user decides whether to set up a model that predicts with patches through the ConvNP interface. WDYT?

This is such a large PR and hard for me to review everything, and complicates the API for users, that I would prefer to encapsulate the functionality here into TaskLoader and DeepSensorModel subclasses, which the caveat that they are somewhat experimental features.

@davidwilby
Copy link
Collaborator Author

davidwilby commented Apr 7, 2025

Hi @davidwilby, thanks for addressing my comments. Is it possible to move the added functionality into subclasses? See my comment here:

For readability and maintenance, could you move the additions to TaskLoader and DeepSensor model into subclasses to improve encapsulation of the patching functionality? I'm realising this PR makes the standard API more complicated, which may be offputting for users who don't require patching, and also makes the code for those classes trickier to parse. How about a PatchTaskLoader and DeepSensorPatchwiseModel? With no other changes to the class hierarchy I think we'd need ConvNP to subclass DeepSensorPatchwiseModel instead of the standard DeepSensorModel, though there may be a more elegant solution where the user decides whether to set up a model that predicts with patches through the ConvNP interface. WDYT?

This is such a large PR and hard for me to review everything, and complicates the API for users, that I would prefer to encapsulate the functionality here into TaskLoader and DeepSensorModel subclasses, which the caveat that they are somewhat experimental features.

Thanks @tom-andersson sorry for missing this one, there are a few comments that passed me by in the enormity of this PR..

I've had a look at refactoring TaskLoader and DeepSensorModel to move any patching methods into patching-specific subclasses in 7587726

This moves a lot of the patching-specific methods into a new PatchwiseTaskLoader class.

Some things I'd like to do but are proving difficult:

  • Reduce repetition in PatchwiseTaskLoader.__call__
    • There really is a lot of repeated code here which has followed the duplication in the original TaskLoader.__call__ method, repeated for cases of one or multiple dates, repeating for multiple cases of patching strategy - I'm struggling to do this elegantly at this point.
  • Try to remove patching specific code in TaskLoader.task_generation
    • I have been so far unable to pick out the usage of patching-specific properties such as bbox from the task_generation method since they're so intertwined, without a larger refactor of the method I'm unsure how to do this right now.
  • Move predict_patchwise from the DeepSensorModel class to a patching-specific model class.
    • Because of the reasonably complicated layers of inheritance and multiple dispatch of the __init__ method of the ConvNP class I think that for now it is ultimately simpler to have a patchwise predict method in the DeepSensorModel class. One option would be to have an option in DeepSensorModel.__init__ which replaces the predict method with predict_patchwise imported from some other location but that feels ugly to document.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants