
GeoDatasets should use separate values for x and y resolution #2601

Open
wants to merge 14 commits into
base: main


calebrob6
Member

@calebrob6 calebrob6 commented Feb 21, 2025

This PR refactors GeoDatasets to use separate values to store the x and y resolutions. This is pretty important because not all raster pixels are square.
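As a rough illustration of why one value is not enough, the sketch below computes the pixel shape of the same extent under square and non-square pixels. The function name `window_shape`, the `(xres, yres)` ordering, and the numbers are assumptions made for this example, not code from the PR.

```python
import math


def window_shape(
    width: float, height: float, res: tuple[float, float]
) -> tuple[int, int]:
    """Return (rows, cols) needed to cover an extent given in CRS units.

    ``res`` is assumed to be ordered (xres, yres).
    """
    xres, yres = res
    cols = math.ceil(width / xres)   # pixels along x use the x resolution
    rows = math.ceil(height / yres)  # pixels along y use the y resolution
    return rows, cols


# Same 100 x 100 extent, square vs. non-square pixels (illustrative values)
square = window_shape(100.0, 100.0, (10.0, 10.0))
non_square = window_shape(100.0, 100.0, (10.0, 20.0))
```

With a single shared resolution, both calls would have to produce the same shape, which is wrong for the second raster.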

Some notes:

Closes #2594

@github-actions github-actions bot added the datasets, models, and testing labels Feb 21, 2025
@github-actions github-actions bot added the samplers label Feb 21, 2025
@github-actions github-actions bot removed the models label Feb 21, 2025
@calebrob6 calebrob6 requested a review from Copilot February 21, 2025 21:12


Copilot reviewed 24 out of 38 changed files in this pull request and generated no comments.

Files not reviewed (14)
  • torchgeo/datasets/eudem.py: Evaluated as low risk
  • torchgeo/datasets/cms_mangrove_canopy.py: Evaluated as low risk
  • torchgeo/datasets/astergdem.py: Evaluated as low risk
  • torchgeo/datasets/enmap.py: Evaluated as low risk
  • torchgeo/datasets/cdl.py: Evaluated as low risk
  • torchgeo/datasets/agb_live_woody_density.py: Evaluated as low risk
  • torchgeo/datasets/esri2020.py: Evaluated as low risk
  • tests/datamodules/test_geo.py: Evaluated as low risk
  • torchgeo/datasets/eurocrops.py: Evaluated as low risk
  • tests/datasets/test_cbf.py: Evaluated as low risk
  • tests/samplers/test_batch.py: Evaluated as low risk
  • tests/samplers/test_single.py: Evaluated as low risk
  • torchgeo/datasets/chesapeake.py: Evaluated as low risk
  • tests/datasets/test_geo.py: Evaluated as low risk
@@ -416,7 +416,7 @@ def __init__(
 self,
 paths: Path | Iterable[Path] = 'data',
 crs: CRS | None = None,
-res: float | None = None,
+res: tuple[float, float] | None = None,
Collaborator

If possible, I would like all of these datasets to support both float and tuple[float, float] similar to how our Samplers work. We can use the same helper function to convert the former to the latter. This is also nice for backwards compatibility (which we have to start caring about soon).

Collaborator

Also need to update docstrings to document the order of xres and yres.

Member Author

Which docstrings would you like this in?

Collaborator

Every single docstring for every single GeoDataset with a res parameter.

@adamjstewart adamjstewart added this to the 0.7.0 milestone Feb 21, 2025
@calebrob6
Member Author

calebrob6 commented Feb 21, 2025

The failing test is caused by a bug in our minimum supported version of rasterio (1.3.0.post1) in how it reads res from a WarpedVRT. The earliest version that handles this correctly is 1.3.11.


@github-actions github-actions bot added the dependencies label Feb 21, 2025
@adamjstewart adamjstewart removed this from the 0.7.0 milestone Mar 23, 2025
root = os.path.join('tests', 'data', 'raster', 'res_2-2_epsg_32631')
ds = RasterDataset(root, res=10.0)
assert ds.res == (10.0, 10.0)
ds.res = 20.0 # type: ignore[assignment]
Member Author

As far as I can tell, mypy doesn't like it when a property's getter returns a single type but its setter accepts a union of types. This may be what is discussed in python/mypy#3004.

@calebrob6
Member Author

The main ideas here are:

  • GeoDataset's _res should always be a tuple[float, float]
  • The setter for res handles float | tuple[float, float]

@adamjstewart adamjstewart added this to the 0.7.0 milestone Mar 26, 2025
@@ -139,7 +139,7 @@ def __init__(
 self,
 paths: Path | Iterable[Path] = 'data',
 crs: CRS | None = None,
-res: float | None = None,
+res: tuple[float, float] | None = None,
Collaborator

All of these other datasets also need to accept float | tuple[float, float] and need updated docstrings too.

Collaborator

@adamjstewart adamjstewart left a comment


Idea is solid, just need to uniformly update the type hints and docstrings so that all datasets and samplers can handle float | tuple[float, float] and the order is documented. Should just be able to copy-n-paste things across all files.

Successfully merging this pull request may close these issues.

RasterDataset and Samplers don't work with non-square pixels
2 participants