Important
This software is Emerging and subject to ECMWF's guidelines on Software Maturity.
earthkit-hydro is a Python library for common hydrological functions. It is the hydrological component of earthkit.
- Support for PCRaster, CaMa-Flood and HydroSHEDS river networks
- Computing statistics over catchments and subcatchments
- Finding catchments and subcatchments
- Calculating distances along river networks
- Calculation of upstream or downstream fields
- Handle arbitrary missing values
- Handle N-dimensional fields
For a default installation, run
pip install earthkit-hydro
For a developer installation (includes linting and test libraries), run
git clone https://github.com/ecmwf/earthkit-hydro.git
cd earthkit-hydro
pip install -e .[dev]
pre-commit install
An example notebook showing how to use the earthkit-hydro is provided in addition to the documentation below.
earthkit-hydro can be imported as following:
import earthkit.hydro as ekh
The package contains different ways of constructing or loading a RiverNetwork
object. A RiverNetwork
object is a representation of a river network on a grid.
It can be used to compute basic hydrological functions, such as propagating a scalar field along the river network or extracting a catchment from the river network.
Given a discretisation of a domain i.e. a set of points
For ease of notation, if an edge exists from
ekh.river_network.load(domain, version)
Loads a precomputed RiverNetwork
. Current options can be listed with ekh.river_network.available()
and are:
domain |
version |
Details | Note | Attribution |
---|---|---|---|---|
"efas" | "5" | 1arcmin European | 1 | |
"efas" | "4" | 5km European | Smaller domain than v5 | 1 |
"glofas" | "4" | 3arcmin global | 60° South to 90° North | 2 |
"glofas" | "3" | 6arcmin global | 60° South to 90° North | 2 |
"cama_03min" | "4" | 3arcmin global | 3 | |
"cama_05min" | "4" | 5arcmin global | 3 | |
"cama_06min" | "4" | 6arcmin global | 3 | |
"cama_15min" | "4" | 15arcmin global | 3 | |
"hydrosheds_05min" | "1" | 5arcmin global | 56° South to 84° North | 4 |
"hydrosheds_06min" | "1" | 6arcmin global | 56° South to 84° North | 4 |
ekh.river_network.create(path, river_network_format, source="file")
Creates a RiverNetwork
. Current options are
- river_network_format: "esri_d8", "pcr_d8", , "merit_d8", "cama" or "precomputed"
- source: An earthkit-data compatable source. See list.
There are four high-level ways to compute metrics depending on the use-case.
ekh.upstream.sum(river_network, field, weights=None)
ekh.upstream.max(river_network, field, weights=None)
ekh.upstream.min(river_network, field, weights=None)
ekh.upstream.mean(river_network, field, weights=None)
ekh.upstream.prod(river_network, field, weights=None)
ekh.upstream.std(river_network, field, weights=None)
ekh.upstream.var(river_network, field, weights=None)
Given an input field, returns as output a new field with the upstream metric calculated for each cell.
ekh.downstream.sum(river_network, field, weights=None)
ekh.downstream.max(river_network, field, weights=None)
ekh.downstream.min(river_network, field, weights=None)
ekh.downstream.mean(river_network, field, weights=None)
ekh.downstream.prod(river_network, field, weights=None)
ekh.downstream.std(river_network, field, weights=None)
ekh.downstream.var(river_network, field, weights=None)
Given an input field, returns as output a new field with the downstream metric calculated for each cell.
ekh.catchments.sum(river_network, field, points, weights=None)
ekh.catchments.max(river_network, field, points, weights=None)
ekh.catchments.min(river_network, field, points, weights=None)
ekh.catchments.mean(river_network, field, points, weights=None)
ekh.catchments.prod(river_network, field, points, weights=None)
ekh.catchments.std(river_network, field, points, weights=None)
ekh.catchments.var(river_network, field, points, weights=None)
Given a field and a series of points, calculates the metric over all upstream nodes for each of the points.
ekh.subcatchments.sum(river_network, field, points, weights=None)
ekh.subcatchments.max(river_network, field, points, weights=None)
ekh.subcatchments.min(river_network, field, points, weights=None)
ekh.subcatchments.mean(river_network, field, points, weights=None)
ekh.subcatchments.prod(river_network, field, points, weights=None)
ekh.subcatchments.std(river_network, field, points, weights=None)
ekh.subcatchments.var(river_network, field, points, weights=None)
Given a field and a series of points, finds the subcatchments defined by the points and computes the metric for each subcatchment.
ekh.zonal.sum(field, labels, weights=None, return_field=False)
ekh.zonal.max(field, labels, weights=None, return_field=False)
ekh.zonal.min(field, labels, weights=None, return_field=False)
ekh.zonal.mean(field, labels, weights=None, return_field=False)
ekh.zonal.prod(field, labels, weights=None, return_field=False)
ekh.zonal.std(field, labels, weights=None, return_field=False)
ekh.zonal.var(field, labels, weights=None, return_field=False)
Calculates a metric over the input field for each zone defined by the labels field. If return_field is True, returns a field otherwise returns a dictionary of {label: metric} pairs.
(for advanced users) Similarly, one can also use a low-level API via
ekh.calculate_upstream_metric(river_network, field, metric, weights=None)
ekh.calculate_catchment_metric(river_network, field, points, metric, weights=None)
ekh.calculate_subcatchment_metric(river_network, field, points, metric, weights=None)
ekh.calculate_zonal_metric(field, labels, metric, weights=None)
# applies the ufunc on the field starting from the sources all the way down to the sinks
ekh.flow_downstream(river_network, field, ufunc=np.add)
# applies the ufunc on the field starting from the sinks all the way up to the sources
ekh.flow_upstream(river_network, field, ufunc=np.add)
These are analagous to above.
ekh.catchments.find(river_network, field)
Finds the catchments (all upstream nodes of specified nodes, with overwriting).
ekh.subcatchments.find(river_network, field)
Finds the subcatchments (all upstream nodes of specified nodes, without overwriting).
ekh.distance.min(river_network, points, weights=None, upstream=False, downstream=True)
ekh.distance.max(river_network, points, weights=None, upstream=False, downstream=True)
Given a set of input points, computes the min or max distance (upstream and/or downstream depending on input) from those points for all cells in the field. Unreachable points are given a value np.inf. Weights represents the distance to a downstream cell (weights at sinks are ignored). By default, it is assumed the distance to any downstream cell is 1 regardless if the connection is diagonal or not.
ekh.length.min(river_network, points, weights=None, upstream=False, downstream=True)
ekh.length.max(river_network, points, weights=None, upstream=False, downstream=True)
Given a set of input points, computes the min or max length (upstream and/or downstream depending on input) of the network starting from those points for all cells in the field. Unreachable points are given a value np.inf. Weights represents the length of the river in the grid cell. By default, this is assumed to be one.
We also provide some analagous convenience functions for calculating shortest/longest distances/lengths to sources or sinks.
ekh.distance.to_source(river_network, weights, path='shortest')
ekh.distance.to_sink(river_network, weights, path='shortest')
ekh.length.to_source(river_network, weights, path='shortest')
ekh.length.to_sink(river_network, weights, path='shortest')
ekh.move_downstream(river_network, field, ufunc=np.add)
Updates each node with the sum of its upstream nodes.
ekh.move_upstream(river_network, field)
Updates each node with its downstream node.
river_network.create_subnetwork(field)
Computes the river subnetwork defined by a field mask of the domain.
river_network.export(filename)
Exports the RiverNetwork
as a joblib pickle.
earthkit-hydro provides many functions with PCRaster equivalents, summarised below:
PCRaster | earthkit-hydro | Note |
---|---|---|
accuflux | upstream.sum | |
catchmenttotal | upstream.sum | |
areatotal | zonal.sum | return_field=True |
areaaverage | zonal.mean | return_field=True |
areamaximum | zonal.max | return_field=True |
areaminimum | zonal.min | return_field=True |
downstream | move_upstream | |
upstream | move_downstream | |
catchment | catchments.find | |
subcatchment | subcatchments.find | |
path | upstream.max | |
ldddist | distance.min | friction input is slightly different to weights, and by default ekh takes distance between two nodes to be one regardless if on diagonal or not |
downstreamdist | distance.to_sink | Same caveats as for ldddist |
slopelength | distance.to_source | path="longest", same caveats as for ldddist |
abs, sin, cos, tan, ... | np.abs, np.sin, np.cos, np.tan, ... | any numpy operations can be directly used |
Points of difference
- earthkit-hydro treats missing values as np.nans i.e. any arithmetic involving a missing value will return a missing value. PCRaster does not always handle missing values exactly the same.
- earthkit-hydro can handle vector fields and fields of integers, floats, bools. PCRaster supports a restricted subset of this.
1 The EFAS river network is available under the conditions set out in the European Commission Reuse and Copyright Notice and is available at https://data.jrc.ec.europa.eu/dataset/f572c443-7466-4adf-87aa-c0847a169f23.
Choulga, Margarita; Moschini, Francesca; Mazzetti, Cinzia; Grimaldi, Stefania; Disperati, Juliana; Beck, Hylke; Salamon, Peter; Prudhomme, Christel (2023): LISFLOOD static and parameter maps for Europe. European Commission, Joint Research Centre (JRC) [Dataset] PID: http://data.europa.eu/89h/f572c443-7466-4adf-87aa-c0847a169f23
2 The GloFAS river network is available under the conditions set out in the European Commission Reuse and Copyright Notice and is available at https://data.jrc.ec.europa.eu/dataset/68050d73-9c06-499c-a441-dc5053cb0c86.
Choulga, Margarita; Moschini, Francesca; Mazzetti, Cinzia; Disperati, Juliana; Grimaldi, Stefania; Beck, Hylke; Salamon, Peter; Prudhomme, Christel (2023): LISFLOOD static and parameter maps for GloFAS. European Commission, Joint Research Centre (JRC) [Dataset] PID: http://data.europa.eu/89h/68050d73-9c06-499c-a441-dc5053cb0c86
3 The CaMa river networks are available under CC-BY 4.0 licence and are available at http://hydro.iis.u-tokyo.ac.jp/~yamadai/cama-flood/.
Yamazaki, Dai; Ikeshima, Daiki; Sosa, Jeison; Bates, Paul D.; Allen, George H.; Pavelsky, Tamlin M. (2019): MERIT Hydro: A high-resolution global hydrography map based on latest topography datasets. Water Resources Research, vol.55, pp.5053-5073, 2019, DOI: 10.1029/2019WR024873
4 The HydroSHEDS river networks are available under the conditions set out in the HydroSHEDS Version One Licence Agreement and are available at https://www.hydrosheds.org.
Lehner, Bernhard; Verdin, Kristine; Jarvis, Andy (2008): New global hydrography derived from spaceborne elevation data. Eos, Transactions, 89(10): 93-94. Data available at https://www.hydrosheds.org.
Copyright 2024, European Centre for Medium Range Weather Forecasts.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
In applying this licence, ECMWF does not waive the privileges and immunities
granted to it by virtue of its status as an intergovernmental organisation
nor does it submit to any jurisdiction.