Asset and company footprint impact workflow
Suggested citation: Natural Capital Project. 2023. Ecosystem Services Footprinting Tool. Natural Capital Project, Stanford University. Available at: https://github.com/natcap/natural-capital-footprint-impact
This command-line script calculates metrics of the impact of human-made structures on certain ecosystem services and measures of biodiversity, based on their physical footprint on the landscape. It uses point or polygon locations of corporate assets, along with raster layers of ecosystem services or biodiversity metrics to generate spatial and tabular results that can be used to compare impacts across assets, companies, and portfolios.
The ecosystem services provided are:
coastal_risk_reduction_service
: Relative value of coastal and marine habitats for reducing the risk of erosion and inundation from storms for people who live near the coast.nitrogen_retention_service
: Nitrogen that is retained by the landscape, keeping it out of streams, times the number of people who live downstream who may benefit from cleaner water.sediment_retention_service
: Sediment that is retained by the landscape, keeping it out of streams, times the number of people who live downstream who may benefit from cleaner water.nature_access
: For each pixel considered, the number of people within a 1-hour travel distance to it.
Please see the Data provided for you > ecosystem service data section below in this document for more information about these layers.
Some useful definitions:
- Asset: A unit of physical infrastructure that occupies land on Earth's surface, such as a mine, office, restaurant, cell tower, hospital, pipeline, or billboard.
- Footprint: The area on Earth's surface occupied by an asset.
Asset location data are usually available as point coordinates (latitude/longitude). The real footprint of an asset may be available, but usually is not. To account for differences in data availability, this script can be used in three different ways:
- Point mode: Assets are provided by the user as latitude/longitude points. The actual asset footprint is not known or modeled. Ecosystem service statistics are calculated under each point only.
- Buffer mode: Assets are provided by the user as latitude/longitude points. The asset footprint is modeled around each point by creating a circular buffer of an area defined by the asset type. Ecosystem service statistics are calculated under each buffered footprint.
- Polygon mode: Assets are provided by the user as footprint polygons. This mode is preferred if actual asset footprint data are available. Ecosystem service statistics are calculated under each footprint.
Sample data for both point and vector assets can be downloaded here.
Required for both Point mode and Buffer mode. Point data must be provided in a GDAL-supported vector format. All points must be in the same layer. All features in the layer must be of the Point
type. MultiPoint
s are not allowed. Any attributes that exist in the original vector attribute table will be preserved in the output.
The asset vector layer contains an attribute table, where each row represents an asset. The following fields are used by the script:
- The
category
column determines footprint size. This field is required when using Buffer mode only. Footprint sizes vary widely, but correlate with the type of asset (for example, power plants take up more space than restaurants). As a default, we categorize assets using the S&P "facility category" designations, which corresponds to the default data provided in the Buffer Table (Table 3). - The
company
attribute is required for both Point mode and Buffer mode, and is used to aggregate results. Assets belonging to the samecompany
will be grouped together when calculating the aggregate statistics.
Field names must be spelled exactly as shown above, with no extra spaces or characters.
FID | <...other non-script related attributes...> | category | company |
---|---|---|---|
0 | <...non-script-related value...> | Bank Branch | XYZ Corp |
1 | ... | ... | .... |
Table 1. Point asset vector attribute table example.
Required for Polygon mode. Polygon data must be provided in a GDAL-supported vector format. All polygons must be in the same layer. All features in the layer must be of the Polygon
or MultiPolygon
type. Any attributes that exist in the original vector attribute table will be preserved in the output.
The company
attribute is reqired for Polygon mode, and is used to aggregate results. Assets belonging to the same company
will be grouped together when calculating the aggregate statistics. The field name, company
, must be spelled exactly as shown here, with no extra spaces or characters.
FID | <...other non-script related attributes...> | company |
---|---|---|
0 | <...non-script-related value...> | XYZ Corp |
1 | ... | .... |
Table 2. Polygon asset vector attribute table example.
Footprint buffer area data are defined in a CSV (comma-separated value) table (Table 3), where each row represents an asset category.
The first column is named category
. The category values will be cross-referenced with the category field in the Asset Table (Table 1).
The second column is named area
. This is the area (in square meters) of footprint to draw for assets of this category. Footprints will be drawn as a circular buffer around each asset point.
category | area |
---|---|
Bank Branch | 5073.7 |
... | ... |
Table 3. Buffer table: footprint area (in square meters) modeled for each asset category, used in Buffer mode.
The provided footprint areas were derived by manually estimating the footprint area of real assets from satellite imagery. We took the median of a small sample from each category. You may modify or replace this table if you wish to use different data, but they must be in CSV format, and include the required category
and area
fields.
Download the footprint area data by asset category CSV table.
Ecosystem service data are provided as geospatial raster layers (such as TIFFs), where each pixel has a value representing the quantity of service provided at that pixel.
Four ecosystem service raster datasets are provided for use with this script - sediment retention, nitrogen retention, coastal risk reduction, nature access. In order to use this script, you must either download one or more of these layers, or provide your own layers representing ecosystem services or biodiversity values. Following is a description of each provided layer, along with links for downloading them.
coastal_risk_reduction_service
: Relative value of coastal and marine habitats for reducing the risk of erosion and inundation from storms for people who live near the coast. Risk reduction is calculated using the InVEST Coastal Vulnerability model. Modeling by Chaplin-Kramer and Sharp (2023). Population is from Landscan 2019. Values are unitless, representing a relative index of risk reduction times the number of people who benefit. Download link for coastal risk. License for using these data: CC BY 4.0.nature_access
: The number of people within 1 hour travel distance of every pixel. Modeling by Chaplin-Kramer and Sharp (2023). Population is from Landscan 2019. Download link for nature access. License for using these data: CC BY 4.0.nitrogen_retention_service
: Nitrogen that is retained by the landscape, keeping it out of streams, times the number of people who live downstream who may benefit from cleaner water. Nitrogen retention is calculated using the InVEST Nutrient Delivery Ratio (NDR) model. Modeling by Chaplin-Kramer and Sharp (2023). Population is from Landscan 2019. Values are unitless, representing kilograms of nitrogen retained times number of people who benefit. Download link for nitrogen retention. License for using these data: CC BY 4.0.sediment_retention_service
: Sediment that is retained by the landscape, keeping it out of streams, times the number of people who live downstream who may benefit from cleaner water. Sediment retention is calculated using the InVEST Sediment Delivery Ratio (SDR) model. Modeling by Chaplin-Kramer and Sharp (2023). Population is from Landscan 2019. Values are unitless, representing tons of sediment retained times number of people who benefit. Download link for sediment retention. License for using these data: CC BY 4.0.
The script requires that all services to be analyzed are listed in a CSV (comma-separated value) table, where each row represents an ecosystem service. Required columns are:
es_id
: A unique identifier for the ecosystem service, which consists of any ASCII characters and may be of any length. This identifier is used to label the output statistics. Thees_id
s for the provided data are:coastal_risk_reduction_service
,nitrogen_retention_service
,sediment_retention_service
,nature_access
,endemic_biodiversity
,redlist_species
,species_richness
,kba_within_1km
.es_value_path
: File path to a GDAL-supported geospatial raster map of the ecosystem service. These may be given as paths that are relative to the location of the CSV file, or may be given as absolute paths.flag_threshold
: Threshold value of interest for the ecosystem service. Pixels with an ecosystem service value greater than this threshold will be flagged, and results will be provided indicating whether the service value for each asset exceeds this threshold. In the provided data, we used the 90th percentile of global terrestrial values as the threshold for each ecosystem service. For Coastal Risk Reduction, this included any value greater than 0.
es_id | es_value_path | flag_threshold |
---|---|---|
sediment | ES_layers/sediment_retention_for_downstream_populations.tif | 96485936 |
... | ... | ... |
Table 4. Ecosystem service table example: defines the ecosystem service layers that will be used by the script.
You may modify or replace this table, with the requirement that it must be in CSV format, and include the required es_id
, es_value_path
and flag_threshold
, fields. This table must be modified if you are using your own service layers or changing the path location of the default layers.
Download a sample ecosystem service layer table, which can be used with the four service layers listed above.
Before installing natural-capital-footprint-impact
, you must have already installed GDAL. We recommend using conda
because it is the easiest way to install GDAL:
$ conda create -n footprint python gdal
$ conda activate footprint
If you cannot use conda
, see the GDAL website and PyPI page for alternate installation instructions.
Install natural-capital-footprint-impact
:
$ pip install git+https://github.com/natcap/natural-capital-footprint-impact.git@main
The command natural-capital-footprint-impact
will then be available:
$ natural-capital-footprint-impact --help
usage: ...
- If your asset point data are in CSV format, convert it to a GDAL-supported vector format such as GeoPackage (GPKG):
ogr2ogr -s_srs EPSG:4326 -t_srs EPSG:4326 -oo X_POSSIBLE_NAMES=longitude -oo Y_POSSIBLE_NAMES=latitude assets.gpkg assets.csv
To do this in QGIS, use the Add Delimited Text Layer tool to add the CSV data as a point layer, then Export > Save Features As to save the layer to a GeoPackage.
In ArcGIS Pro, import the CSV data to a point layer using the XY Data to Point tool. This will create a point shapefile that you can use instead of a GeoPackage.
- Reproject your asset data to match the projection of your ecosystem service layers. If using the provided ecosystem service layers, assets must be projected in Eckert IV (ESRI:54012):
ogr2ogr -t_srs ESRI:54012 assets_eckert.gpkg assets.gpkg
This can also be done in QGIS with the Warp tool, and ArcGIS using the Project tool.
- Run the workflow:
natural-capital-footprint-impact -e ECOSYSTEM_SERVICE_TABLE {points,polygons} [-b BUFFER_TABLE] asset_vector
usage: natural-capital-footprint-impact [-h] -e ECOSYSTEM_SERVICE_TABLE [-b BUFFER_TABLE]
{points,polygons} asset_vector footprint_results_path company_results_path
positional arguments:
{points,polygons} mode of operation. in points mode, the asset vector contains point geometries. in polygons mode, it contains polygon geometries.
asset_vector path to the asset vector
footprint_results_path
path to write out the asset results vector
company_results_path path to write out the aggregated results table
options:
-h, --help show this help message and exit
-e ECOSYSTEM_SERVICE_TABLE, --ecosystem-service-table ECOSYSTEM_SERVICE_TABLE
path to the ecosystem service table
-b BUFFER_TABLE, --buffer-table BUFFER_TABLE
buffer asset points according to values in this table
The examples below assume your ecosystem service table is named ecosystem_service_table.csv
and your assets vector is named assets_example.gpkg
. You may use any other valid file path instead.
natural-capital-footprint-impact -e ecosystem_service_table.csv points assets_example.gpkg asset_results.gpkg company_results.csv
In Point mode, you provide the assets as latitude/longitude coordinate points. The asset footprint is not known or modeled. Ecosystem service statistics are calculated under each point only.
When run in point mode, the script performs these steps:
- Calculates ecosystem service values at the location of each asset point.
- Groups asset points by their
company
attribute. - Aggregates statistics about the asset points for each
company
.
natural-capital-footprint-impact -e ecosystem_service_table.csv points --buffer-table buffer_table.csv assets_example.gpkg asset_results.gpkg company_results.csv
In Buffer mode, you provide the assets as latitude/longitude coordinate points. The asset footprint is modeled by buffering each point with an area determined by the asset category in the Buffer Table. Ecosystem service statistics are calculated under each footprint.
When run in point buffer mode, the script performs these steps:
- Cross-references the asset vector with the buffer table on the
category
attribute to get the buffer area for each asset. - Buffers each asset point to form an asset footprint polygon with the appropriate buffer area. The buffer polygon is a many-sided regular polygon approximating a circle.
- Calculates statistics about each ecosystem service within the area of each asset footprint. See note below for caveats.
- Groups asset footprints by their
company
attribute. - Aggregates statistics about the assets for each
company
.
natural-capital-footprint-impact -e ecosystem_service_table.csv polygons assets_example.gpkg asset_results.gpkg company_results.csv
In Polygon mode, you provide the assets as footprint polygons. This mode is preferred if asset footprint data are available. Ecosystem service statistics are calculated under each footprint.
When run in polygon mode, the script performs these steps:
- Calculates statistics about each ecosystem service within the area of each asset footprint. See note below for caveats.
- Groups asset footprints by their
company
attribute. - Aggregates statistics about the assets for each
company
.
Because of the coarse resolution of the ecosystem service layers relative to typical asset footprint sizes, and the way that zonal statistics are calculated in the underlying library pygeoprocessing
, some results at the asset level may be non-intuitive. pygeoprocessing.zonal_statistics
calculates statistics using this algorithm:
If the polygon overlaps the centerpoint of at least one pixel:
Statistics are only calculated from the set of pixel(s) whose centerpoints fall within the polygon
If the polygon does not overlap the centerpoint of any pixel:
Statistics are calculated from the set of pixel(s) that intersect the bounding box of the polygon
If this causes problems, you may try resampling the ecosystem service layers to a finer (smaller) resolution.
The output vector attribute table is based on the point or polygon asset vector provided as input. Several statistics are calculated for each ecosystem service:
In point mode:
<es_id>_max
: maximum service value within the asset footprint<es_id>_flag
: binary value indicating whether the asset has been flagged. Assets are flagged if their<es_id>_max
value is greater than the correspondingflag_threshold
value in the ecosystem service table.
Using the provided service list, 8 columns named <es_id>_<statistic>
are added to the original attribute table, one for each combination of the 5 ecosystem services and these 2 statistics.
Example output attribute table for point mode:
FID | crr_max | crr_flag | ... |
---|---|---|---|
1 | 1 | 1 | ... |
2 | 0 | 0 | ... |
In point buffer mode and polygon mode:
<es_id>_max
: maximum service value within the asset footprint.<es_id>_mean
: mean service value within the asset footprint.<es_id>_adj_sum
: Area-adjusted sum of service values on each pixel within the asset footprint. This is<es_id>_mean
multiplied by the asset footprint area.<es_id>_count
: number of pixels within the asset footprint that have data for the service.<es_id>_nodata_count
: number of pixels within the asset footprint that are missing data for the service.<es_id>_flag
: binary value indicating whether the asset has been flagged. Assets are flagged if their<es_id>_max
value is greater than the correspondingflag_threshold
value in the ecosystem service table.
Note: These statistics are derived from the set of pixels that is calculated as described above, see "Caveats about footprint statistics".
Using the provided service list, 24 columns named <es_id>_<statistic>
are added to the original attribute table, one for each combination of the 4 ecosystem services and these 6 statistics.
Example output attribute table for buffer mode and polygon mode:
FID | crr_max | crr_mean | crr_adj_sum | ... |
---|---|---|---|---|
1 | 1 | 0.25 | 0.3 | ... |
2 | 0 | 0 | 0 | ... |
The units for the <es_id>
, <es_id>_max
, <es_id>_mean
, and <es_id>_adj_sum
values will vary depending on the service. If you are using the default/provided services, see the Data provided for you section of this Readme for a description of these services and their units. If the ecosystem service table has been modified with a different number of services, then the statistics will be calculated for each of the user-defined services, with new columns defined as noted above.
The output company table contains
<es_id>_adj_sum
: Sum of<es_id>_adj_sum
under asset footprints for each service<es_id>_mean
: Mean of each ecosystem service under asset footprints<es_id>_assets
: For each service, the number of assets with data<es_id>_area
: Total area of asset footprints per company that are overlapping data for each service<es_id>_flagged
: For each service, the number of assets flaggedpercent_<es_id>_flagged
: For each service, the percent of assets flaggedtotal_assets
: Total number of assets belonging to each companytotal_area
: Total area (in square meters) of asset footprints belonging to each companytotal_flagged
: Number of assets flagged (receiving a 1) by criteria defined abovepercent_total_flagged
: Percent of assets flagged in any category
Again, the units for the <es_id>_adj_sum
and <es_id>_mean
values will vary depending on the service. If you are using the default/provided services, see the Data provided for you section of this Readme for a description of these services and their units.
Example output company statistics table:
company | crr_adj_sum | crr_mean | crr_assets | ... |
---|---|---|---|---|
XYZ Corp | 0 | 0 | 3 | ... |
AAA Inc. | 0.57 | 0.7 | 13 | ... |
Point mode produces a CSV and a point vector in geopackage (.gpkg) format. Both contain the same data. These are copies of the input data with additional columns added. There is one column added for each ecosystem service. This column contains the ecosystem service value at each point, or NULL
if there are no data available at that location.
Buffer mode and Polygon mode both produce a polygon vector in geopackage (.gpkg) format. It is a copy of the input vector with additional columns added to the attribute table. There is one column added for each combination of ecosystem service and statistic.
If you prefer to work with the asset-level results in CSV format, you can convert the GPKG to CSV like so:
ogr2ogr -f CSV asset_results.csv asset_results.gpkg
Chaplin-Kramer, R. and Sharp., R.P. Nature’s Contributions to People under Potential Natural Vegetation. (Unpublished dataset). Based on models described in Chaplin-Kramer, R., Neugarten, R. A., Sharp, R. P., Collins, P. M., Polasky, S., Hole, D., et al. (2023). Mapping the planet’s critical natural assets. Nature Ecology & Evolution, 7(1), 51-61.
Damania, R., Polasky, S., Ruckelshaus, M., Russ, J., Amann, M., Chaplin-Kramer, R., Gerber, J. Hawthorne, P., Heger, M., Saleh Mamun, Ruta, G., Schmitt, R., Smith, J., Vogl, A. Wagner, F., and Zaveri, E. (2023). Nature’s Frontiers: Achieving Sustainability, Efficiency, and Prosperity with Natural Capital. Environment and Sustainable Development series. Washington, DC: World Bank. In press.
Oak Ridge National Laboratory. East View Cartographic, Inc. LandScan 2019 global population database. East View Cartographic, Inc.