V1.3.2 #2

Open
wants to merge 9 commits into
base: main
2 changes: 1 addition & 1 deletion .idea/CarbonFluxQA.iml

5 changes: 4 additions & 1 deletion .idea/misc.xml

7 changes: 0 additions & 7 deletions Components/01_CreateMasks.py

This file was deleted.

4 changes: 4 additions & 0 deletions Components/01_DownloadFiles.py
@@ -0,0 +1,4 @@
from funcs import download_files

# Create folder structure, download files, and clip TCL rasters to GADM extent
download_files()
9 changes: 9 additions & 0 deletions Components/02_CreateMasks.py
@@ -0,0 +1,9 @@
import constants_and_names as cn
from funcs import create_masks

# Create masks from the Mask and Input folders using the configured tcd_threshold, gain, and save_intermediates values
create_masks(cn.tcd_threshold, cn.gain, cn.save_intermediates)

# Other options:
# create_masks([0, 75], cn.gain, False)
# create_masks([30], cn.gain, True)
14 changes: 0 additions & 14 deletions Components/02_ZonalStats.py

This file was deleted.

15 changes: 4 additions & 11 deletions Components/03_ZonalStats_Masked.py
@@ -1,12 +1,5 @@
import arcpy
import os
import pandas
from funcs import ZonalStatsMasked

arcpy.env.overwriteOutput = True

# Set the workspace and input/output folders
arcpy.env.workspace = r"U:\eglen\Data\CarbonFlux_QA_2023"

ZonalStatsMasked(arcpy.env.workspace)
import constants_and_names as cn
from funcs import zonal_stats_masked

# Calculate zonal stats for all input rasters at each tcd threshold value
zonal_stats_masked(cn.aois_folder, cn.input_folder, cn.mask_output_folder, cn.outputs_folder)
11 changes: 0 additions & 11 deletions Components/04_ZonalStatsAnnualized.py

This file was deleted.

6 changes: 6 additions & 0 deletions Components/04_ZonalStats_Annualized.py
@@ -0,0 +1,6 @@
import constants_and_names as cn
from funcs import zonal_stats_annualized

# Calculate emissions for each year of tree cover loss using TCL rasters
zonal_stats_annualized(cn.tcl_clip_folder, cn.input_folder, cn.mask_output_folder, cn.annual_folder)

15 changes: 3 additions & 12 deletions Components/05_ZonalStats_Clean.py
@@ -1,13 +1,4 @@
"""
This section of code is for organizing the outputs into one csv for clarity. It uses pandas to loop through the dbf
outputs and write the sum fields to a new dataframe, then export this dataframe to a csv. Comment out if not necessary
"""
import os
import pandas as pd
import arcpy
from funcs import ZonalStatsClean
from funcs import zonal_stats_clean

arcpy.env.workspace = r"U:\eglen\Data\CarbonFlux_QA_2023"
arcpy.env.overwriteOutput = True

ZonalStatsClean(arcpy.env.workspace)
#Combine and clean masked and annual zonal stats output into final csv
zonal_stats_clean()
130 changes: 84 additions & 46 deletions README.md
@@ -6,24 +6,59 @@
outputs are compared across platforms and methods including the GeoTrellis Tool, GFW Dashboard download spreadsheets,
the GFW API, and ArcGIS Zonal Statistics calculations.

### Inputs
### Overview

#### Areas of Interest:

This tool is set up to run statistics for two areas (Indonesia and Gambia), although it could be expanded to other
areas of interest. The inputs for these areas are derived from the GADM 3.6 Dataset, available for download here:
This code has been automated to create the working directory folder structure, download all of the required data
from s3, and produce a summary csv with the resulting zonal statistics.

### User Inputs

#### Area(s) of Interest:

The only file(s) the user must provide are the shapefile(s) for the area(s) of interest. The shapefile(s) need to be
located in a subfolder of the working directory named "AOIS".

This tool is set up to run statistics for as many areas of interest as the user provides. We used country
boundaries for Indonesia and Gambia from the GADM 3.6 Dataset, available for download here:
[https://gadm.org/download_country_v3.html](https://gadm.org/download_country_v3.html).

The Indonesia boundary is IND.14.13 and the Gambia Boundary is GMB.2.
The Indonesia boundary is IND.14.13 and the Gambia boundary is GMB.2.

These inputs will need to be updated if and when GFW switches to a newer version of GADM.

| Dataset | Directory | Description |
|---------------------|---------------------------|-----------------------------------------|
| Area(s) of Interest | {working_directory}/AOIS/ | Shapefiles for the area(s) of interest. |


#### User-Specified Parameters:
You must update the constants_and_names.py file with the path to your working_directory. This is the folder which
contains your areas of interest and where all of the data and results will be saved. There are a number of other
arguments in the constants_and_names.py file that users have the option to update. A description of each argument is
detailed below:

| Argument | Description | Type |
|-------------------------|------------------------------------------------------------------------------|------------|
| working_directory | Directory which contains the AOIS subfolder. | String |
| overwrite_arcgis_output | Whether or not you want to overwrite previous arcpy outputs. | Boolean |
| loss_years | Number of years of tree cover loss in the TCL dataset. | Integer |
| model_run_date | s3 folder where per-pixel outputs from most recent model run are located. | String |
| tile_list | List of 10 x 10 degree tiles that overlap with all aoi(s). | List |
| tile_dictionary | Dictionary that matches each country to their overlapping tiles. | Dictionary |
| extent | Which tile set(s) to download for zonal stats (options: forest, full, both). | String |
| tcd_threshold | List of tree cover density thresholds to mask by. | List |
| gain | Whether to include tree cover gain pixels in masks. | Boolean |
| save_intermediates | Whether to save intermediate masks (useful for troubleshooting). | Boolean |
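
As a rough illustration of how these arguments might be set, here is a minimal sketch. Only the argument names come from the table above. Every value is a hypothetical placeholder, the dictionary keys are an assumed format, and the actual constants_and_names.py may be organized differently.

```python
import os

# Hypothetical example values; only the argument names come from the table above.
working_directory = "path/to/working_directory"  # placeholder, replace with your own path
aois_folder = os.path.join(working_directory, "AOIS")  # AOIS subfolder described in the README

overwrite_arcgis_output = True
loss_years = 22                # placeholder: years of loss in your TCL dataset
model_run_date = "YYYYMMDD"    # placeholder: s3 folder of the model run you want

# Assumed key format; maps each area of interest to its overlapping 10 x 10 degree tiles.
tile_dictionary = {"Indonesia": ["00N_110E"], "Gambia": ["20N_20W"]}
tile_list = sorted({tile for tiles in tile_dictionary.values() for tile in tiles})

extent = "both"                # one of: "forest", "full", "both"
tcd_threshold = [30]           # list of tree cover density thresholds
gain = True                    # include tree cover gain pixels in masks
save_intermediates = False     # keep intermediate masks for troubleshooting

# Light sanity checks so a typo fails early rather than midway through a run.
if extent not in ("forest", "full", "both"):
    raise ValueError(f"Unexpected extent: {extent!r}")
if not all(isinstance(t, int) for t in tcd_threshold):
    raise TypeError("tcd_threshold should be a list of integers")
```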


### Datasets

#### Carbon Flux Model Data:

Three separate outputs from the Carbon Flux Model, each at two different extents, are used in as inputs in
this tool. This is a total of six different inouts. Inputs include gross emissions (all gasses), gross removals
(CO2), and net flux (CO2e). All are in inuts Mg / pixel. Calculations are run using both the forest extent and
full extent outputs.
Three separate outputs from the Carbon Flux Model, each with two different extents, are used as inputs in
this tool. This is a total of six different possible inputs. Inputs include gross emissions (all gasses),
gross removals (CO2), and net flux (CO2e). All are in units of Mg / pixel. You have the option to calculate
zonal statistics according to tile extent: forest extent only, full extent only, or both extents.

| AOI | Extent | Type | Units | Tile |
|-----|--------|-----------------|---------------|----------|
@@ -45,13 +80,14 @@

Other auxiliary inputs for this tool include:

| Dataset | Use Description |
|----------------------|-----------------------------------------------------------------------------------------|
| Tree Cover Loss | Used to calculate annual emissions. |
| Tree Cover Density | Used to create density threshold mask. Default set to 30> or greater |
| Tree Cover Gain | Used to create tree cover gain mask. Areas of tree cover gain included in mask |
| Mangrove Extent | Used to create Mangrove mask. Areas of mangrove included in mask. |
| Pre-2000 Plantations | Used to create Pre-2000 plantations mask. Pre-2000 plantations masked from calculations |
| Dataset | Use Description |
|----------------------|------------------------------------------------------------------------------------------|
| Tree Cover Gain | Used to create tree cover gain mask. |
| Above Ground Biomass | Used to filter tree cover gain mask to only pixels that contain biomass. |
| Tree Cover Density | Used to create density threshold mask. |
| Mangrove Extent | Used to create Mangrove mask. Areas of mangrove included in mask. |
| Pre-2000 Plantations | Used to create Pre-2000 plantations mask. Pre-2000 plantations masked from calculations. |
| Tree Cover Loss | Used to calculate annual emissions. |


### Outputs:
@@ -62,58 +98,60 @@
### Code Summary

#### calculate_zonal_statistics
This file is for running the code in its entirety. The only input necessary from the user is the environemnt
path. Assuming all input datasets and subfolders are organized in the workspace correctly, this script will
execute all functions in the repository and produce output csvs.
This file is for running the code in its entirety. This script will execute all functions in the repository
consecutively and produce output csvs.

#### constants_and_names
This file stores all of the input arguments provided to the functions. Any changes to the arguments in this file
will be applied to all scripts in the repository.

#### funcs
This file stores all of the functions used in the tool. Any edits to functions would be made in this file.

#### components

This folder houses individual scripts for running separate functions. The only input necessary in these scripts
is the workspace environemnt path. These can be useful for running particular functions separately and testing edits
/ troubleshootins. Each function is described below.
This folder houses individual scripts for running separate functions. These can be useful for running particular
functions separately and for testing edits / troubleshooting. Each function is described below.

##### 01 Create Masks
##### 01 Download Files
This script creates the folder structure (other than the AOI folder) and downloads all of the required datasets from
s3 using the paths provided in the constants_and_names file. You will need to set your AWS_ACCESS_KEY and
AWS_SECRET_ACCESS_KEY in your environment variables for this step to work (assuming you have s3 copy permissions).
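
Before starting the download, it can help to confirm the credentials are visible to Python. The sketch below is not part of the repository. The helper name `missing_aws_credentials` is hypothetical. Note that standard AWS tooling (the AWS CLI and boto3) reads `AWS_ACCESS_KEY_ID`, so the check accepts either that name or the `AWS_ACCESS_KEY` spelling used in this README.

```python
import os


def missing_aws_credentials(env=None):
    """Return the names of AWS credential variables that appear to be unset.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    missing = []
    # Accept either the standard name or the spelling used in this README.
    if not any(env.get(name) for name in ("AWS_ACCESS_KEY_ID", "AWS_ACCESS_KEY")):
        missing.append("AWS_ACCESS_KEY_ID")
    if not env.get("AWS_SECRET_ACCESS_KEY"):
        missing.append("AWS_SECRET_ACCESS_KEY")
    return missing


if __name__ == "__main__":
    unset = missing_aws_credentials()
    if unset:
        print("Set these environment variables before downloading:", ", ".join(unset))
    else:
        print("AWS credentials found in the environment.")
```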

##### 02 Create Masks
This script uses data on tree cover density, tree cover gain, mangrove extent, WHRC biomass, and pre-2000 plantations
to replicate the masks that are used in GFW data processing. This facilitates direct comparison with results from the GFW
dashboard, geotrellis client, and GFW API. The script creates masks based on criteria for each input dataset and saves these
masks in a sub directory. These masks are used later as extent inputs in the Zonal Statistics Masked script.
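
To make the masking criteria concrete, here is a per-pixel sketch in plain Python. It is an assumption about how the criteria combine, pieced together from the dataset descriptions in this README. It is not the arcpy implementation in funcs, and the function name and argument names are hypothetical.

```python
def pixel_in_mask(tcd, gain, biomass, mangrove, plantation, tcd_threshold, include_gain):
    """Decide whether a single pixel falls inside the final mask.

    Assumed logic (not confirmed against funcs.create_masks):
      - keep pixels whose tree cover density exceeds the threshold,
      - optionally keep gain pixels, but only where biomass is present,
      - keep mangrove pixels,
      - drop anything flagged as a pre-2000 plantation.
    """
    dense_enough = tcd > tcd_threshold
    gain_with_biomass = include_gain and gain == 1 and biomass > 0
    is_mangrove = mangrove == 1
    is_plantation = plantation == 1
    return (dense_enough or gain_with_biomass or is_mangrove) and not is_plantation


# Example: a sparse pixel with gain and biomass is kept only when gain is included.
kept = pixel_in_mask(tcd=10, gain=1, biomass=5, mangrove=0, plantation=0,
                     tcd_threshold=30, include_gain=True)
```

In the real tool the same idea would be applied to whole rasters rather than one pixel at a time.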

##### 02 Zonal Stats
This script calulates zonal statistics for each area of interest and carbon dataset combination without applying any
additional masking.

##### 03 Zonal Stats Masked
This script calculates zonal statistics for each area of interest and carbon dataset combination and applies each of
two masks:

_tcd: considers tree cover density > 30
_tcd_gain: considers tree cover density > 30 or gain = 1
This script calculates zonal statistics for each area of interest and carbon dataset combination and applies each
mask saved in the Mask/ Mask directory. The number of masks depends on the number of tcd_threshold values you indicated
and whether or not you set the save_intermediates flag to True. At minimum, this will include one final mask for each
tcd_threshold value; at maximum, this will be the number of tcd_threshold values multiplied by the number of
intermediate masks (this varies depending on whether or not the area of interest includes mangroves and/or
pre-2000 plantations).
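
The mask count described above is simple arithmetic, sketched here for one area of interest. The function name and the `masks_per_threshold` argument are hypothetical. The latter stands for however many masks are written per threshold when intermediates are saved, which depends on the area of interest.

```python
def expected_mask_count(tcd_threshold, save_intermediates, masks_per_threshold):
    """Estimate how many masks will be applied for one area of interest.

    Follows the README's description: one final mask per threshold at minimum,
    or thresholds multiplied by the per-threshold mask count when intermediates are saved.
    """
    if not save_intermediates:
        return len(tcd_threshold)
    return len(tcd_threshold) * masks_per_threshold


# Two thresholds, no intermediates: only the two final masks.
minimum = expected_mask_count([0, 75], save_intermediates=False, masks_per_threshold=4)

# One threshold with intermediates, assuming four masks are written per threshold.
maximum = expected_mask_count([30], save_intermediates=True, masks_per_threshold=4)
```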

##### 04 Zonal Stats Annualized
This script calculates annual emissions in each area of interest
This script calculates annual emissions in each area of interest using the tree cover loss dataset.

##### 05 Zonal Stats Cleaned
This script utilizes pandas to compile the results of all analyses and export them into a user-friendly csv file
This script utilizes pandas to compile the results of all analyses and export them into a user-friendly csv file.

### Running the Code
To run the code, you will need to set up a workspace with inputs organized into the correct directories. A future update
will include a script which automatically downloads these datasets and creates the correct directories. Until this is available,
please reach out to [email protected] for more information.
To run the code, you will need to set up a workspace with the AOI inputs organized into the correct directory,
update the user inputs section of the constants_and_names.py file, and provide your AWS_ACCESS_KEY and
AWS_SECRET_ACCESS_KEY in your environment variables.

This code is built on arcpy, which will require a valid ArcGIS license to run.

Once all inputs and directories are set up, the only input required from the user is the workspace path.
This code is built on arcpy, which will require a valid ArcGIS license to run.

### Other Notes
Updates in progress include...

A data download / prep script that will automatically download new data inputs from
s3 and build out the correct folder structure within a given workspace.

Additional functions to clean and export annualized results
- Currently, the annual zonal stats do not sum to the total emissions using the TCL dataset from s3,
but they do when using the previous TCL clipped rasters. For now, reach out to Erin Glen or Melissa Rose for
the previous TCL clipped rasters.
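
One way to check whether a given run is affected is to compare the sum of the annual values against the total. This is a minimal sketch that assumes you have already read the annual emissions and the total for one area of interest out of the output csv. The function name and the tolerance are hypothetical choices.

```python
import math


def annual_matches_total(annual_emissions, total_emissions, rel_tol=1e-6):
    """Return (annual_sum, matches) for a list of per-year emissions.

    `matches` is True when the annual values sum to the total within `rel_tol`.
    """
    annual_sum = math.fsum(annual_emissions)
    return annual_sum, math.isclose(annual_sum, total_emissions, rel_tol=rel_tol)


# Illustrative numbers only, not real model output.
annual_sum, matches = annual_matches_total([1.5, 2.5, 1.0], total_emissions=5.0)
```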

#### Contact Info
Erin Glen - [email protected]
Erin Glen - [email protected]
Melissa Rose - [email protected]
40 changes: 12 additions & 28 deletions calculcate_zonal_stats.py
@@ -1,38 +1,22 @@
import arcpy
import os
from funcs import create_masks, zonal_stats_clean, zonal_stats_masked, zonal_stats, zonal_stats_annualized
import constants_and_names as cn
from funcs import download_files, create_masks, zonal_stats_masked, zonal_stats_annualized, zonal_stats_clean

"""
Set the workspace to the folder which contains carbon value rasters for both AOIs:
"""
arcpy.env.overwriteOutput = True
arcpy.env.workspace = r"C:\GIS\Data\Carbon\CarbonFlux_QA_2023"
#Execute Download File...
print("Step 1: Downloading Files... \n")
download_files()

#Execute Create Masks...
print("Step 1: Creating Masks... \n")
arcpy.env.overwriteOutput = True
create_masks()

#Execute Calculate Zonal Stats...
print("Step 2: Calculating Zonal Stats... \n")
input_folder = os.path.join(arcpy.env.workspace,"Input","AOIS")
zonal_stats(input_folder)
print("Step 2: Creating Masks... \n")
create_masks(cn.tcd_threshold, cn.gain, cn.save_intermediates)

#Execute Calculate Zonal Stats Masked...
print("Step 3: Calculating Zonal Stats with Masks... \n")
input_folder = os.path.join(arcpy.env.workspace,"Input","AOIS")
zonal_stats_masked(input_folder)
zonal_stats_masked(cn.aois_folder, cn.input_folder, cn.mask_output_folder, cn.outputs_folder)

#Execute Calculate Zonal Stats Annualized...
print("Step 3: Calculating Zonal Stats Annualized... \n")
annual_input_folder = os.path.join(arcpy.env.workspace, "TCL")
zonal_stats_annualized(annual_input_folder)
print("Step 4: Calculating Zonal Stats Annualized... \n")
zonal_stats_annualized(cn.tcl_clip_folder, cn.input_folder, cn.mask_output_folder, cn.annual_folder)

#Execute Zonal Stats Clean...
print("Step 3: Cleaning Zonal Stats... \n")
input_folders = [
os.path.join(arcpy.env.workspace, "Outputs", "00N_110E"),
os.path.join(arcpy.env.workspace, "Outputs", "20N_20W"),
os.path.join(arcpy.env.workspace, "Outputs", "Annual")
]
zonal_stats_clean(input_folders)
print("Step 5: Cleaning Zonal Stats... \n")
zonal_stats_clean()