Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SEAS5 z-score and ASAP warnings #13

Open
wants to merge 9 commits into
base: switch-ecmwf
Choose a base branch
from
Open

Conversation

t-downing
Copy link
Contributor

@t-downing t-downing commented Feb 14, 2025

Two main things:

  1. Use SEAS5 z-score instead of absolute values for forecast trigger
    • Processing of z-score and threshold calc all in analysis/ecmwf_switch_zcore
    • Main things to check are that I'm doing the z-score calculation correctly, and plotting the RP correctly
    • I also do some quick RP modeling- not critical that this is perfect, it's really just tweaking the thresholds.
  2. Use JRC ASAP warnings for observational trigger
    • Basic processing of raw file from ASAP in src/datasources/asap.py
    • Everything else in exploration/asap_warnings
      • Main things to check is that I'm reading in the warnings correctly, and calculating the RP correctly
    • Also I combine the forecast and observational triggers in the last section of exploration/asap_warnings
      • Main thing to check is that I'm calculating the combined RP correctly

Corresponding slide deck here.

@t-downing t-downing changed the base branch from main to switch-ecmwf February 14, 2025 18:59
@t-downing t-downing changed the title Asap vegetation SEAS5 z-score and ASAP warnings Feb 19, 2025
@t-downing t-downing marked this pull request as ready for review February 19, 2025 23:52
@t-downing t-downing requested a review from hannahker February 19, 2025 23:52
Copy link

@hannahker hannahker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still reviewing the code in more detail, but a couple reproducibility points and curious generally about SPI-3... Was doing a bit of reading and isn't SPI-3 calculated a bit differently than a z-score? Wouldn't we want to fit the precip points to a gamma distribution, normalize, etc? Not questioning the validity of a z-score for these purposes, but just wondering if it's misleading to conflate the two. It also looks like Copernicus has a SPI-3 forecast product based on SEAS5. Would it be worth investigating this instead of calculating ourselves?

].min(axis=1)
else:
raise ValueError("invalid crop_range")
if biomass_only:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not defined

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And note doesn't seem to work if I set this to True:

KeyError                                  Traceback (most recent call last)
File ~/Desktop/pa-aa-bfa-drought/venv/lib/python3.13/site-packages/pandas/core/indexes/base.py:3805, in Index.get_loc(self, key)
   3804 try:
-> 3805     return self._engine.get_loc(casted_key)
   3806 except KeyError as err:

File index.pyx:167, in pandas._libs.index.IndexEngine.get_loc()

File index.pyx:175, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/index_class_helper.pxi:70, in pandas._libs.index.Int64Engine._check_type()

KeyError: 'indicator'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[16], line 27
     25 if True:
     26     dff = dff["indicator"]
---> 27 dff = dff[dff["indicator"] >= alert_level]
     28 adm_counts = (
     29     dff.groupby("year")
     30     .agg(
   (...)
     34     .reset_index()
     35 )
     36 display(adm_counts)

File ~/Desktop/pa-aa-bfa-drought/venv/lib/python3.13/site-packages/pandas/core/series.py:1121, in Series.__getitem__(self, key)
   1118     return self._values[key]
   1120 elif key_is_scalar:
-> 1121     return self._get_value(key)
   1123 # Convert generator to list before going through hashable part
   1124 # (We will iterate through the generator there to check for slices)
   1125 if is_iterator(key):

File ~/Desktop/pa-aa-bfa-drought/venv/lib/python3.13/site-packages/pandas/core/series.py:1237, in Series._get_value(self, label, takeable)
   1234     return self._values[label]
   1236 # Similar to Index.get_value, but we do not fall back to positional
-> 1237 loc = self.index.get_loc(label)
   1239 if is_integer(loc):
   1240     return self._values[loc]

File ~/Desktop/pa-aa-bfa-drought/venv/lib/python3.13/site-packages/pandas/core/indexes/base.py:3812, in Index.get_loc(self, key)
   3807     if isinstance(casted_key, slice) or (
   3808         isinstance(casted_key, abc.Iterable)
   3809         and any(isinstance(x, slice) for x in casted_key)
   3810     ):
   3811         raise InvalidIndexError(key)
-> 3812     raise KeyError(key) from err
   3813 except TypeError:
   3814     # If we have a listlike key, _check_indexing_error will raise
   3815     #  InvalidIndexError. Otherwise we fall through and re-raise
   3816     #  the TypeError.
   3817     self._check_indexing_error(key)

KeyError: 'indicator'

):
specific_path = None
if data_type == "raw":
if data_type == "raw":

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repetition here

@@ -12,7 +12,7 @@
from azure.storage.blob import ContainerClient, ContentSettings
from dotenv import load_dotenv

load_dotenv()
load_dotenv(override=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably want to avoid overwriting other variables you have already defined. Which is this for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just for checking things with the new blob creds (I generally don't use load_dotenv and have them all stored on my .zshenv so they're pre-loaded into the terminal, so this was to quickly overwrite the old ones without having to restart the terminal running my Jupyter).

Anyways, this is generally not needed so I'll remove it.

@t-downing
Copy link
Contributor Author

Still reviewing the code in more detail, but a couple reproducibility points and curious generally about SPI-3... Was doing a bit of reading and isn't SPI-3 calculated a bit differently than a z-score? Wouldn't we want to fit the precip points to a gamma distribution, normalize, etc? Not questioning the validity of a z-score for these purposes, but just wondering if it's misleading to conflate the two. It also looks like Copernicus has a SPI-3 forecast product based on SEAS5. Would it be worth investigating this instead of calculating ourselves?

Yeah you're right, I think it's misleading to call this simplified version the SPI. To be honest, we could just use the anomaly instead of the z-score or SPI; I think this is more easily understood anyways. The only requirement was that we don't use the absolute values of the precipitation forecast, since these shouldn't be compared across areas with different precipitation. So the anomaly would fit the bill too. Anyways we can discuss further on a call.

```

```python
df_seas5 = df_seas5_zscore_q.copy()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this?

### Plot historical activations

```python
thresh_3, thresh_7

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These aren't defined yet

Fixing thresholds based on modeled RP (values calculated a few cells down)

```python
thresh_3 = -0.9

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a bit dangerous to hard code here? If these are pulled from the modelled RP can we un-hard code them?

```

```python
df_pivot.plot(x="year", y=["issued_3", "issued_7"])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm ok yes looking at this plot again I think 2000 seems like a very reasonable cut off year

import numpy as np
import ochanticipy.utils.raster # noqa: F401

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would still be good to remove the ochanticipy dependencies if possible

Copy link

@hannahker hannahker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I think looks good to merge here to me! I've confirmed all the code runs and left some notes on reproducibility. Methodologically, everything seems logical to me, although I'll note that some of the plotting and ASAP data processing is challenging for me to follow (mostly because I'm not super familiar with that dataset).

As we discussed, I think it'd be interesting to look into the precipitation anomaly in addition to the z-score. I have the gut feeling that this could be more valid on a non-normal distribution than the z-score, although it's hard to pinpoint exactly why (and I'm still warming up my stats brain after some years of unuse 😆 ).

```

```python
# this is literally just to make those little boxes on the plot

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😆

def get_alert_gr_int(alert_gr_str):
try:
return int(alert_gr_str.removeprefix("Warning group "))
except ValueError:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might potentially be too general and mask other errors that we aren't expecting... can we be more specific here?

```python
# set first possible trigger dekad to 3rd dekad of July
min_dekad = 21
# set last possible trigger dekad to 3rd dekad of July

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a confusing comment here. This dekad is different?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants