note on inflation, allow_missing_in_clm, and groups #277
Comments
i think the intent was that this would skip computing inflation on an item-by-item basis if any ensemble members had missing values for that item. if this is not skipping the entire group then i think something's not right. edit: if any members have missing_r8 and the number of groups is > 1, then i'd think all groups for this item should be skipped? now i'm more confused.
i looked at this code again (and i updated my last comment). i'd forgotten how groups work. this code is trying to compute inflation on a per-item basis, but if your number of groups is > 1 then it computes the values on subsets of the ensemble members, right? i have to think in concrete numbers sometimes. if your ensemble size is 30 and your number of groups is 2, then it computes inflation for ensemble members 1-15, and then 16-30, as if you had two separate ensembles of 15 members each. is that right? if so, then how does adaptive inflation work? because you only have a single inflation value per item written out at the end of the assimilation.
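To make the concrete numbers above easy to check, here is a minimal sketch of the member ranges per group. It assumes the ensemble is split into equal, contiguous blocks, as the comment describes; `group_bounds` is a hypothetical helper for illustration, not a DART routine.

```python
def group_bounds(ens_size, num_groups):
    """Return 1-based inclusive (bottom, top) member indices for each group.

    Assumption: groups are equal-sized contiguous blocks of members.
    """
    if ens_size % num_groups != 0:
        raise ValueError("ens_size must be divisible by num_groups")
    grp_size = ens_size // num_groups
    bounds = []
    for group in range(1, num_groups + 1):
        bottom = (group - 1) * grp_size + 1
        top = bottom + grp_size - 1
        bounds.append((bottom, top))
    return bounds


# 30 members in 2 groups: members 1-15 and 16-30
print(group_bounds(30, 2))
```

With a single group the whole ensemble is one block, so the per-group statistics are the ordinary ensemble statistics.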
Yeah, that is the way I was thinking about allow_missing_in_clm. If one of ensemble members 1-15 has a missing_r8 but 16-30 don't, I think you would inflate 16-30 but not 1-15. Who cares? I don't know. I think it is worth making a note of the places where it is not clear how groups interact with other options, mostly to make a plan for refactoring. For example, this loop
I think the reset of RTPS on line 1646 (edit) means you only get the inflation from the last group in the output file.
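A toy sketch of the overwrite pattern being described, assuming the loop really does store each group's result in one per-item slot. The function name and the use of spread as the stored quantity are illustrative assumptions; this does not reproduce the actual DART loop.

```python
import statistics


def record_per_item_value(ensemble, bounds):
    """Loop over groups, storing each group's spread in ONE per-item slot."""
    per_item_value = None
    for bottom, top in bounds:
        members = ensemble[bottom - 1:top]  # bounds are 1-based inclusive
        # The single slot is overwritten on every pass through the loop.
        per_item_value = statistics.pstdev(members)
    return per_item_value


ensemble = [1.0, 3.0, 10.0, 30.0]
bounds = [(1, 2), (3, 4)]
# Only the second group's spread survives the loop.
print(record_per_item_value(ensemble, bounds))
```

If that is what happens, keeping one value per group (or combining them explicitly) would be needed to make the output reflect every group.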
in looking at the inflation code i believe for adaptive inflation kinds 2 and 5 that the
note on missing_8 footprint in quad_utils_mod #249
models calling set_missing_ok_status as of Jan 6th 2022: ../../../models/noah/model_mod.f90:252:call set_missing_ok_status(.true.)
This is not so good.
eons ago i think we decided that instead of filter setting the missing status, model_mod should have the control. that accomplishes two good things. 1) any executable that uses the model_mod will have a consistent missing status set, and 2) since supporting missing data values seems to be model-dependent it's the more logical place to have it. but i don't think the main branch ever moved where missing was being controlled.
I'm not sure I buy this.
does PMO use the same init code as filter? also dart_to_model and model_to_dart - all of those main modules will have to have the same flag or they won't treat the missing values consistently. it's not only filter, is it?
maybe maybe, let me think about this and pick your brains.
this is an attempt at a brain dump on "invalid data in the state" issues. i'm not sure where to put this comment but i'll add it here and you can cut and paste it other places if that's better.

Missing values
model_mod::interpolate()
quad utilities code
model_mod::get_close()
model_to_dart and dart_to_model
invalid data in state without an aux array

if MISSING_R8 is not allowed to be in the state the assimilation and inflation tests can be removed, but there still needs to be a way to determine if a location contains invalid data, both for the forward operator interpolation code and for writing the updated state back to the model data files. one solution is to add/compute an aux array in the model_mod code to flag invalid state locations. the other is to construct data arrays that do not have invalid points (e.g. constructing sums of valid values) and recreating the needed model data with a separate dart_to_model program. but it seems that the goal should be to avoid depending on special values in the state data.
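A minimal sketch of the aux-array idea, assuming the invalid points can be identified once when the model file is read. `FILL_VALUE` and all three function names are hypothetical; the placeholder 0.0 is an arbitrary choice made only so the state holds no special value.

```python
FILL_VALUE = -888888.0  # hypothetical fill value found in the model file


def split_state(raw):
    """Return (state, valid): state holds no sentinel, valid flags usable points."""
    valid = [v != FILL_VALUE for v in raw]
    state = [v if ok else 0.0 for v, ok in zip(raw, valid)]
    return state, valid


def update_state(state, increments, valid):
    """Apply increments only where the aux mask says the point is valid."""
    return [s + d if ok else s for s, d, ok in zip(state, increments, valid)]


def restore_for_output(state, valid):
    """Put the fill value back before writing to the model data file."""
    return [s if ok else FILL_VALUE for s, ok in zip(state, valid)]


raw = [1.0, FILL_VALUE, 3.0]
state, valid = split_state(raw)
updated = update_state(state, [0.5, 0.5, 0.5], valid)
print(restore_for_output(updated, valid))
```

The point of the design is that the update and any interpolation code consult `valid` rather than comparing state values against a sentinel.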
great point about differentiating between all ensemble members having invalid data at a location vs some members having valid data and others not. here are two other thoughts about missing values just for completeness.

bookkeeping issues: if all members have the same invalid locations, an aux array of model grid size is sufficient. however if each ensemble member could have different invalid locations, an array (ensemble size x model grid size) or worse (ensemble size x model grid size x number of variables which can have missing values) would be required to track them. or a list of (location, member number, variable) would have to be kept and searched. the former could double the size of the state, the latter could be slow to search.

forward operators where some members fail, some succeed:
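The bookkeeping distinction above (all members invalid at a location vs only some) can be sketched as follows. This is an illustration with hypothetical names and a single variable; `member_masks[m][i]` is assumed to be True when member `m` has invalid data at location `i`.

```python
def classify_locations(member_masks):
    """Label each location 'all', 'some', or 'none' by how many members are invalid."""
    n_members = len(member_masks)
    counts = [sum(column) for column in zip(*member_masks)]
    return ["all" if c == n_members else "none" if c == 0 else "some"
            for c in counts]


def shared_mask_suffices(member_masks):
    """True when one grid-sized aux array can describe every member."""
    return "some" not in classify_locations(member_masks)


def sparse_invalid_list(member_masks):
    """Alternative bookkeeping: a list of (location, member) pairs to search."""
    return [(i, m)
            for m, mask in enumerate(member_masks)
            for i, bad in enumerate(mask) if bad]


masks = [[True, False, False],
         [True, True, False]]
print(classify_locations(masks))
print(shared_mask_suffices(masks))
print(sparse_invalid_list(masks))
```

The dense form costs ensemble size x grid size flags per variable, while the sparse list is small when invalid points are rare but has to be searched.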
This is a note about missing values.
allow_missing_in_clm is in the inflation handle.
filter:
inflate_ens:
So if some groups have missing values, but some don't, you will apply inflation to some groups but not others.
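A small sketch of the behavior described here: inflation is applied group by group, and a group is skipped when any of its members is missing. Assumptions: the missing sentinel value, deterministic inflation about the group mean, and equal contiguous groups. The function names are hypothetical and this is not the actual inflate_ens code.

```python
MISSING_R8 = -888888.0  # assumed sentinel value for missing data


def inflate_group(members, inflate):
    """Deterministic inflation: scale deviations from the group mean by sqrt(inflate)."""
    mean = sum(members) / len(members)
    sd_inflate = inflate ** 0.5
    return [mean + (x - mean) * sd_inflate for x in members]


def inflate_item_by_group(ensemble, num_groups, inflate, allow_missing=True):
    """Inflate one state item per group; skip groups containing a missing value."""
    grp_size = len(ensemble) // num_groups
    out = list(ensemble)
    skipped = []
    for g in range(num_groups):
        lo, hi = g * grp_size, (g + 1) * grp_size
        members = ensemble[lo:hi]
        if allow_missing and any(x == MISSING_R8 for x in members):
            skipped.append(g + 1)  # report 1-based group numbers
            continue
        out[lo:hi] = inflate_group(members, inflate)
    return out, skipped


# Group 1 contains a missing value and is left alone; group 2 is inflated.
print(inflate_item_by_group([1.0, MISSING_R8, 2.0, 4.0], num_groups=2, inflate=4.0))
```

In this toy case the same item ends up inflated for one half of the ensemble and untouched for the other, which is the inconsistency the note is pointing at.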