EAMxx: allow multi-output diags #6935
base: master
This is a good start, but of course there's more to do. I would add more beef before merging, since until then it doesn't really add anything (nor does it do much that we can test).
Here's what I think should be added:
- In IO, we need to build diags smartly. Right now, we build one diag object for each output field, but if one object can compute 2+ fields, we should not build 2+ copies of the same diag. This requires IO to do 2 passes when building diags: the first pass to figure out which diags to build, along with all the input params, and the second to actually build the diags.
- I would convert one of the multi-purpose diags to allow computing N fields. E.g., the number path or water path diags, since they are very simple. Then, we can test that they work fine when computing 1 or 2+ fields.
- Diags like NumberPath will now have to change their param list organization. E.g., that diag now checks the param `Number Kind` to figure out which one to compute. Instead, it should now have 3 inputs and check them separately, e.g., something like `Compute Liq`, `Compute Ice`, and `Compute Rain`.
- We currently use `name()` from the diag as a proxy for the diagnostic output name. With 2+ fields, this is no longer the case, so we must ensure that we DON'T use that string to refer to the field. Instead, the `name()` method should refer to the class itself.
For the first point, I envision something like this:
map<string,ParameterList> params;
for (f in diag_fields) {
  diag_factory_key = find_key_for(f); // this is the key to pass to the diag factory.
  params[diag_factory_key].set(...);
}
for (f in diag_fields) {
  diag_factory_key = find_key_for(f);
  create_diag(diag_factory_key, params[diag_factory_key]);
}
I didn't do a great job explaining, but basically, the first loop is needed because different fields will set multiple params for the same diag class (e.g., the `Compute Liq`/`Compute Ice`/`Compute Rain` mentioned above). Only after ALL params have been set can we proceed to create the diags.
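The two-pass idea above can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: `Params`, `Diag`, `Request`, `find_request_for`, and `build_diags` are hypothetical stand-ins (not EAMxx or EKAT APIs), and the field names are made up for the example. Unlike the pseudocode, the second pass here iterates over the accumulated factory keys rather than over the fields again, so each diag is created exactly once.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a parameter list: a set of named boolean flags.
using Params = std::map<std::string, bool>;

// Hypothetical stand-in for a diagnostic object created by the factory.
struct Diag {
  std::string key;  // factory key (refers to the class, not to an output field)
  Params params;    // all params collected for this diag
};

// Hypothetical result of mapping an output field to a factory key and a flag.
struct Request {
  std::string factory_key;
  std::string flag;
};

// Hypothetical lookup: which diag class computes this field, and which flag enables it.
inline Request find_request_for(const std::string& field) {
  if (field == "LiqNumberPath")  return {"NumberPath", "Compute Liq"};
  if (field == "IceNumberPath")  return {"NumberPath", "Compute Ice"};
  if (field == "RainNumberPath") return {"NumberPath", "Compute Rain"};
  return {field, "Compute"};  // single-output diag: one key per field
}

inline std::vector<Diag> build_diags(const std::vector<std::string>& diag_fields) {
  // Pass 1: accumulate params per factory key; several fields may share a key.
  std::map<std::string, Params> params;
  for (const auto& f : diag_fields) {
    const Request req = find_request_for(f);
    params[req.factory_key][req.flag] = true;
  }
  // Pass 2: create one diag per key, only after ALL params have been set.
  std::vector<Diag> diags;
  for (const auto& kv : params) {
    diags.push_back(Diag{kv.first, kv.second});
  }
  return diags;
}
```

With this sketch, requesting the liquid and ice number paths plus one unrelated single-output field yields two diag objects, not three.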
I would still want to ensure this minor addition isn't breaking some corner test. Could you run the CI when you have a moment?
I agree with everything, but I am not sure about this one. Let me try to think about it. My goal is to keep edits as noninvasive as possible. However, this part is more or less necessary if we want to compute diags smartly. Let me think more carefully
The single-output diags can still use `name()` internally when creating the diag, I wasn't hinting at that. I just want to make sure that from the outside we are not using
@bartgol I'm confused why your approval doesn't allow the testing to move forward
No, don't worry about the approval. This is NOT ready to be merged. I am going to address Luca's comments. I think Luca approved so that the CI runs.
dismissing review since more work is needed. Will re-request review after I address the mods needed.
54ca49a to 40a88b7
9a476f1 to c477c30
@bartgol PTAL; there's very likely something I am missing on the IO side, especially surrounding the `name()` part. A few notes on design choices (feel free to push back, request drastic changes, or suggest improvements)
In general, I think we should limit functionality to what we want to support and not leave it open-ended and fragile. Adding Aaron to review as well, since he might be interested in weighing in.
b5d627c to 68188d0
Submitting some comments. The biggest concern is the comment in scorpio_output.cpp. Then the offline comment about the `create_diagnostic` utility.
" - valid values: Liq, Ice, Rain\n");
m_kinds = m_params.get<std::vector<std::string>>("Number Kinds");
// check if "Liq" or "Ice" or "Rain" is in m_kinds
EKAT_REQUIRE_MSG(
Imho, this check is a bit redundant with the one below. In the for loop below, you already throw for anything other than "Liq", "Ice", and "Rain". So maybe (in the spirit of DRY) here you could just check that `m_kinds.size()>0`, deferring to the for loop for the validity of the values.
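A minimal sketch of that suggestion, assuming a free function instead of the diag constructor: only non-emptiness is checked up front, and the loop is the single place that rejects unknown values. `KindFlags` and `parse_kinds` are hypothetical names, and `std::invalid_argument` stands in for the `EKAT_REQUIRE_MSG` check used in the actual code.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record of which number paths were requested.
struct KindFlags {
  bool liq  = false;
  bool ice  = false;
  bool rain = false;
};

// Hypothetical helper: validates and parses the "Number Kinds" list.
inline KindFlags parse_kinds(const std::vector<std::string>& kinds) {
  // Up-front check: only require that at least one kind was requested.
  if (kinds.empty()) {
    throw std::invalid_argument("Number Kinds must contain at least one entry");
  }
  KindFlags flags;
  // The loop is the only place that decides whether a value is valid.
  for (const auto& k : kinds) {
    if (k == "Liq") {
      flags.liq = true;
    } else if (k == "Ice") {
      flags.ice = true;
    } else if (k == "Rain") {
      flags.rain = true;
    } else {
      throw std::invalid_argument("Invalid number kind '" + k +
                                  "' (valid values: Liq, Ice, Rain)");
    }
  }
  return flags;
}
```

Keeping the list of valid values in one spot means a future kind only needs to be added to the loop.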
@@ -196,6 +210,9 @@ void run(std::mt19937_64 &engine) {
REQUIRE(std::abs(rnp_h(icol) - qndr_prod) < macheps);
}
}
for(int icol = 0; icol < ncols; icol++) {
What about checking that the 3 fields in the `cnp` diag match exactly the fields in each of the `lnp`/`inp`/`rnp` diagnostics? You can prob just use `views_are_equal` on the fields, and be done.
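The point of an exact comparison is that the multi-output diag should reproduce the single-output results bit for bit, so no tolerance is needed. The sketch below illustrates the idea on plain vectors; `same_values` is a hypothetical stand-in written for this example and is not the `views_are_equal` utility itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an exact field comparison: true only if both
// arrays have the same length and every entry is identical.
inline bool same_values(const std::vector<double>& a,
                        const std::vector<double>& b) {
  if (a.size() != b.size()) {
    return false;
  }
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i] != b[i]) {  // exact comparison on purpose, no tolerance
      return false;
    }
  }
  return true;
}
```

In a test, each field produced by the combined diag would be compared this way against the field from the matching single-output diag.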
@@ -1268,7 +1269,7 @@ compute_diagnostic(const std::string& name, const bool allow_invalid_fields)
for (auto f : diag->get_fields_in()) {
if (not f.get_header().get_tracking().get_time_stamp().is_valid()) {
// Fill diag with invalid data and return
diag->get_diagnostic().deep_copy(m_fill_value);
Can't find another place to add this comment, so I add it here.
A few lines above we do `m_diag_computed[name] = true;`. This will ensure that we don't recompute `name` if another diag has it as a dependency. We need to modify that line, so that it sets the value to true for ALL fields computed by `diag`.
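One way to picture the change requested above, as a hedged sketch: after a diag runs, every field it produced is flagged, not just the one that triggered the computation. `MultiDiag`, its `output_names` member, and `mark_computed` are hypothetical; the actual diag interface for listing output fields may look different.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a diagnostic that computes several fields.
struct MultiDiag {
  std::string name;                       // refers to the class, not to a field
  std::vector<std::string> output_names;  // all fields this diag computes
};

// Flag every output of the diag as computed, so that none of them is
// recomputed when another diag lists it as a dependency.
inline void mark_computed(const MultiDiag& diag,
                          std::map<std::string, bool>& computed) {
  for (const auto& field_name : diag.output_names) {
    computed[field_name] = true;
  }
}
```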
scream::set_diagnostic(all_diag_list, single_diag_list, multi_diag_list,
diag_map);

for(const auto &name : single_diag_list) {
I don't see why you need to distinguish between single- and multiple-output diagnostics. The two loops below are identical, so you could just loop over all the diags in one loop, no?
Agree; I just lazily didn't create a combined list ... the one above contains redundancies
I guess you could see how this was developing in my head but never reached the finish line ... so duplication and such are consequences. Will fix
std::regex pot_temp ("(Liq)?PotentialTemperature$");
std::regex vert_layer ("(z|geopotential|height)_(mid|int)$");
std::regex horiz_avg ("([A-Za-z0-9_]+)_horiz_avg$");
void set_diagnostic(const std::vector<std::string> &all_diag_list,
Naser and I had a private conversation on Slack. Too long to report here, just noting that we are discussing different takes on how to get a list of diags to create.
Currently, our AtmosphereDiagnostic class allows for only one field to be output by design; this was the intended use when we first started supporting these diagnostics. For example, currently, a diagnostic can be seen as `output1 = function ( { input1, input2, ... } )`. This PR allows additional outputs, for example `{ output1, output2, ... } = function ( { input1, input2, ... } )`.
More design details to follow...
[bfb]