EAMxx: Adds aerosols heterogeneous freezing calculations in P3 microphysics #6947

singhbalwinder · 2025-01-25T23:02:43Z

The heterogeneous freezing calculations from prognostic aerosols are
added to P3 microphysics. Setting `use_hetfrz_classnuc` to true
turns on these calculations. Otherwise, P3 uses the default
prescribed aerosol calculations.

[BFB] for EAM and EAMxx

github-actions · 2025-01-25T23:04:53Z

PR Preview Action v1.6.0
🚀 View preview at https://E3SM-Project.github.io/E3SM/pr-preview/pr-6947/
Built to branch `gh-pages` at 2025-02-06 00:34 UTC. Preview will be ready when the GitHub Pages deployment is complete.

singhbalwinder · 2025-01-25T23:08:09Z

TODO:

- Turn off this feature for default EAMxx
- Revive commented-out P3 tests after adding missing arguments in various function signatures

mahf708 · 2025-01-29T19:05:27Z

Quick comments:

- Please turn off the feature by default
- Please follow the do_ice_production procedure (as an example) for passing the flag inside
- Please hide most (if not all) additions inside if-else guards (with the new flag), for example, the add_required calls and such
- Please keep tests intact if you intend to integrate this
- Also, ensure you don't break PAM/MMF2 (I am ~~100%~~ almost certain you're currently breaking it)
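
To illustrate the guard pattern requested above, here is a minimal sketch. It is not the EAMxx process interface: `RuntimeOptions`, `required_fields`, and the `qc`/`nc` entries are hypothetical stand-ins, and only `use_hetfrz_classnuc` and `hetfrz_contact_nucleation_tend` come from this thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical runtime options struct; the flag defaults to off.
struct RuntimeOptions {
  bool use_hetfrz_classnuc = false;
};

// Hypothetical helper: names of the input fields a process would request.
inline std::vector<std::string> required_fields(const RuntimeOptions& opts) {
  // Fields that are always needed (illustrative names only).
  std::vector<std::string> fields = {"qc", "nc"};
  if (opts.use_hetfrz_classnuc) {
    // Request the aerosol freezing tendency only when the feature is on,
    // so default runs never depend on a field nobody provides.
    fields.push_back("hetfrz_contact_nucleation_tend");
  }
  return fields;
}
```

With the flag off, the extra field is never requested, which is the behavior the default configuration should keep.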

bartgol

I have a few comments. Mostly: why are lots of unit tests now commented?

components/eamxx/cime_config/namelist_defaults_scream.xml

components/eamxx/src/physics/p3/eamxx_p3_process_interface.cpp

components/eamxx/src/physics/p3/impl/p3_CNT_couple_impl.hpp

bartgol · 2025-01-29T20:30:52Z

components/eamxx/src/physics/p3/impl/p3_CNT_couple_impl.hpp

+  const auto mask = qc_incld > qsmall;
+  switch (Iflag) {
+    case 1:  // cloud droplet immersion freezing
+      ncheti_cnt.set(mask, frzimm*1.0e6/rho /* frzimm input is in [#/cm3] */ , Zero);


In all these "set" calls, how often do you expect the mask to be true/false? If the mask could often be ALL false (not sometimes, often), then you may consider using if statements, to avoid computing the packs for the true case for nothing (e.g., in the 1st line we have to compute frzimm*1e6/rho regardless of whether we need it or not).

Note: this nano-opt makes sense only if you expect mask to be often false. I assume that's not the case, since qsmall is very small. But I don't know how qc_incld is computed, so maybe it's often 0?

singhbalwinder · 2025-01-30

That is a good question. I am not sure about that. @kaizhangpnl or @AaronDonahue might know if mask can often be false or not.
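
For reference, the trade-off discussed above can be sketched with toy types. `Pack`, `Mask`, `any_lane`, and `immersion_tend` are hypothetical stand-ins written for this illustration, not the EKAT pack API; the unit conversion mirrors the `frzimm*1.0e6/rho` expression in the diff.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy fixed-width pack and mask. Hypothetical stand-ins, not the EKAT types.
constexpr std::size_t kPackSize = 4;
using Pack = std::array<double, kPackSize>;
using Mask = std::array<bool, kPackSize>;

// True if at least one lane of the mask is set.
inline bool any_lane(const Mask& m) {
  for (bool b : m) {
    if (b) return true;
  }
  return false;
}

// Immersion-freezing number tendency: frzimm [#/cm3] * 1e6 / rho where
// qc_incld > qsmall, zero elsewhere.
inline Pack immersion_tend(const Pack& qc_incld, const Pack& frzimm,
                           const Pack& rho, double qsmall) {
  Mask mask{};
  for (std::size_t i = 0; i < kPackSize; ++i) mask[i] = qc_incld[i] > qsmall;

  Pack out{};  // value-initialized: all lanes are 0.0
  if (!any_lane(mask)) return out;  // all-false mask: skip the arithmetic

  // Compute the converted value for the whole pack, then select per lane,
  // which is what a masked "set(mask, value, Zero)" amounts to.
  Pack converted{};
  for (std::size_t i = 0; i < kPackSize; ++i) {
    converted[i] = frzimm[i] * 1.0e6 / rho[i];
  }
  for (std::size_t i = 0; i < kPackSize; ++i) {
    out[i] = mask[i] ? converted[i] : 0.0;
  }
  return out;
}
```

The early return only pays off if the mask is all false often enough to outweigh the extra branch, which is exactly the open question in this thread.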

components/eamxx/src/physics/p3/tests/p3_main_unit_tests.cpp

components/eamxx/src/physics/p3/tests/p3_nc_conservation_tests.cpp

bartgol

I have a few comments. Mostly: why are lots of unit tests now commented?

components/eamxx/src/physics/p3/impl/p3_main_impl.hpp

components/eamxx/src/physics/p3/impl/p3_CNT_couple_impl.hpp

components/eamxx/src/physics/p3/impl/p3_main_impl_part2.hpp

components/eamxx/src/physics/p3/p3_functions.hpp

mahf708 · 2025-02-02T18:17:11Z

Requesting reviews from @hassanbeydoun and @brhillman because I know they're very curious about and interested in this part of the p3 code

components/eamxx/src/physics/p3/eamxx_p3_process_interface.cpp

…options

singhbalwinder · 2025-02-05T20:59:45Z

@mahf708 and @bartgol : I have now addressed all the review comments. Please let me know if there is anything still missing. The P3 tests passed on Compy. I will try running the tests on PM-GPUs.

mahf708

Looks good to me. If the tests pass, I support merging.

For the record, I will note that Balwinder, Luca, and I all agree we likely need to restructure the P3 code at some point in the future. This is outside the scope of the current PR, and we will think about finding the time to do it at a later point.

mahf708 · 2025-02-05T21:20:56Z

One of the public CI tests failed with the error below (which I think is related to my comment here: #6947 (comment)):

 FAIL:
!m_add_time_dim
/__w/E3SM/E3SM/components/eamxx/src/share/io/scorpio_output.cpp:477
Error! Time-dependent output field 'hetfrz_contact_nucleation_tend' has not been initialized yet
.

 FAIL:
!m_add_time_dim
/__w/E3SM/E3SM/components/eamxx/src/share/io/scorpio_output.cpp:477
Error! Time-dependent output field 'hetfrz_contact_nucleation_tend' has not been initialized yet
.

 FAIL:
!m_add_time_dim
/__w/E3SM/E3SM/components/eamxx/src/share/io/scorpio_output.cpp:477
Error! Time-dependent output field 'hetfrz_contact_nucleation_tend' has not been initialized yet
.

 FAIL:
!m_add_time_dim
/__w/E3SM/E3SM/components/eamxx/src/share/io/scorpio_output.cpp:477
Error! Time-dependent output field 'hetfrz_contact_nucleation_tend' has not been initialized yet
.

To reproduce locally, this is the test:

ERS_Ld5_P4.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.<MACHINE>_<COMPILER>.eamxx-prod

singhbalwinder · 2025-02-06T00:35:51Z

Thanks, Naser! With your help, I have fixed this test.

mahf708

@singhbalwinder, looks good to me from the standpoint of actual p3 runtime and eamxx runtime, so I am approving.

Note there are likely two sticky problems that someone (not me, because I already gave up, see link below) has to contend with one way or another:

- Your PR is making the p3 unit tests fail (these are p3_tests and p3_sk_tests, found under components/eamxx/p3/tests). The change seems to be in the comparison only, so something you did is changing those. That aligns with my prior experience, but I opted to close the PR rather than figuring it out. See link below.
- A slightly less sticky problem is resolving the MMF2 test. It doesn't fail in the CI here because it didn't run at all, but if you try SMS_Ln3_P4.ne4pg2_oQU480.F2010-MMF2 on some machine with this PR, it will almost certainly fail to build. I can probably help you fix this if you want; I fixed these fails multiple times in the past.

I ran into this precise situation a few weeks ago, with even much simpler code edits, and decided not to bother with it. You can see the discussion here: #6938

bartgol · 2025-02-06T15:39:15Z

> Your PR is making the p3 unit tests fail (these are p3_tests and p3_sk_tests, found under components/eamxx/p3/tests). The change seems to be in the comparison only, so something you did is changing those. That aligns with my prior experience, but I opted to close the PR rather than figuring it out. See link below.

@singhbalwinder I noticed that only p3_tests/p3_sk_tests fail, while all of the XYZ_baseline_cmp tests (where XYZ includes p3) pass. Is it b/c you hard code the new fields to the constant value they have in master?

Also, it's interesting that the tests pass in the FPE build. The main difference between FPE and DBG is that the former uses a pack size of 1. That said, CUDA builds also use a pack size of 1, and yet they fail. I would love it if you dug a bit to see whether there's an explanation for why all standalone tests fail but the FPE build passes. If the fails and passes are expected, then great. If not, I'd hold off the merge.

mahf708 · 2025-02-06T16:10:18Z

Update: the MMF2 test fails with this annoying error:

e3sm.exe: /home/runner/_work/E3SM/E3SM/externals/ekat/src/ekat/kokkos/ekat_subview_utils.hpp:32: ekat::Unmanaged<Kokkos::View<ST*, Kokkos::LayoutRight, Props ...> > ekat::subview(ViewLR<ST**, Props ...>&, int) [with ST = const Pack<double, 1>; Props = {Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, Kokkos::MemoryTraits<0>}; Unmanaged<Kokkos::View<ST*, Kokkos::LayoutRight, Props ...> > = Kokkos::View<const Pack<double, 1>*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, Kokkos::MemoryTraits<1> >; ViewLR<ST**, Props ...> = Kokkos::View<const Pack<double, 1>**, Kokkos::LayoutRight, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, Kokkos::MemoryTraits<0> >]: Assertion `v.data() != nullptr' failed.

Program received signal SIGABRT: Process abort signal.

This type of error is almost certainly to do with missing views in the diagnostic_inputs struct based on my prior experience, but I could be misremembering things. Could be fixed by populating these in the PAM interface.

bartgol · 2025-02-06T16:32:10Z

> Update: the MMF2 test fails with this annoying error:
>
> e3sm.exe: /home/runner/_work/E3SM/E3SM/externals/ekat/src/ekat/kokkos/ekat_subview_utils.hpp:32: ekat::Unmanaged<Kokkos::View<ST*, Kokkos::LayoutRight, Props ...> > ekat::subview(ViewLR<ST**, Props ...>&, int) [with ST = const Pack<double, 1>; Props = {Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, Kokkos::MemoryTraits<0>}; Unmanaged<Kokkos::View<ST*, Kokkos::LayoutRight, Props ...> > = Kokkos::View<const Pack<double, 1>*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, Kokkos::MemoryTraits<1> >; ViewLR<ST**, Props ...> = Kokkos::View<const Pack<double, 1>**, Kokkos::LayoutRight, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace>, Kokkos::MemoryTraits<0> >]: Assertion `v.data() != nullptr' failed.
>
> Program received signal SIGABRT: Process abort signal.
>
> This type of error is almost certainly to do with missing views in the diagnostic_inputs struct based on my prior experience, but I could be misremembering things. Could be fixed by populating these in the PAM interface.

It could be a bad index of a subview. E.g., try to subview (ncols,nlevs) at first index ncols ...
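
The two candidate causes above (a view that was never populated versus an out-of-range first index) can be told apart with separate checks. The sketch below uses plain structs as hypothetical stand-ins for Kokkos views; `View2D`, `View1D`, and `checked_subview` are made up for illustration and are not the `ekat::subview` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Minimal non-owning row-major views. Hypothetical stand-ins for Kokkos views.
struct View2D {
  const double* data;
  std::size_t ncols;
  std::size_t nlevs;
};

struct View1D {
  const double* data;
  std::size_t size;
};

// Return column icol of v, reporting which precondition failed.
inline View1D checked_subview(const View2D& v, std::size_t icol) {
  if (v.data == nullptr) {
    // Corresponds to the "missing view" hypothesis: nothing was ever assigned.
    throw std::invalid_argument("view has no data (never populated?)");
  }
  if (icol >= v.ncols) {
    // Corresponds to the "bad index" hypothesis, e.g. icol == ncols.
    throw std::out_of_range("column index out of range");
  }
  return View1D{v.data + icol * v.nlevs, v.nlevs};
}
```

Distinguishing the two cases in the error message makes it quicker to decide whether the fix belongs in the caller's indexing or in whatever code should have populated the view.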

singhbalwinder · 2025-02-06T17:31:44Z

Both p3_tests/p3_sk_tests passed on my end on Compy (standalone build) and pm-gpu (using test-all-scream). Yesterday I tried with different pack sizes and omp threads on Compy and they all passed. I do not expect any non-MAM4xx tests to fail as the new flag is set to false by default.

In p3 tests, I am using the engine to generate input for the new flag. I thought it was okay to use engine as the asserts will compare the outputs consistently (e.g. output when flag is true on device vs. output when flag is true on host and vice versa). Should I hardwire it to false always?

Once I reproduce it locally, I should be able to debug it. I am currently looking at ways to reproduce it.

mahf708 · 2025-02-06T17:40:37Z

> Both p3_tests/p3_sk_tests passed on my end on Compy (standalone build) and pm-gpu (using test-all-scream). Yesterday I tried with different pack sizes and omp threads on Compy and they all passed. I do not expect any non-MAM4xx tests to fail as the new flag is set to false by default.
>
> In p3 tests, I am using the engine to generate input for the new flag. I thought it was okay to use engine as the asserts will compare the outputs consistently (e.g. output when flag is true on device vs. output when flag is true on host and vice versa). Should I hardwire it to false always?
>
> Once I reproduce it locally, I should be able to debug it. I am currently looking at ways to reproduce it.

The fails are to do with comparison, so you likely need to generate the baselines before this PR and then run the tests with compare enabled. Check out the test-all-scream options

singhbalwinder changed the title ~~Adds aerosols heterogeneous freezing calculations in P3 microphysics~~ EAMxx: Adds aerosols heterogeneous freezing calculations in P3 microphysics Jan 28, 2025

singhbalwinder requested review from mahf708, bartgol and kaizhangpnl January 29, 2025 19:01

mahf708 requested review from AaronDonahue and hassanbeydoun January 29, 2025 19:09

bartgol reviewed Jan 29, 2025

mahf708 reviewed Feb 2, 2025

components/eamxx/src/physics/p3/impl/p3_main_impl.hpp (outdated, resolved)

mahf708 reviewed Feb 2, 2025

components/eamxx/src/physics/p3/impl/p3_CNT_couple_impl.hpp (outdated, resolved)

mahf708 reviewed Feb 2, 2025

components/eamxx/src/physics/p3/impl/p3_main_impl_part2.hpp (resolved)

mahf708 reviewed Feb 2, 2025

components/eamxx/src/physics/p3/p3_functions.hpp (outdated, resolved)

Adds heterogeneous freezing from aerosols in P3 microphysics

ede97b6

mahf708 requested a review from brhillman February 2, 2025 18:16

Passes all tests on Compy

9be599e

singhbalwinder force-pushed the jroverf/singhbalwinder/eamxx/add-het-frz-p3_rebase1_1 branch from 487f9b6 to 9be599e Compare February 3, 2025 19:25

mahf708 reviewed Feb 3, 2025

components/eamxx/src/physics/p3/eamxx_p3_process_interface.cpp (outdated, resolved)

singhbalwinder added 6 commits February 3, 2025 18:08

Turn of this feature by default, all P3 tests pass on PM-GPUs

3a76876

Fixes p3_test by adding new args

dcf1f5f

Removes debug statements

f98ac90

Partially remove use_hetfrz_classnuc from arg lists and use runtime_…

85b6e6e

…options

use_hetfrz_classnuc is not controlled only by runtime_options

dee8f35

Modified CNT function name to more readable name

5808e72

singhbalwinder marked this pull request as ready for review February 5, 2025 20:57

mahf708 approved these changes Feb 5, 2025

Adds logic to exclude aci inputs to p3 when use_hetfrz_classnuc is false

611abe0

odiazib self-requested a review February 6, 2025 00:40

odiazib approved these changes Feb 6, 2025

mahf708 approved these changes Feb 6, 2025

rljacob assigned tcclevenger Feb 6, 2025

rljacob added the EAMxx label (PRs focused on capabilities for EAMxx) Feb 6, 2025
