Add SWE repartitioning for CLM #313

braczka · 2021-11-22T19:35:23Z

Description:

This adds the capability to repartition snow layer mass and dimensional variables. The repartitioning assures that the adjustments applied to the snow layers are consistent (mass and dimensions are conserved) with the adjustment of the total snow water equivalent. This also resolves the weakness where if a snow layer does not exist for all ensemble members, it is not adjusted for all ensemble members. We allow the user to choose the SWE repartitioning approach within the dart_to_clm namelist within SWE_repartition. A value of '0' provides the previous, default behavior of CLM-DART, a value of '1' repartitions snow to all active snow layers, and a value of '2' repartitions snow to the bottom snow layer only.

This SWE_repartition_fix branch began from clm_main at commit 337686f (8/27/21). Concurrent changes of the clm_main branch have occurred during this time by @hkershaw-brown. I made an attempt to merge the SWE_repartition_fix with the current clm_main at commit a741218 (10/19). This led to expected behavior for all variables except for the clm variable T_SOISNO (snow/soil temperature) which was updated to unrealistic large values for certain snow layers. This led to mass balance errors in CLM. At the suggestion of Helen I reverted the changes to SWE_repartition_fix to the last working version before that attempted merge with clm_main. File changes directly related to adding new SWE repartitioning capability as part of this PR include:

~/models/clm/dart_to_clm.f90
~/models/clm/dart_to_clm.rst
~/models/clm/readme.rst
~/models/clm/dart_to_clm.nml
~/models/clm/work/input.nml
~/models/clm/shell_scripts/cesm2.2/assimilate.csh

Fixes issue

This directly addresses github issue #253 , updating snow in CLM.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update

Documentation changes needed?

My change requires a change to the documentation.
I have updated the documentation accordingly.

Tests

Performed perfect data experiment of CLM5-DART across global domain with daily assimilation across 7 day period. See case /glade/u/home/bmraczka/cases/cesm2.2.0/clm5_SWE1_fix as an example of setup including setup scripts of CLM5_setup_assimilation.original and input.nml. Performed this test for all 3 SWE repartitioning options (clm5_SWE0_fix, clm5_SWE1_fix, clm5_SWE2_fix) that demonstrated: 1) the default implementation (no repartitioning) provides identical bitwise behavior to previous working clm_main version, 2) implementation of repartitioning schemes were mathematically sound (mass/dimensions of snow layers conserved), and 3) Update of CLM restart file is compatible with all internal CLM checks.

I intend to perform further scientific testing of repartitioning script across Western United States domain for Utah
NASA CMS collaboration -- mainly intended to explore impact of SWE repartition options upon soil moisture and surface albedo impacts.

Checklist for merging

Updated changelog entry
Documentation updated
Version tag

Additional changes have to occur before this PR is ready to merge.

Testing Datasets

Dataset needed for testing available upon request
Dataset download instructions included
No dataset needed

For an example set of CLM files for testing dart_to_clm functionality see: /glade/scratch/bmraczka/cesm2.2.0/clm5_SWE_PR/run

dart_to_clm_input_file = 'clm2_0001.r.2011-01-02-00000.nc.original'
dart_to_clm_output_file = 'clm5_SWE_PR.clm2_0001.r.2011-01-02-00000.nc.original'
repartition_vhist_file = 'clm5_SWE_PR.clm2_0001.h2.2011-01-02-00000.nc'
repartition_analysis_file = 'clm_analysis_member_0001_d03.2011-01-02-00000.nc'

If a variable has both a _FillValue attribute and a missing_value attribute, make sure the values are identical or we cannot support it with a single DART missing value. Handle _FillValue and missing_value separately because variables may have one or the other but not both. Adding a couple trivial unrelated changes not worthy of their own pull request, removing unused variables, making a hardcoded filename a parameter, that sort of thing.

The ultimate goal for models that support missing values is to read the variables and replace both _FillValue and missing_value items with the DART missing value. As long as r8 /= r4, there is no loss of precision when writing BACK to the original files. If DART is built with r8 = r4, and it reads from a NF90_DOUBLE and writes to a NF90_DOUBLE, there may be trouble ...

The game plan is to ultimately replace the DART missing value with whatever is originally in the file with the missing values in a post-processing 'dart_to_model' program.

Basically, this replaces the use of the progvar structure with the state_structure. There are some bits in the static_init_model that have been moved to subroutines to improve readability.

clm_to_dart.f90 is needed to reliably replace the CLM variables that have missing_value and/or _FillValue attributes but do not actuall use them. This happens in the restart files for the snow layers, for example.

This is a standalone program and replaces bogus values in layers of snow with the _FillValue value. It replaces the values in-place - i.e. it modifies the input file. To use it, you should COPY the clm restart file to the desired input filename. That filename should be the source of the variables used in pmo or filter or ...

Explores what happens to related snow variables when there is 'no snow' according to SNLSNO. T_SOISNO in the layer closest to the ground still has values even if there is no trace of snow (H2OSNO = 0.0). This seems like a problem with CLM logic. frac_sno seems like a reliable indicator to prevent replacing values in the layer thought to be 'empty'.

Using 'simple table' format for namelist which allows specification of newlines in table cells. Described in https://rest-sphinx-memo.readthedocs.io/en/latest/ReST.html#tables

CLM update 1

levsoi and levdcmp are new dimensions ... and carbon/nitrogen vars use them, so we need them.

The get_grid_vertval() routine still needs work to fully support the different vertical coordinates.

Replacing the 'kind' nomenclature when reporting QTY values.

The vector history files have variables declared with the column varying fastest, i.e. float SOIL1N_vr(time, levdcmp, column) ; while the other vector format file (the restart) always has vertical varying fastest, i.e. double H2OSOI_LIQ(column, levtot) ; That makes a difference when creating the 'model_size' metadata arrays.

Using the netcdf_utilities_mod routines makes it a lot clearer.

get_grid_vertval() still needs work, as does dart_to_clm.f90

The H2OSOI variable (QTY_SOIL_MOISTURE) is currently throwing a model_interpolate() error ... the COSMOS_NEUTRON_INTENSITY obs are exercising that FO. SOIL_MOISTURE is a particularly ugly interpolation - will deal with that later. SMINN_vr(time,levsoi,lat,lon) also uses the 'levsoi' coordinates.

Has the variables necessary for several of the observations in the obs_seq.in Avoiding soil moisture for now.

The exit from the case statement was in a funny place and probably not doing what it should have been doing. The code did the right thing, it just checked extra dimensions.

The dart_to_clm.f90 still needs support for the repartitioning of snow.

Merge remote-tracking branch 'upstream/clm_update_2' into clm_update_2

…repartition_fix

Using integer in repartition_swe >0 Co-authored-by: hkershaw-brown <[email protected]>

braczka · 2022-01-11T15:47:51Z

My latest batch of commits are associated with the observation converters mostly used with clm, that never got migrated to the the DART repository during the prior clm_update PRs. These commits include soil moisture and leaf area observations converters coming from NASA (NASA_Earthdata folder) and NSIDC (NSIDC folder). I updated the links to which acquire the raw netcdf or hdf5 data files, tested functionality, and created documentation for each, which was either sparse or non-existent. Changes to HDF5_utilities_mod.f90, obs_seq_utilities.f90, EASE_utilities_mod.f90 were needed to support the obs converters within the NASA_Earthdata and NSIDC folders.

…repartition_fix

converters' page. Minor changes to formating to stop readthedocs complaining

observations/obs_converters/NASA_Earthdata/README.rst

Remove SMAP_L2_to_obs in NASA_Earthdata doc Co-authored-by: hkershaw-brown <[email protected]>

braczka · 2022-01-19T01:15:06Z

@hkershaw-brown I re-tested this current version of the branch for all 3 snow re-partitioning schemes and for the most part it looks pretty good, after I added the allow_missing_clm=.true. option within &filter_nml. I am not sure if it is a concern that it is not bit-by-bit identical to a previous version that I have run for the CLM5_setup_assimilation toy script caserun. It starts out identical for the first couple assimilation time steps, then starts to deviate for all snow adjusted CLM variables. Did we add/remove some precision in the DART scripting with these recent updates? Not sure this is expected behavior, and realize sometimes getting bit-by-bit reproducibility can be messy.......

…repartition_fix

hkershaw-brown · 2022-01-19T16:00:59Z

@hkershaw-brown I re-tested this current version of the branch for all 3 snow re-partitioning schemes and for the most part it looks pretty good, after I added the allow_missing_clm=.true. option within &filter_nml. I am not sure if it is a concern that it is not bit-by-bit identical to a previous version that I have run for the CLM5_setup_assimilation toy script caserun. It starts out identical for the first couple assimilation time steps, then starts to deviate for all snow adjusted CLM variables. Did we add/remove some precision in the DART scripting with these recent updates? Not sure this is expected behavior, and realize sometimes getting bit-by-bit reproducibility can be messy.......

Let's check this out.

I'm not sure if CLM is always bit-for-bit idenical, but DART should be.
There is a small bug fix that is not in the clm-swe_pre-release tag, which is io_filenames_mod.f90 should be checking _FillValue not FillValue.

diff --git a/assimilation_code/modules/io/io_filenames_mod.f90 b/assimilation_code/modules/io/io_filenames_mod.f90
index a3a777a15..632cbb3d6 100644
--- a/assimilation_code/modules/io/io_filenames_mod.f90
+++ b/assimilation_code/modules/io/io_filenames_mod.f90
@@ -661,15 +661,15 @@ if ( get_has_FillValue(   domid, varid) ) then
    select case (get_xtype(domid, varid))
       case (NF90_INT)
          call get_FillValue(domid, varid, spvalINT)
-         call check_attribute_value_int(ncFile, filename, ncVarID, 'FillValue', spvalINT)
+         call check_attribute_value_int(ncFile, filename, ncVarID, '_FillValue', spvalINT)
 
       case (NF90_FLOAT)
          call get_FillValue(domid, varid, spvalR4)
-         call check_attribute_value_r4(ncFile, filename, ncVarID, 'FillValue', spvalR4)
+         call check_attribute_value_r4(ncFile, filename, ncVarID, '_FillValue', spvalR4)
 
       case (NF90_DOUBLE)
          call get_FillValue(domid, varid, spvalR8)
-         call check_attribute_value_r8(ncFile, filename, ncVarID, 'FillValue', spvalR8)
+         call check_attribute_value_r8(ncFile, filename, ncVarID, '_FillValue', spvalR8)

I'm wondering if this is the difference between your old and new runs. But it could be something else.
Do you mind if I grab the files you are running with? I just need which directory they are in Cheyenne.

Cheers,
Helen

braczka · 2022-01-19T16:42:59Z

@hkershaw-brown I re-tested this current version of the branch for all 3 snow re-partitioning schemes and for the most part it looks pretty good, after I added the allow_missing_clm=.true. option within &filter_nml. I am not sure if it is a concern that it is not bit-by-bit identical to a previous version that I have run for the CLM5_setup_assimilation toy script caserun. It starts out identical for the first couple assimilation time steps, then starts to deviate for all snow adjusted CLM variables. Did we add/remove some precision in the DART scripting with these recent updates? Not sure this is expected behavior, and realize sometimes getting bit-by-bit reproducibility can be messy.......

Let's check this out.

I'm not sure if CLM is always bit-for-bit idenical, but DART should be. There is a small bug fix that is not in the clm-swe_pre-release tag, which is io_filenames_mod.f90 should be checking _FillValue not FillValue.
diff --git a/assimilation_code/modules/io/io_filenames_mod.f90 b/assimilation_code/modules/io/io_filenames_mod.f90
index a3a777a15..632cbb3d6 100644
--- a/assimilation_code/modules/io/io_filenames_mod.f90
+++ b/assimilation_code/modules/io/io_filenames_mod.f90
@@ -661,15 +661,15 @@ if ( get_has_FillValue(   domid, varid) ) then
    select case (get_xtype(domid, varid))
       case (NF90_INT)
          call get_FillValue(domid, varid, spvalINT)
-         call check_attribute_value_int(ncFile, filename, ncVarID, 'FillValue', spvalINT)
+         call check_attribute_value_int(ncFile, filename, ncVarID, '_FillValue', spvalINT)
 
       case (NF90_FLOAT)
          call get_FillValue(domid, varid, spvalR4)
-         call check_attribute_value_r4(ncFile, filename, ncVarID, 'FillValue', spvalR4)
+         call check_attribute_value_r4(ncFile, filename, ncVarID, '_FillValue', spvalR4)
 
       case (NF90_DOUBLE)
          call get_FillValue(domid, varid, spvalR8)
-         call check_attribute_value_r8(ncFile, filename, ncVarID, 'FillValue', spvalR8)
+         call check_attribute_value_r8(ncFile, filename, ncVarID, '_FillValue', spvalR8)
I'm wondering if this is the difference between your old and new runs. But it could be something else. Do you mind if I grab the files you are running with? I just need which directory they are in Cheyenne.

Cheers, Helen

Sure -- I am currently running with the files within /glade/work/bmraczka/DART. This should be identical, (nearly identical) to the upstream files in NCAR/DART branch SWE_repartition_fix. The directories for the newer case runs are located at /glade/u/home/bmraczka/cases/cesm2.2.0/clm5_SWE0_final, clm5_SWE1_final and clm5_SWE2_final. The run folders for each case respectively are located within /glade/scratch/bmraczka/cesm2.2.0. I have been comparing the output files against the older cases: clm5_SWE0_fix, clm5_SWE1_fix and clm5_SWE2_fix for benchmarking. These older test cases were generated with the clm-swe_pre-release tag.

I have diffused the netcdf restart files for each companion case (the new and older versions) for variable H2OSOI_ICE. Haven't really checked the other variables, but assume the same loss/gain in precision occurs across all variables. This issue is slightly different than previous benchmark testing with DART-CLM in which there were immediate differences with the first set (first time step) of CLM output files -- suggesting CLM was the culprit. In this situation, the first couple of time steps (with assimilation) are identical (as far as I can tell), and differences start to creep in at the 2nd time step after the DART update. This suggests perhaps DART is introducing the differences....?

braczka · 2022-01-26T01:41:50Z

Odds and ends update:

The 156dca3 commit updates the syntax fix to replace 'nlon' with 'lon' with model_mod.f90. This was not causing a problem with any of my recent assimilation run tests -- but this is indeed a typo as brought to our attention from the E3SM-DART user today.

In reference to the reproducibility of SWE_repartition_fix, the slight differences in clm variables are pervasive -- not just the snow related variables that start to deviate from the tag clm-swe_pre-release, but all other variables that I tested as well (e.g. biomass). The deviation in behavior always occurs for portions of the variable state that are updated by DART, suggesting a difference in the DART code, or the way it was compiled. Below are my DART compiler options on Cheyenne:

MPIFC = mpif90
MPILD = mpif90
FC = ifort
LD = ifort

INCS = -I$(NETCDF)/include
LIBS = -L$(NETCDF)/lib -lnetcdff -lnetcdf
FFLAGS = -O -assume buffered_io $(INCS)
LDFLAGS = $(FFLAGS) $(LIBS)

I am going to re-run the test clm5_setup_assimilation with the tag clm-swe_pre-release one more time to see if I can match the behavior. I also plan to re-run the CLM5-DART Western US benchmark simulation with SWE_reparition_fix. This is probably overkill, but won't take too long to check.

braczka · 2022-02-02T22:35:15Z

The CLM5-DART Western US benchmark simulation with SWE_repartition_fix has been completed and is nearly identical to the previous benchmark simulation using the tag clm-swe_pre-release. This is ready to merge as far as I am concerned and am pulling it off draft.

Out of curiosity, I performed some tests running CLM-DART assimilations, with CLM in 'branch' mode (bit by bit reproducible for the start of CLM runs) instead of 'hybrid' mode which is not bit by bit. This didn't explain some of the small differences I was seeing in reproducibility, but doesn't effect the readiness of this PR as far as I am concerned.

assimilation_code/modules/observations/obs_sequence_mod.f90

hkershaw-brown

Approved.

I'll bring this up to date with main (it is just the inflation namelist bug-fix)and bump the version before merging.

timhoar added 30 commits May 20, 2021 09:52

proper select indentation

c3bb89e

Simpler missing _FillValue handling

07a8322

The game plan is to ultimately replace the DART missing value with whatever is originally in the file with the missing values in a post-processing 'dart_to_model' program.

fixing typos in comments

3610311

removing useless comment with bad spelling

29dd1dd

Updated with version from cesm_clm branch

5f9835f

Basically, this replaces the use of the progvar structure with the state_structure. There are some bits in the static_init_model that have been moved to subroutines to improve readability.

Starting to revive clm_to_dart.f90 - not done.

d1adad6

clm_to_dart.f90 is needed to reliably replace the CLM variables that have missing_value and/or _FillValue attributes but do not actuall use them. This happens in the restart files for the snow layers, for example.

exploring trace amounts of snow ... do not use.

703638e

clm_to_dart ready and documented.

bc81ee6

fixing formatting for table

c8e1b86

Using 'simple table' format for namelist which allows specification of newlines in table cells. Described in https://rest-sphinx-memo.readthedocs.io/en/latest/ReST.html#tables

Merge pull request #229 from NCAR/clm_update_1

77e5711

CLM update 1

broken version ... supporting levsoi dimension

22c6a4d

levsoi and levdcmp are new dimensions ... and carbon/nitrogen vars use them, so we need them.

restoring SIF converter file

7e82a6b

variables with levsoi and levdcmp read OK

42ba951

The get_grid_vertval() routine still needs work to fully support the different vertical coordinates.

more accurate names in state structure dump

a3b238c

Replacing the 'kind' nomenclature when reporting QTY values.

cleaned up nc_write_model_atts

87fe86b

Using the netcdf_utilities_mod routines makes it a lot clearer.

exploits netcdf_utilities_mod to improve clarity

1cbf658

get_grid_vertval() still needs work, as does dart_to_clm.f90

good namelist for testing

d69f611

Has the variables necessary for several of the observations in the obs_seq.in Avoiding soil moisture for now.

changed related() code that determines get_close_state

3fcc813

The exit from the case statement was in a funny place and probably not doing what it should have been doing. The code did the right thing, it just checked extra dimensions.

removing svn keyword baggage

bd00c2a

Replacing indeterminate values only.

eb87aa9

The dart_to_clm.f90 still needs support for the repartitioning of snow.

Adding remark about pert_model_state capability

5a9f609

adding example namelist for dart_to_clm.f90

494f0b8

good testing version

a2179c6

getting dart_to_clm capability from cheyenne repo

8cde32d

Merge remote-tracking branch 'upstream/clm_update_2' into clm_update_2

braczka and others added 4 commits January 10, 2022 16:43

Including flux obs converter shell scripts

6d01b70

Merge remote-tracking branch 'upstream/SWE_repartition_fix' into SWE_…

41ca2d7

…repartition_fix

Update models/clm/dart_to_clm.f90

33b649a

Using integer in repartition_swe >0 Co-authored-by: hkershaw-brown <[email protected]>

Update models/clm/dart_to_clm.f90

b42a7b5

Using integer in repartition_swe >0 Co-authored-by: hkershaw-brown <[email protected]>

braczka and others added 5 commits January 12, 2022 14:41

NASA_Earthdata docs formatting fixi

275648b

Adding docs on instrument/rep error

62d785f

Reverting back to _FillValue in clm_to_dart.f90

abc8e16

Merge remote-tracking branch 'upstream/SWE_repartition_fix' into SWE_…

93b5843

…repartition_fix

Added NASA_Earthdata and NSIDC to table of contents and 'available

b48f45d

converters' page. Minor changes to formating to stop readthedocs complaining

hkershaw-brown reviewed Jan 18, 2022

View reviewed changes

observations/obs_converters/NASA_Earthdata/README.rst Outdated Show resolved Hide resolved

Update observations/obs_converters/NASA_Earthdata/README.rst

122caa2

Remove SMAP_L2_to_obs in NASA_Earthdata doc Co-authored-by: hkershaw-brown <[email protected]>

braczka added 2 commits January 19, 2022 08:39

Add allow_missing_clm in filter nml

c849a72

Merge remote-tracking branch 'upstream/SWE_repartition_fix' into SWE_…

30cbbbe

…repartition_fix

clm model_mod syntax error with nlon

156dca3

braczka marked this pull request as ready for review February 2, 2022 22:35

hkershaw-brown reviewed Feb 3, 2022

View reviewed changes

assimilation_code/modules/observations/obs_sequence_mod.f90 Outdated Show resolved Hide resolved

revert white space change

9a4431e

hkershaw-brown approved these changes Feb 3, 2022

View reviewed changes

hkershaw-brown added 2 commits February 3, 2022 13:57

Merge branch 'main' into SWE_repartition_fix

cbd5954

Changelog and bump version for release

9e65ce9

hkershaw-brown merged commit c4de64f into main Feb 3, 2022

This was referenced Feb 3, 2022

Updating snow in CLM #253

Closed

Feature request: support CLM on public branch #224

Closed

hkershaw-brown deleted the SWE_repartition_fix branch February 4, 2022 20:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add SWE repartitioning for CLM #313

Add SWE repartitioning for CLM #313

braczka commented Nov 22, 2021 •

edited by hkershaw-brown

Loading

braczka commented Jan 11, 2022

braczka commented Jan 19, 2022

hkershaw-brown commented Jan 19, 2022

braczka commented Jan 19, 2022

braczka commented Jan 26, 2022

braczka commented Feb 2, 2022

hkershaw-brown left a comment

Add SWE repartitioning for CLM #313

Add SWE repartitioning for CLM #313

Conversation

braczka commented Nov 22, 2021 • edited by hkershaw-brown Loading

Description:

Fixes issue

Types of changes

Documentation changes needed?

Tests

Checklist for merging

Testing Datasets

braczka commented Jan 11, 2022

braczka commented Jan 19, 2022

hkershaw-brown commented Jan 19, 2022

braczka commented Jan 19, 2022

braczka commented Jan 26, 2022

braczka commented Feb 2, 2022

hkershaw-brown left a comment

Choose a reason for hiding this comment

braczka commented Nov 22, 2021 •

edited by hkershaw-brown

Loading