Adding Metop-SG-a1 Microwave Sounder (MWS) in GSI #844

Open
wants to merge 18 commits into
base: develop
from

Conversation

jianjunj

@jianjunj jianjunj commented Mar 18, 2025

Set up the procedure to assimilate future Metop-SG-a1 MWS radiance data. The Metop-SG-a1 satellite will be launched in fall 2025. MWS is similar to the ATMS instrument except that its radiance data have 24 channels. Therefore, the observation processing steps for MWS, such as reading, data selection, and quality control, mimic those for ATMS. The significant changes include a new reader, "read_mws.f90", and a new subroutine, "mws_spatial_average_mod.f90", to perform spatial averaging. A new file, "mws_beamwidth.txt", is required for the spatial averaging and is added in GSI-fix. A related issue is NOAA-EMC/GDASApp#1455.

It depends on a branch:
https://github.com/jianjunj/GSI-fix/tree/feature/initial_metop-sg-a1_mws

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?
I tested these updates with proxy MWS data on Orion:
/work2/noaa/da/jianjun/obs_data/gdas.20240219/00/atmos//gdas.t00z.mws.tm00.bufr_d
Test results are in /work2/noaa/da/jianjun/ufoeval/GSIobserver_new/hercules/2024021900
Note that MWS data are assimilated passively, without thinning or bias correction, in my test. A diagnostic file is saved as
/work2/noaa/da/jianjun/ufoeval/GSIobserver_new/hercules/2024021900/diags/diag_mws_metop-sg-a1_ges.2024021900.nc4

Provide instructions so we can reproduce

  1. The proxy BUFR data need to be copied over. The date is 9/12/2007. However, any date before 01/01/2025 is ignored by the new mws reader.
  2. Need CRTM coefficient data for MWS observations:
    /work2/noaa/da/jianjun/crtm/crtm.coefficients/mws_metop-sg-a1.TauCoeff.bin
    /work2/noaa/da/jianjun/crtm/crtm.coefficients/mws_metop-sg-a1.SpcCoeff.bin
  3. Need bias correction coefficients in
    /work2/noaa/da/jianjun/restarts/gdas.20240218/18/analysis/atmos/gdas.t18z.abias
    /work2/noaa/da/jianjun/restarts/gdas.20240218/18/analysis/atmos/gdas.t18z.abias_pc
  4. Check out the branch https://github.com/jianjunj/GSI-fix/tree/feature/initial_metop-sg-a1_mws to get the updated
    "cloudy_radiance_info.txt", "global_scaninfo.txt", and "global_satinfo.txt", and the new file "mws_beamwidth.txt".

Please also list any relevant details for your test configuration
A test was conducted to run GSI only in /work2/noaa/da/jianjun/git/GSI_forked/ush/run_observer/ on Orion. Specific changes can be seen by:
xxdiff /work2/noaa/da/jianjun/git/GSI_forked/ush/run_observer/gsi_observer.sh.0 /work2/noaa/da/jianjun/git/GSI_forked/ush/run_observer/gsi_observer.sh

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

All ctests passed without any proxy MWS data:
ctest1.log:1/1 Test#1: global_4denvar ................... Passed 3934.27 sec
ctest2.log:1/1 Test#2: rtma ............................. Passed 2768.58 sec
ctest3.log:1/1 Test#3: rrfs_3denvar_rdasens ............. Passed 1391.24 sec
ctest4.log:1/1 Test#4: hafs_4denvar_glbens .............. Passed 3446.39 sec
ctest5.log:1/1 Test#5: hafs_3denvar_hybens .............. Passed 3389.92 sec
ctest6.log:1/1 Test#6: global_enkf ...................... Passed 1932.70 sec

@jianjunj jianjunj self-assigned this Mar 18, 2025
@jianjunj jianjunj added enhancement New feature or request zero-diff The changes in this PR are verified to be zero-diff with the target branch labels Mar 18, 2025
@jianjunj jianjunj removed the zero-diff The changes in this PR are verified to be zero-diff with the target branch label Mar 19, 2025
@jianjunj

I cannot claim zero-diff, though I don't expect this PR to cause any differences, since no proxy or real MWS data are processed in the ctests.

@RussTreadon-NOAA

@jianjunj , on which machine were ctests run?

@jianjunj

@RussTreadon-NOAA It is Orion.

@RussTreadon-NOAA RussTreadon-NOAA self-requested a review March 20, 2025 12:44
@RussTreadon-NOAA RussTreadon-NOAA self-assigned this Mar 20, 2025

@RussTreadon-NOAA RussTreadon-NOAA left a comment


Some comments

  • Current suite of ctests do not cover this data type
  • Exercising the functionality added by this PR needs updates to GSI-fix. When will the required fix files be added?
  • Advise against adding a date/time check in read_obs.F90

@jianjunj

jianjunj commented Mar 20, 2025

@RussTreadon-NOAA

Current suite of ctests do not cover this data type

Right. There are no real data to test with. The satellite is planned to launch this fall.

@RussTreadon-NOAA

Hera ctests

Install jianjunj:feature/initial_metop-sg-a1_mws at 405b00c and develop at 33012bc. Run ctests with following results

Test project /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr844/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  2483.64 sec
2/6 Test #6: global_enkf ......................   Passed  4307.89 sec
3/6 Test #1: global_4denvar ...................   Passed  6250.88 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  8486.52 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  8643.46 sec
6/6 Test #2: rtma .............................   Passed  9560.97 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 9561.00 sec

All tests pass. This is an expected result because Metop-SG-a1 MWS dump files are not available for the ctest cases.

@RussTreadon-NOAA

Hercules ctests
Install jianjunj:feature/initial_metop-sg-a1_mws at 405b00c and develop at 33012bc. Run ctests with following results

Test project /work/noaa/da/rtreadon/git/gsi/pr844/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............***Failed  6428.51 sec
2/6 Test #6: global_enkf ......................   Passed  18496.88 sec
3/6 Test #5: hafs_3denvar_hybens ..............   Passed  20108.16 sec
4/6 Test #4: hafs_4denvar_glbens ..............   Passed  20113.52 sec
5/6 Test #2: rtma .............................   Passed  20406.74 sec
6/6 Test #1: global_4denvar ...................   Passed  21425.99 sec

83% tests passed, 1 tests failed out of 6

Total Test time (real) = 21425.99 sec

The following tests FAILED:
          3 - rrfs_3denvar_rdasens (Failed)

The rrfs_3denvar_rdasens failure is due to

The memory for rrfs_3denvar_rdasens_loproc_updat is 1112824 KBs.  This has exceeded maximum allowable memory of 1088661 KBs,
resulting in Failure memthresh of the regression test.

This is not a fatal fail. The memory check is not a robust test since it only reports task 0 memory usage.
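For scale, the size of the memory overrun can be worked out directly from the two numbers in the message above. A minimal sketch in Python; the variable names are made up for illustration, and how the regression suite derives the allowable limit is not covered here.

```python
# Values copied from the regression-test message above, in KB.
used_kb = 1112824   # task 0 memory reported for rrfs_3denvar_rdasens_loproc_updat
limit_kb = 1088661  # maximum allowable memory reported by the test

# Absolute and relative overrun of the reported limit.
excess_kb = used_kb - limit_kb
excess_pct = 100.0 * excess_kb / limit_kb

print(f"excess: {excess_kb} KB ({excess_pct:.2f}% over the limit)")
```

The overrun is 24163 KB, roughly 2.2% above the limit, which is consistent with treating this as a non-fatal failure.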

@RussTreadon-NOAA

Hercules rrfs_3denvar_rdasens

Update and recompile develop at b25de8c and jianjunj:feature/initial_metop-sg-a1_mws at fbeea03. Rerun rrfs_3denvar_rdasens test with following results

Test project /work/noaa/da/rtreadon/git/gsi/pr844/build
    Start 3: rrfs_3denvar_rdasens
1/1 Test #3: rrfs_3denvar_rdasens .............   Passed  549.59 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) = 549.61 sec

This Passed result for this and other tests is expected. This PR does not alter existing functionality. It adds the ability to process a new observation type. This observation type is not yet routinely available.

@RussTreadon-NOAA

WCOSS2 ctests

Install develop at b25de8c and jianjunj:feature/initial_metop-sg-a1_mws at fbeea03 on Dogwood. Run ctests with the following results

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr844/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............***Failed  731.58 sec
2/6 Test #6: global_enkf ......................   Passed  858.72 sec
3/6 Test #2: rtma .............................   Passed  1035.02 sec
4/6 Test #5: hafs_3denvar_hybens ..............***Failed  1221.05 sec
5/6 Test #4: hafs_4denvar_glbens ..............***Failed  1342.26 sec
6/6 Test #1: global_4denvar ...................***Failed  1804.11 sec

33% tests passed, 4 tests failed out of 6

Total Test time (real) = 1804.34 sec

The following tests FAILED:
          1 - global_4denvar (Failed)
          3 - rrfs_3denvar_rdasens (Failed)
          4 - hafs_4denvar_glbens (Failed)
          5 - hafs_3denvar_hybens (Failed)
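When comparing runs across machines, the per-test result lines can be pulled out of a ctest log with a short script. A minimal sketch in Python, assuming the line layout shown in this thread; the function name `failed_tests` and the regular expression are illustrative and are not part of GSI or CTest.

```python
import re

# Matches lines such as:
#   1/6 Test #3: rrfs_3denvar_rdasens .............***Failed  731.58 sec
CTEST_LINE = re.compile(
    r"^\s*\d+/\d+\s+Test\s+#(?P<num>\d+):\s+(?P<name>\S+)\s+\.+\s*"
    r"(?P<status>\*{3}Failed|Passed)\s+(?P<sec>[\d.]+)\s+sec"
)

def failed_tests(log_text):
    """Return the names of failed tests, in the order they appear in the log."""
    failed = []
    for line in log_text.splitlines():
        match = CTEST_LINE.match(line)
        if match and match.group("status").endswith("Failed"):
            failed.append(match.group("name"))
    return failed

# Result lines copied from the WCOSS2 (Dogwood) run above.
wcoss2_log = """\
1/6 Test #3: rrfs_3denvar_rdasens .............***Failed  731.58 sec
2/6 Test #6: global_enkf ......................   Passed  858.72 sec
3/6 Test #2: rtma .............................   Passed  1035.02 sec
4/6 Test #5: hafs_3denvar_hybens ..............***Failed  1221.05 sec
5/6 Test #4: hafs_4denvar_glbens ..............***Failed  1342.26 sec
6/6 Test #1: global_4denvar ...................***Failed  1804.11 sec
"""

print(failed_tests(wcoss2_log))
```

Lines that do not match the pattern (headers, summaries) are skipped, so the whole log can be passed in unchanged.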

Each of the tests failed due to

The results (penalty) between the two runs are nonreproducible,
thus the regression test has Failed on cost for hafs_3denvar_hybens_loproc_updat and hafs_3denvar_hybens_loproc_contrl analyses.

Comparison of the updat (jianjunj:feature/initial_metop-sg-a1_mws) and contrl (develop) runs shows that while the initial penalty and gradient norm are identical to the printed 19 digits, the first step size differs in the 15th or 16th digit. This difference accumulates so that by the end of the second outer loop only the first few (2 to 5) digits in the total penalty are identical. For rrfs_3denvar_rdasens the final total penalties agree to 15 digits.
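Counting how many leading digits two printed values share can be automated. A minimal sketch in Python; the helper name `matching_digits` is hypothetical, and the two sample values are invented for illustration, not taken from the GSI logs.

```python
from decimal import Decimal

def matching_digits(a, b):
    """Count the leading significant digits shared by two printed numbers.

    Inputs are strings such as "1.2345E-03". A Fortran "D" exponent would
    need to be converted to "E" first. Values with different sign or
    magnitude are reported as sharing no digits.
    """
    da, db = Decimal(a), Decimal(b)
    ta, tb = da.as_tuple(), db.as_tuple()
    if ta.sign != tb.sign or da.adjusted() != db.adjusted():
        return 0
    count = 0
    for x, y in zip(ta.digits, tb.digits):
        if x != y:
            break
        count += 1
    return count

# Made-up values that first differ in the 16th significant digit.
updat_step = "1.234567890123456789E-03"
contrl_step = "1.234567890123457012E-03"

print(matching_digits(updat_step, contrl_step))
```

This is a digit-by-digit comparison of the printed text, so two values that straddle a power of ten (for example 0.9999 and 1.0000) are reported as sharing no digits even though they are close.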

It is interesting that results are reproducible on Hera, Hercules, and Orion but differ on Dogwood. The RDHPCS builds use newer versions of the intel compiler and spack-stack installed libraries. The WCOSS2 build uses an older intel compiler (19.1.3.304) with hpc-stack installed libraries.

We need to better understand the observed WCOSS2 differences. The changes in this PR should not alter GSI results.

@RussTreadon-NOAA

RussTreadon-NOAA commented Mar 24, 2025

WCOSS2 global_4denvar test

As a test, replace setuprad.f90 from jianjunj:feature/initial_metop-sg-a1_mws with develop setuprad.f90.

Make the following changes to the develop copy of setuprad.f90

  • add logical mws
  • set mws = .false.
  • add mws to call calc_clw argument list

Rebuild gsi.x and run ctest -R global_4denvar. The ctest passed.

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/test/build
    Start 1: global_4denvar
1/1 Test #1: global_4denvar ...................   Passed  1803.31 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) = 1803.33 sec

We should take a closer look at the changes this PR makes to setuprad.f90.

@RussTreadon-NOAA

RussTreadon-NOAA commented Mar 24, 2025

WCOSS2 omp test

setuprad.f90 contains four !$omp parallel do loops. Comment out the four !$omp parallel do directives in setuprad.f90 in both working copies of develop and initial_metop-sg-a1_mws. Recompile both working copies and rerun ctests on Dogwood.

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/test/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  733.70 sec
2/6 Test #6: global_enkf ......................   Passed  859.61 sec
3/6 Test #2: rtma .............................   Passed  1035.69 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  1223.31 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  1342.58 sec
6/6 Test #1: global_4denvar ...................   Passed  1924.15 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) = 1924.17 sec

Interesting result. Why / how does deactivating !$omp directives have this impact on WCOSS2?
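As general background only, not a diagnosis of the WCOSS2 behaviour: floating-point addition is not associative, so any code path that combines the same values in a different order can change the last digits of a result. Whether the loops in setuprad.f90 are affected by this is not established in this thread. A minimal sketch in Python with made-up values:

```python
import math

# Made-up values chosen so that rounding is visible in double precision.
values = [1.0e16, 1.0, -1.0e16, 1.0]

# Left-to-right accumulation, as a single thread would do it.
serial = 0.0
for v in values:
    serial += v

# The same values summed as two partial sums that are then combined,
# mimicking a loop split across two workers.
half = len(values) // 2
chunked = sum(values[:half]) + sum(values[half:])

# Correctly rounded sum for reference.
exact = math.fsum(values)

print(serial, chunked, exact)
```

The three results differ (1.0, 0.0, and 2.0) even though the inputs are identical, which is why small last-digit differences in an early step size are plausible whenever the order of operations changes.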

Labels
enhancement New feature or request