bug: CICE get_expected_agreg_over_grid giving > 100% #833

Open
hkershaw-brown opened this issue Mar 5, 2025 · 13 comments
Assignees
Labels
cice Sea Ice models

Comments

@hkershaw-brown
Member

🐛 🧊

User reported

Describe the bug

  1. Run CICE with QCEFF, with bounds on sea ice concentration.
  2. What was the expected outcome? Run to completion, bounds respected.
  3. What actually happened? The CICE forward operator returns an out-of-bounds value, causing the QCEFF filter to die with "Largest ensemble member greater than upper bound".

Error Message

"Largest ensemble member greater than upper bound"

Which model(s) are you working with?

CESM CICE5(6?)

Version of DART

Which version of DART are you using?
You can find the version using git describe --tags
Will reproduce, believe it is v11

Have you modified the DART code?

No

Workaround: force the forward operator result to be <100%.

Build information

Please describe:

  1. Derecho
  2. unknown, will confirm if it makes a difference.
@hkershaw-brown
Member Author

See also #819 #681

@hkershaw-brown hkershaw-brown self-assigned this Mar 6, 2025
@hkershaw-brown
Member Author

hkershaw-brown commented Mar 6, 2025

restarts: /glade/derecho/scratch/hkershaw/DART/Tickets/CICE

reproducer in /glade/derecho/scratch/hkershaw/DART/Tickets/CICE/run

@hkershaw-brown
Member Author

hkershaw-brown commented Mar 6, 2025

ifx -O -assume buffered_io -I/glade/u/apps/derecho/24.12/spack/opt/spack/netcdf/4.9.2/oneapi/2025.0.3/cm5e/include  -c	/glade/derecho/scratch/hkershaw/DART/Tickets/CICE/DART/models/cice/cice_to_dart.f90
/glade/derecho/scratch/hkershaw/DART/Tickets/CICE/DART/models/cice/cice_to_dart.f90(124): warning #6843: A dummy argument with an explicit INTENT(OUT) declaration is not given an explicit value.   [TABLE]
subroutine verify_parameters(parameters, ngood,table)
-----------------------------------------------^

@hkershaw-brown
Member Author

reproducer in /glade/derecho/scratch/hkershaw/DART/Tickets/CICE/run

fix_bound_violations = .true.,

Before observation assimilation TIME: 2025/03/06 13:40:17
 ERROR FROM:
  source : probit_transform_mod.f90
  routine: fix_bounds
  message:  Egregious upper bound violoation first check(see code)   1.01133114890742        1.00000000000000
 
MPICH ERROR [Rank 55] [job id bf1d451c-e89f-41b3-adf1-710fecc3105e] [Thu Mar  6 13:40:17 2025] [dec0222] - Abort(99) (rank 55 in comm 496): application called MPI_Abort(comm=0x84000001, 99) - process 55

fix_bound_violations = .false.,

Before observation assimilation TIME: 2025/03/06 13:47:20
 ERROR FROM:
  source : bnrh_distribution_mod.f90
  routine: bnrh_cdf
  message:  Largest ensemble member greater than upper bound   1.01133114890742        1.00000000000000

@hkershaw-brown
Member Author

I don't think the matrix maths in this module is coping with the quad locations.
The run enters this code, and the sum of the interpolated categories ends up > 1.0:

DART/models/cice/model_mod.f90

Lines 1732 to 1741 in fd3a813

!********
! Avoid exceeding maxima or minima as stopgap for poles problem
! When doing bilinear interpolation in quadrangle, can get interpolated
! values that are outside the range of the corner values
if(expected_obs > maxval(p)) then
   expected_obs = maxval(p)
else if(expected_obs < minval(p)) then
   expected_obs = minval(p)
endif
!********
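
To illustrate how an interpolated value can land outside the range of the corner values, here is a minimal Python sketch. It assumes the interpolation fits a surface f = a + b*x + c*y + d*x*y through the four corners by solving a 4x4 system (my reading of the "matrix maths" remark; this thread does not confirm the exact form). The quad, the corner values and the names `solve_linear`, `fit_interp` and `clamp_to_corners` are made up for illustration and are not DART code.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination in exact arithmetic.

    Raises StopIteration if the matrix is singular (no nonzero pivot found).
    """
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def fit_interp(xs, ys, p, x, y):
    """Fit a + b*x + c*y + d*x*y through the corners, then evaluate at (x, y)."""
    m = [[1, xc, yc, xc * yc] for xc, yc in zip(xs, ys)]
    a, b, c, d = solve_linear(m, p)
    return a + b * x + c * y + d * x * y


def clamp_to_corners(value, p):
    """Same idea as the Fortran stopgap: limit the result to [min(p), max(p)]."""
    return min(max(value, min(p)), max(p))


# Hypothetical skewed quad (a rhombus not aligned with the axes).
xs = [0, 2, 3, 1]
ys = [0, 1, 3, 2]
corner_values = [0, 1, 1, 1]

# (2, 2) lies on the diagonal from (0, 0) to (3, 3), so it is inside the quad.
raw = fit_interp(xs, ys, corner_values, 2, 2)
clamped = clamp_to_corners(raw, corner_values)
print(float(raw), float(clamped))
```

For this quad the fitted surface evaluates to 16/15 at (2, 2), above the largest corner value of 1, and the clamp pulls it back to 1. Note that the clamp only bounds a single interpolated field by its own corners.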

@hkershaw-brown
Member Author

Image

Image

@jlaucar
Contributor

jlaucar commented Mar 10, 2025 via email

@hkershaw-brown
Member Author

Here you go. The red point is the obs.
Observation location
lon = 255.0703125;
lat = 74.473571777343764;

Image

@jlaucar
Contributor

jlaucar commented Mar 10, 2025 via email

@hkershaw-brown
Member Author

No, this is a single category.
The interpolation is done for each category, then summed; it would be better the other way round, I think.
The plots were just to check that the quad was actually a quad with an observation inside it, and to compare with interpolation in Matlab.

There is a little test code at
main...ice-mkl
which just does a call to quad_bilinear_interp if you want to play with it. I have not convinced myself whether this is a logic bug, a precision problem, or something else.

@jlaucar
Contributor

jlaucar commented Mar 10, 2025 via email

@hkershaw-brown
Member Author

um yeah, might have to be tomorrow though. Here is the print from filter for the branch ice-mkl. The first four values on each row are p, the corner values.

F : 4.222093679422526E-002 2.939444349332336E-002 2.361927128232367E-002 2.457275269829678E-002 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
F : 4.180024009822765E-002 2.903021773063639E-002 2.316891853011881E-002 2.401653309655898E-002 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
F : 3.858482016407528E-002 2.551216863383185E-002 2.350450598493552E-002 2.457871281778398E-002 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
T : 0.272483288432262 0.214638023603198 0.225436441741235 0.272507576727131 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
T : 0.271966157127309 0.214096434839077 0.224591088988924 0.271393294103321 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
F : 0.262230283240149 0.195409872700680 0.225106340451221 0.272343026861034 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
F : 0.343727086081390 0.328381548006076 0.369470679835526 0.400497351327027 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
F : 0.344223545805018 0.328813930395614 0.370007031562336 0.401365247079376 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
T : 0.337814595281467 0.306467506379775 0.369622543270980 0.400559520434266 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
T : 0.156417931882716 0.201045113239156 0.209389781289019 0.174274598612392 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
T : 0.156709120371198 0.201421260268884 0.210040447346788 0.175006162157788 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
T : 0.156804973000336 0.189916892302121 0.209536806173645 0.174247672350704 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
T : 0.112230109685071 0.159225785188454 0.156279842118735 0.110656434344579 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
T : 0.112268458566014 0.159329115972507 0.156377730428623 0.110734946388902 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753
F : 0.112268626620001 0.150076247320983 0.156273454463897 0.110631803868448 : 255.070312500000 74.4735717773438 : 253.978338312172 254.698325218660 256.152459376863 255.464594303337 : 74.4775053981321 74.2803767364551 74.6534788353970 74.8496146897753

I've been looking at them in the debugger; you can steal the run from
/glade/derecho/scratch/hkershaw/DART/Tickets/CICE/run
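
As a rough check on the "interpolate each category, then sum" ordering, the sketch below compares two upper limits computed from the printout above. It assumes that each group of three rows is one thickness category for three ensemble members, and takes the first row of each group as the five categories of a single member; that grouping is a guess from the pattern of the values and is not stated in the thread. It also assumes the min/max clamp is applied to each category separately before summing.

```python
# Corner values p for five categories of one ensemble member, copied from
# the first row of each group of three in the printout.
# ASSUMPTION: each group of three rows is one category, three members.
categories = [
    [4.222093679422526e-02, 2.939444349332336e-02,
     2.361927128232367e-02, 2.457275269829678e-02],
    [0.272483288432262, 0.214638023603198,
     0.225436441741235, 0.272507576727131],
    [0.343727086081390, 0.328381548006076,
     0.369470679835526, 0.400497351327027],
    [0.156417931882716, 0.201045113239156,
     0.209389781289019, 0.174274598612392],
    [0.112230109685071, 0.159225785188454,
     0.156279842118735, 0.110656434344579],
]

# Aggregate concentration at each of the four corners (sum over categories).
corner_totals = [sum(cat[i] for cat in categories) for i in range(4)]

# Sum first, then clamp: the result cannot exceed the largest corner total.
aggregate_cap = max(corner_totals)

# Clamp each category, then sum: the limit is the sum of per-category maxima.
per_category_cap = sum(max(cat) for cat in categories)

print(f"largest corner aggregate:   {aggregate_cap:.4f}")
print(f"sum of per-category maxima: {per_category_cap:.4f}")
```

With these numbers the first limit is about 0.984 and the second about 1.084, so a per-category clamp alone leaves room for an aggregate above 1.0 even though every corner aggregate is below 1.0. The 1.0113 in the error message falls inside that gap, though this sketch does not show which member produced it.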

@nancycollins
Collaborator

nancycollins commented Mar 10, 2025

i believe this is real. we had a roms user report bad interpolation values a long time ago. the quad interpolation module was based on the pop quad interp code, which looks a lot like the cice interp code. when the quad sides were not aligned closely to the lat/lon grid (and this one looks like it's not), i could get interp values that were outside of the range of the corners. by rotating the quad first, the values were within range. the quad routines have a namelist option to rotate the quad to align closer to one of the lat or lon axes before interpolating (do_rotate = .true.) and it removed this problem. i see it has a default of false right now, but that might make an easy test if you enable it. (or just drop the rotate code into the cice interp code to test it - look for the do_rotate code block.)

edit: one way i found this was to interpolate a dense grid of test points inside the original data grid, and you could see clear discontinuities across the data quad boundaries. (the interpolation results were not smooth.)
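
A cut-down version of that dense-grid check can be sketched in Python by sampling only the edge shared by two neighbouring quads and comparing the value each quad's fit gives there. This assumes the same a + b*x + c*y + d*x*y corner fit as a stand-in for the real interpolation code; the two quads, their corner values and the names `fit_coeffs` and `evaluate` are hypothetical.

```python
from fractions import Fraction


def fit_coeffs(corners, values):
    """Fit f = a + b*x + c*y + d*x*y through four corners (exact arithmetic)."""
    rows = [[Fraction(1), Fraction(x), Fraction(y), Fraction(x) * y, Fraction(v)]
            for (x, y), v in zip(corners, values)]
    for col in range(4):
        pivot = next(r for r in range(col, 4) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [v / scale for v in rows[col]]
        for r in range(4):
            if r != col:
                factor = rows[r][col]
                rows[r] = [vr - factor * vc
                           for vr, vc in zip(rows[r], rows[col])]
    return [row[4] for row in rows]


def evaluate(coeffs, x, y):
    a, b, c, d = coeffs
    return a + b * x + c * y + d * x * y


# Two hypothetical skewed quads sharing the edge from (2, 1) to (3, 3).
left_quad = [(0, 0), (2, 1), (3, 3), (1, 2)]
right_quad = [(2, 1), (4, 2), (5, 4), (3, 3)]
left_fit = fit_coeffs(left_quad, [0, 1, 1, 1])
right_fit = fit_coeffs(right_quad, [1, 1, 1, 1])

# Sample along the shared edge and record the jump between the two fits.
jumps = []
for k in range(11):
    s = Fraction(k, 10)
    x, y = 2 + s, 1 + 2 * s
    jumps.append(evaluate(left_fit, x, y) - evaluate(right_fit, x, y))

print("largest jump across the shared edge:", float(max(jumps)))
```

The two fits agree at the shared corners but differ by up to 0.1 at the middle of the edge, which is the kind of discontinuity a dense grid of test points would show as a visible seam.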
