Complex ncml test case EDDGridFromNcFilesTests.testNcml failing #148

Open
srstsavage opened this issue Apr 15, 2024 · 0 comments

Describe the bug
Getting ahead of things a little bit since #142 is not yet merged, but wanted a dedicated issue to track EDDGridFromNcFilesTests.testNcml issues.

Currently, with netcdf-java 5.5.3 dependencies, loading a complex union NcML file in EDDGridFromNcFilesTests.testNcml results in the following error:

java.lang.NullPointerException: Cannot invoke "String.contains(java.lang.CharSequence)" because "location" is null
 at thredds.inventory.zarr.MFileZip$Provider.canProvide(MFileZip.java:200)
 at thredds.inventory.MFiles.create(MFiles.java:37)
 at ucar.nc2.internal.ncml.AggDataset.<init>(AggDataset.java:74)
 at ucar.nc2.internal.ncml.Aggregation.makeDataset(Aggregation.java:453)
 at ucar.nc2.internal.ncml.Aggregation.addExplicitDataset(Aggregation.java:136)
 at ucar.nc2.internal.ncml.NcmlReader.readAgg(NcmlReader.java:1476)
 at ucar.nc2.internal.ncml.NcmlReader.readNetcdf(NcmlReader.java:521)
 at ucar.nc2.internal.ncml.NcmlReader.readNcml(NcmlReader.java:478)
 at ucar.nc2.internal.ncml.NcmlReader.readNcml(NcmlReader.java:397)
 at ucar.nc2.internal.ncml.NcmlNetcdfFileProvider.open(NcmlNetcdfFileProvider.java:24)
 at ucar.nc2.dataset.NetcdfDatasets.openProtocolOrFile(NetcdfDatasets.java:431)
 at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:152)
 at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:135)
 at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:118)
 at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:104)
 at gov.noaa.pfel.erddap.dataset.EDDGridFromNcFilesTests.testNcml(EDDGridFromNcFilesTests.java:155)

This was originally reported to the netcdf-java mailing list by Bob Simons in July 2022.

The error stems from the loading of MFileProvider implementations via Java service loading. The canProvide(String location) method is called on each implementation, and in 5.5.3 one provider, MFileZip, does not check location for null before using it (https://github.com/Unidata/netcdf-java/blob/v5.5.3/cdm/zarr/src/main/java/thredds/inventory/zarr/MFileZip.java#L200).
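
The failure mode can be illustrated with a minimal, self-contained sketch. The names below (LocationProvider, UnguardedZipLikeProvider, GuardedZipLikeProvider, firstMatch) are hypothetical stand-ins, not the netcdf-java classes. The providers are listed by hand rather than discovered through java.util.ServiceLoader, and the ".zip" test is an assumption based only on the String.contains call in the stack trace. It shows how one provider that dereferences a null location aborts the whole lookup, and how a null guard avoids that.

```java
import java.util.List;

public class ProviderNullCheckSketch {

    /** Hypothetical stand-in for a provider interface with a canProvide(String) method. */
    interface LocationProvider {
        boolean canProvide(String location);
    }

    /** Unguarded: throws NullPointerException when location is null. */
    static class UnguardedZipLikeProvider implements LocationProvider {
        @Override
        public boolean canProvide(String location) {
            return location.contains(".zip"); // assumed check, for illustration only
        }
    }

    /** Guarded: a null location is treated as "not handled by this provider". */
    static class GuardedZipLikeProvider implements LocationProvider {
        @Override
        public boolean canProvide(String location) {
            return location != null && location.contains(".zip");
        }
    }

    /** Asks each provider in turn; returns the first that accepts, or null if none does. */
    static LocationProvider firstMatch(List<LocationProvider> providers, String location) {
        for (LocationProvider p : providers) {
            if (p.canProvide(location)) {
                return p;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<LocationProvider> guarded = List.of(new GuardedZipLikeProvider());
        // No provider accepts a null location, so the lookup completes without a match.
        System.out.println(firstMatch(guarded, null));

        List<LocationProvider> unguarded = List.of(new UnguardedZipLikeProvider());
        try {
            firstMatch(unguarded, null);
        } catch (NullPointerException e) {
            // One unguarded provider is enough to abort the lookup for every caller.
            System.out.println("lookup aborted: " + e.getClass().getSimpleName());
        }
    }
}
```

Because every registered provider is consulted, the exception surfaces even when the caller never intended to use the zip-related provider.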

This bug was fixed in October 2022 with this commit:

Unidata/netcdf-java@19f9476#diff-05b863736a1a2b21b57d0a498f731991e82c8b824dc2235d54f3f9d5f257eb80R200

However, this change hasn't yet been included in any release. I asked about the possibility of a 5.5.4 release here: Unidata/netcdf-java#1332

Even with this fix (tested with netcdf-java 5.5.4-SNAPSHOT), the test produces another error:

java.lang.IllegalStateException: Shared Dimension fakeDim0 = 4320; does not exist in a parent group
 at ucar.nc2.Variable.<init>(Variable.java:1847)
 at ucar.nc2.dataset.VariableDS.<init>(VariableDS.java:879)
 at ucar.nc2.dataset.VariableDS$Builder.build(VariableDS.java:1134)
 at ucar.nc2.dataset.VariableDS$Builder.build(VariableDS.java:985)
 at ucar.nc2.Group.<init>(Group.java:924)
 at ucar.nc2.Group.<init>(Group.java:44)
 at ucar.nc2.Group$Builder.build(Group.java:1410)
 at ucar.nc2.Group$Builder.build(Group.java:1402)
 at ucar.nc2.NetcdfFile.<init>(NetcdfFile.java:2576)
 at ucar.nc2.dataset.NetcdfDataset.<init>(NetcdfDataset.java:1611)
 at ucar.nc2.dataset.NetcdfDataset.<init>(NetcdfDataset.java:88)
 at ucar.nc2.dataset.NetcdfDataset$Builder.build(NetcdfDataset.java:1812)
 at ucar.nc2.dataset.NetcdfDataset$Builder.build(NetcdfDataset.java:1687)
 at ucar.nc2.internal.ncml.NcmlReader$NcmlElementReader.open(NcmlReader.java:1605)
 at ucar.nc2.internal.ncml.NcmlReader$NcmlElementReader.open(NcmlReader.java:1586)
 at ucar.nc2.dataset.NetcdfDatasets.acquireFile(NetcdfDatasets.java:383)
 at ucar.nc2.internal.ncml.AggDataset.acquireFile(AggDataset.java:114)
 at ucar.nc2.internal.ncml.AggregationUnion.buildNetcdfDataset(AggregationUnion.java:30)
 at ucar.nc2.internal.ncml.Aggregation.build(Aggregation.java:349)
 at ucar.nc2.internal.ncml.NcmlReader.readNetcdf(NcmlReader.java:528)
 at ucar.nc2.internal.ncml.NcmlReader.readNcml(NcmlReader.java:483)
 at ucar.nc2.internal.ncml.NcmlReader.readNcml(NcmlReader.java:385)
 at ucar.nc2.internal.ncml.NcmlNetcdfFileProvider.open(NcmlNetcdfFileProvider.java:24)
 at ucar.nc2.dataset.NetcdfDatasets.openProtocolOrFile(NetcdfDatasets.java:431)
 at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:152)
 at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:135)
 at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:118)
 at ucar.nc2.dataset.NetcdfDatasets.openDataset(NetcdfDatasets.java:104)
 at gov.noaa.pfel.erddap.dataset.EDDGridFromNcFilesTests.testNcml(EDDGridFromNcFilesTests.java:155)

This is related to the attempted renaming of the fakeDim dimensions in two of the three aggregated files:

$ grep dimension src/test/resources/largeFiles/viirs/MappedMonthly4km/m4.ncml
      <dimension name="latitude" orgName="fakeDim0" />
      <dimension name="longitude" orgName="fakeDim1" />
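
The same renames can be listed with the JDK's XML parser instead of grep. The sketch below is self-contained: the inline fragment is a hypothetical reduction containing only the two dimension lines shown above (the wrapping netcdf element is assumed and the rest of m4.ncml is not reproduced), and the class and method names are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class NcmlDimensionRenames {

    /** Returns orgName -> name for every dimension element that carries an orgName attribute. */
    static Map<String, String> renames(String ncml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(ncml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            // Malformed XML (or a parser setup problem) ends up here.
            throw new IllegalArgumentException("could not parse NcML", e);
        }
        Map<String, String> result = new LinkedHashMap<>();
        NodeList dims = doc.getElementsByTagName("dimension");
        for (int i = 0; i < dims.getLength(); i++) {
            Element dim = (Element) dims.item(i);
            if (dim.hasAttribute("orgName")) {
                result.put(dim.getAttribute("orgName"), dim.getAttribute("name"));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical fragment: only the two dimension lines come from m4.ncml.
        String fragment =
                "<netcdf>\n"
              + "  <dimension name=\"latitude\" orgName=\"fakeDim0\" />\n"
              + "  <dimension name=\"longitude\" orgName=\"fakeDim1\" />\n"
              + "</netcdf>";
        renames(fragment).forEach((from, to) -> System.out.println(from + " -> " + to));
    }
}
```

A parse failure here would only indicate malformed XML; a clean parse says nothing about whether the file is valid NcML.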

I haven't thoroughly checked whether src/test/resources/largeFiles/viirs/MappedMonthly4km/m4.ncml is fully legal NcML, but the fake dimensions are indeed present in the aggregated data files, and the target dimensions already exist in the LatLon.nc file:

$ ncks --json -M src/test/resources/largeFiles/viirs/MappedMonthly4km/LatLon.nc | jq .dimensions
{
  "latitude": 4320,
  "longitude": 8640
}
$ ncks --json -M src/test/resources/largeFiles/viirs/MappedMonthly4km/V20120012012031.L3m_MO_NPP_CHL_chlor_a_4km | jq .dimensions
{
  "fakeDim0": 4320,
  "fakeDim1": 8640
}
$ ncks --json -M src/test/resources/largeFiles/viirs/MappedMonthly4km/V20120322012060.L3m_MO_NPP_CHL_chlor_a_4km | jq .dimensions
{
  "fakeDim0": 4320,
  "fakeDim1": 8640
}
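
As a sanity check on the renames, the sketch below compares each source dimension with its target. It is only an illustration: the class and method names are hypothetical, and the dimension names and lengths are copied by hand from the grep and ncks output above rather than read from the files.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DimensionRenameCheck {

    /**
     * For each rename (orgName -> name), reports whether the source dimension exists in the
     * data file and whether its length equals that of the target dimension.
     */
    static Map<String, String> check(Map<String, String> renames,
                                     Map<String, Integer> dataFileDims,
                                     Map<String, Integer> targetDims) {
        Map<String, String> report = new LinkedHashMap<>();
        renames.forEach((orgName, name) -> {
            Integer sourceLen = dataFileDims.get(orgName);
            Integer targetLen = targetDims.get(name);
            String status;
            if (sourceLen == null) {
                status = "missing in data file";
            } else if (targetLen == null) {
                status = "no such target dimension";
            } else if (!sourceLen.equals(targetLen)) {
                status = "length mismatch " + sourceLen + " vs " + targetLen;
            } else {
                status = "ok (" + sourceLen + ")";
            }
            report.put(orgName + " -> " + name, status);
        });
        return report;
    }

    public static void main(String[] args) {
        // Values copied from the m4.ncml grep and the ncks output above.
        Map<String, String> renames = new LinkedHashMap<>();
        renames.put("fakeDim0", "latitude");
        renames.put("fakeDim1", "longitude");
        Map<String, Integer> latLon = Map.of("latitude", 4320, "longitude", 8640);
        Map<String, Integer> chlFile = Map.of("fakeDim0", 4320, "fakeDim1", 8640);

        check(renames, chlFile, latLon).forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```

With the values shown, both renames come out consistent, which suggests that the dimension names and lengths themselves are not what triggers the IllegalStateException.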

To Reproduce
Steps to reproduce the behavior:
Run test case EDDGridFromNcFilesTests.testNcml (for example: mvn test -Dtest=EDDGridFromNcFilesTests#testNcml)

Expected behavior
Test passes

Desktop (please complete the following information):

  • OS: Linux (Debian 12)