BUFR files and position update stopping at LYN_T #164

Closed
BaptisteVandecrux opened this issue Aug 9, 2023 · 1 comment · Fixed by #165

BaptisteVandecrux commented Aug 9, 2023

> getBUFR --positions --positions-filepath ../aws-l3/AWS_latest_locations.csv

....
####### Processing LYN_T #######
Generating LYN_T.bufr from ../aws-l3/tx/LYN_T/LYN_T_hour.csv
TIMESTAMP: 2023-05-26 21:00:00
----> Time checks failed for LYN_T
      current: 2023-05-26 21:00:00
       latest: 2023-05-26 21:00:00
finding positions for LYN_T
last transmission: 2023-08-09 11:00:00
Traceback (most recent call last):
  File "pandas/_libs/index.pyx", line 548, in pandas._libs.index.DatetimeEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 2263, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 2273, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 1685134800000000000

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/aws/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3803, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 516, in pandas._libs.index.DatetimeEngine.get_loc
  File "pandas/_libs/index.pyx", line 550, in pandas._libs.index.DatetimeEngine.get_loc
KeyError: Timestamp('2023-05-26 21:00:00')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/aws/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/indexes/datetimes.py", line 736, in get_loc
    return Index.get_loc(self, key, method, tolerance)
  File "/home/aws/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
    raise KeyError(key) from err
KeyError: Timestamp('2023-05-26 21:00:00')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/aws/miniconda3/envs/py38/bin/getBUFR", line 246, in <module>
    df1_limited, positions = find_positions(df1, stid, args.time_limit, current_timestamp, positions)
  File "/home/aws/miniconda3/envs/py38/lib/python3.8/site-packages/pypromice/postprocess/csv2bufr.py", line 452, in find_positions
    s = df_limited.loc[current_timestamp]
  File "/home/aws/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/indexing.py", line 1073, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/home/aws/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/indexing.py", line 1312, in _getitem_axis
    return self._get_label(key, axis=axis)
  File "/home/aws/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/indexing.py", line 1260, in _get_label
    return self.obj.xs(label, axis=axis)
  File "/home/aws/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/generic.py", line 4056, in xs
    loc = index.get_loc(key)
  File "/home/aws/miniconda3/envs/py38/lib/python3.8/site-packages/pandas/core/indexes/datetimes.py", line 738, in get_loc
    raise KeyError(orig_key) from err
KeyError: Timestamp('2023-05-26 21:00:00')

At that site, t_i, rh_i and p_i have not been available since '2023-05-26 21:00:00', while gps_lon, gps_lat and gps_alt are still available.

The problem might come from the "current_timestamp" being passed to the find_positions function:

pypromice/bin/getBUFR

Lines 238 to 246 in 3424f2c

print('----> Time checks failed for {}'.format(stid))
print(' current:', current_timestamp)
if args.dev is True:
    print(' latest (DEV):', latest_timestamp)
else:
    print(' latest:', latest_timestamp)
no_recent_data.append(stid)
if args.positions is True:
    df1_limited, positions = find_positions(df1, stid, args.time_limit, current_timestamp, positions)

The log shows "Time checks failed", meaning that the code correctly identifies the last instantaneous values as too old.
Even so, it still runs find_positions on "current_timestamp", which is several months old.
I suggest setting current_timestamp = None in that situation.
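
To illustrate the failure and the suggested guard, here is a minimal sketch in plain pandas. It is not the pypromice code: `df_limited` is a made-up stand-in, under the assumption that the frame being indexed only covers recent transmissions, and `lookup_or_none` is a hypothetical helper showing how a `None` or missing timestamp could be handled without raising.

```python
import pandas as pd

# Hypothetical stand-in for the time-limited frame: three recent hourly rows
# with GPS values only (the real frame in pypromice may look different).
df_limited = pd.DataFrame(
    {"gps_lat": [66.48, 66.49, 66.50], "gps_lon": [-50.00, -50.01, -50.02]},
    index=pd.to_datetime(
        ["2023-08-09 09:00:00", "2023-08-09 10:00:00", "2023-08-09 11:00:00"]
    ),
)

# Timestamp of the last instantaneous values, several months old.
stale_timestamp = pd.Timestamp("2023-05-26 21:00:00")

# A label-based lookup on a timestamp that is absent from the index raises
# KeyError, as in the traceback above.
try:
    df_limited.loc[stale_timestamp]
    raised = False
except KeyError:
    raised = True


def lookup_or_none(df, timestamp):
    """Hypothetical helper: return the row at `timestamp`, or None when the
    timestamp is None or not present in the index."""
    if timestamp is None or timestamp not in df.index:
        return None
    return df.loc[timestamp]


print(raised)                                       # the unguarded lookup failed
print(lookup_or_none(df_limited, stale_timestamp))  # guarded lookup returns None
print(lookup_or_none(df_limited, None))             # None is passed through
```

With a guard of this kind, setting current_timestamp = None after a failed time check would skip the lookup instead of raising.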


Another necessary update was to make extrapolation of GPS coordinates the default.

17eaa07

This was necessary because the instantaneous values are taken at the end of the hourly time step and are therefore always one hour ahead of the GPS coordinates, which are averaged over that same hour and timestamped at the beginning of the hour.
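
The one-hour offset can be sketched as follows. The data are invented, and the linear fit used here is only an assumption for illustration; the actual extrapolation in the commit above is not shown in this issue and may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly-averaged GPS longitudes, stamped at the start of each hour.
gps_lon = pd.Series(
    [-50.00, -50.01, -50.02],
    index=pd.to_datetime(
        ["2023-08-09 08:00:00", "2023-08-09 09:00:00", "2023-08-09 10:00:00"]
    ),
)

# Instantaneous values are stamped at the end of the last hour, so their
# timestamp is one hour past the newest GPS timestamp.
current_timestamp = pd.Timestamp("2023-08-09 11:00:00")
has_exact_match = current_timestamp in gps_lon.index  # False: no GPS row there

# Express time as hours since the first GPS sample.
t0 = gps_lon.index[0]
t_hours = np.asarray((gps_lon.index - t0) / pd.Timedelta(hours=1), dtype=float)
t_target = (current_timestamp - t0) / pd.Timedelta(hours=1)

# Assumed method: fit a straight line and evaluate it one hour ahead.
slope, intercept = np.polyfit(t_hours, gps_lon.to_numpy(), 1)
lon_extrapolated = slope * t_target + intercept

print(has_exact_match)
print(round(lon_extrapolated, 4))
```

Without extrapolation, a position lookup at the instantaneous timestamp finds no GPS row, which is why making it the default matters for stations like this one.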
