Release/1.6.0 #277

Merged
merged 60 commits into from
Feb 4, 2025

Conversation

@torimcd (Collaborator) commented Jan 22, 2025

Release notes

  • This release adds support for the new version D SWOT River and Lake collections. It adds new database feature tables and track ingest tables, plus a new collection_name API parameter that lets users specify which data version to return results for.
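
As a rough illustration of how a client might use the new collection_name parameter, here is a minimal sketch that builds a timeseries query URL. Only collection_name comes from these release notes. The base URL, the feature ID, the other query parameter names, and the collection short name shown are assumptions for illustration and should be checked against the API documentation.

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute the real timeseries endpoint.
BASE_URL = "https://example.invalid/hydrocron/v1/timeseries"


def build_timeseries_url(feature, feature_id, start_time, end_time,
                         fields, collection_name=None):
    """Build a timeseries query URL, optionally pinning a data version."""
    # Parameter names other than collection_name are assumed here.
    params = {
        "feature": feature,
        "feature_id": feature_id,
        "start_time": start_time,
        "end_time": end_time,
        "fields": ",".join(fields),
    }
    # Only send collection_name when a specific version is requested;
    # otherwise the service is assumed to fall back to its default.
    if collection_name is not None:
        params["collection_name"] = collection_name
    return f"{BASE_URL}?{urlencode(params)}"


url = build_timeseries_url(
    feature="Reach",
    feature_id="78340600051",  # hypothetical reach ID
    start_time="2024-01-01T00:00:00Z",
    end_time="2024-12-31T00:00:00Z",
    fields=["reach_id", "time_str", "wse"],
    collection_name="SWOT_L2_HR_RiverSP_D",  # assumed version D short name
)
print(url)
```

Leaving collection_name out of the call produces a URL without that parameter, so existing clients would not need to change.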

Test Results

  • New version D feature and track ingest database tables populate successfully in the UAT venue
  • Queries for Reaches and Nodes work as expected
  • No version D lake data has been ingested into UAT, so Lake queries have not been tested as of 01/22

More test details can be found in #270

Changelog

[1.6.0]

Added

- Issue 229 - Create new database tables for version D collections
- Issue 233 - Create new API parameter for user to specify data version
- Issue 273 - Modify the logging of the timeseries Lambda response object to reduce the log statement size
- Issue 267 - Modify track ingest operations to support new collection versions
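
For Issue 273, one possible way to keep a log statement small is to serialize the response and cap its length before logging. The sketch below is an assumption about the general approach, not the change actually made in this release. The logger name, the size cap, and the helper name summarize_response are all hypothetical.

```python
import json
import logging

logger = logging.getLogger("timeseries")  # hypothetical logger name

MAX_LOG_CHARS = 500  # hypothetical cap on the logged text


def summarize_response(response, max_chars=MAX_LOG_CHARS):
    """Return a short, log-friendly string for a response dictionary."""
    text = json.dumps(response, default=str)
    if len(text) <= max_chars:
        return text
    # Keep the head of the payload and record how much was dropped.
    return f"{text[:max_chars]}... [truncated, {len(text)} chars total]"


# A large response body is logged as a bounded string.
response = {"status": "200 OK", "results": {"csv": "x" * 10_000}}
logger.info("Timeseries response: %s", summarize_response(response))
```

Small responses pass through unchanged, so only oversized payloads lose detail in the logs.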

Changed

Deprecated

Removed

Fixed

Security

nikki-t and others added 30 commits September 26, 2024 08:47
* Update hydrocron-lambda.tf

* Update pyproject.toml

* update lock

* fix lint

* fix lint
…gest" status (#227)

* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documentation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted environment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource environment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* /version 1.5.0a0

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* update dependencies

* upgrade geopandas

* update dependencies

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>
* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documentation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted environment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource environment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* /version 1.5.0a0

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* Update track ingest table operations

* Update changelog with issue

* update dependencies

* upgrade geopandas

* update dependencies

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>
* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to  prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documentation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted environment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource environment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* /version 1.5.0a0

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* Update track ingest table operations

* Update changelog with issue

* update dependencies

* upgrade geopandas

* update dependencies

* Implement operations to publish CNM messages for granules requiring ingest

* Implement unit test of publication operations

* Fix linting

* Add issue to changelog and fix linting

* Add EventBridge schedules with appropriate Lambda permissions

* Set initial schedule expressions and fix assume policy

* Disable eventbridge schedules by default

* Update schedule to run weekly

* Define 1 hour latency to search by revision_date in CMR

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>
* /version 1.3.0a0

* Update build.yml

* /version 1.3.0a1

* /version 1.3.0a2

* Feature/issue 175 - Update docs to point to OPS (#176)

* changelog

* update examples, remove load_data readme, info moved to wiki

* Dependency update to fix snyk scan

* issues/101: Support for HTTP Accept header (#172)

* Reorganize timeseries code to prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a3

* issues/102: Support compression of API response (#173)

* Enable payload compression

* Update changelog with issue

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a4

* Feature/issue 100 Add option to 'compact' GeoJSON result into single feature (#177)

* Reorganize timeseries code to prep for Accept header

* Enable Accept header to return response of specific content-type

* Fix whitespace and string continuation

* Make error handling consistent and add an additional test where a reach can't be found

* Update changelog with issue for unreleased version

* Add 415 status code to API definition

* Few minor cleanup items

* Few minor cleanup items

* Update to [email protected]

* Fix dependencies

* Update required query parameters based on current API functionality

* Enable return of 'compact' GeoJSON response

* Fix linting and add test data

* Update documentation for API accept headers and compact GeoJSON response

* Fix references to incorrect Accept header examples

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.3.0a5

* Feature/issue 183 (#185)

* Provide introduction to timeseries endpoint

* Remove _units in fields list

* Fix typo

* Update examples with Accept headers and compact query parameter

* Add issue to changelog

* Fix typo in timeseries documentation

* Update pymysql

* Update pymysql

* Provide clarity on accept headers and request parameter fields

* /version 1.3.0a6

* Feature/issue 186 Implement API keys (#188)

* API Gateway Lambda authorizer to facilitate API keys and usage plans

* Unit tests to test Lambda authorizer

* Fix terraform file formatting

* API Gateway Lambda Authorizer

- Lambda function
- API Keys and Authorizer definition in OpenAPI spec
- API gateway API keys
- API gateway usage plans
- SSM parameters for API keys

* Fix trailing whitespace

* Set default region environment variable

* Fix SNYK vulnerabilities

* Add issue to changelog

* Implement custom trusted partner header x-hydrocron-key

* Update cryptography for SNYK vulnerability

* Update documentation to include API key usage

* Update quota and throttle settings for API Gateway

* Update API keys documentation to indicate to be implemented

* Move API key lookup to Lambda INIT

* Remove API key authentication and update API key to x-hydrocron-key

* /version 1.3.0a7

* Update changelog for 1.3.0 release

* /version 1.4.0a0

* Feature/issue 198 (#207)

* Update pylint to deal with errors and fix collection reference

* Initial CMR and Hydrocron queries

- Includes placeholders for other operations needed to track granule
ingest.
- GranuleUR query for Hydrocron tables.

* Add and set up vcrpy for testing CMR API query

* Test track ingest operations

- Test CMR and hydrocron queries
- Test granuleUR query
- Update database to include granuleUR GSI

* Update to use track_ingest naming consistently

* Initial Lambda function and IAM role definition

* Replace deprecated path function with as_file

* Add SSM read IAM permissions

* Add DynamoDB read permissions

* Update track ingest lambda memory

* Remove duplicate IAM permissions

* Add in permissions to query index

* Update changelog

* Update changelog description

* Use python_cmr for CMR API queries

* /version 1.4.0a1

* Add doi to documentation pages (#216)

* Update intro.md with DOI

* Update overview.md with DOI

* /version 1.4.0a2

* issue-193: Add Dynamo DB Table for SWOT Prior Lakes (#209)

* add code to handle prior lakes shapefiles, add test prior lake data

* update terraform to add prior lake table

* fix tests, change to smaller test data file, changelog

* linting

* reconfigure main load_data method to make more readable and pass linting

* lint

* lint

* fix string casting to lower storage req & update test responses to handle different rounding pattern in coords

* update load benchmarking function for linting and add unit test

* try parent collection for lakes

* update version parsing for parent collection

* fix case error

* fix lake id reference

* add logging to troubleshoot too large features

* add item size logging and remove error raise for batch write

* clean up logging statements & move numeric_columns assignment

* update batch logging statement

* Rename constant

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* Fix temp dir security risk https://rules.sonarsource.com/python/RSPEC-5443/

* fix code coverage calculation

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a3

* Feature/issue 201 Create a table for tracking granule ingest status (#214)

* Define track ingest database and IAM permissions

* Update changelog with issue

* Modify table structure to support sparse status index

* Updated to only apply PITR in ops

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a4

* Feature/issue 210 - Load large geometry polygons (#219)

* add functions to handle null geometries and convert polygons to points

* update doi in docs

* fix fill null geometries

* fix tests and update changelog

* /version 1.4.0a5

* Feature/issue 222 - Add granule info to track ingest table on load (#223)

* adjust lambdas to populate track ingest table on granule load

* changelog

* remove test cnm

* lint

* change error caught when handling checksum

* update lambda role permissions to write to track ingest table

* fix typo on lake table terraform

* set default fill values for checksum and rev date in track status

* fix checksum handling in bulk load data

* lint

* add logging to debug

* /version 1.4.0a6

* Add SSM parameter read for last run time

* Feature/issue-225: Create one track ingest table per feature type (#226)

* add track ingest tables for each feature type and adjust load data to populate

* changelog

* /version 1.4.0a7

* Feature/issue 196 Add new feature type to query the API for lake data (#224)

* Initial API queries for lake data

* Unit tests for lake data

* Updates after center point calculations

- Removed temp code to calculate a point in API
- Implemented unit test to test lake data retrieval
- Updated fixtures to load in lake data for testing

* Add read lake table permissions to lambda timeseries and track ingest roles

* Update documentation to include lake data

* Updated documentation to include info on lake centerpoints

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a8

* Feature/issue 205 - Add Confluence API key (#221)

* Fix possible variable references before value is assigned

* Define Confluence API key and trusted partner plan limits

* Define a list of trusted partner keys and store under single parameter

* Define API keys as encrypted environment variables for Lambda authorizer

* Update authorizer and connection class to use KMS to retrieve API keys

* Hack to force lambda deployment when ssm value changes (#218)

* Add replace_triggered_by to hydrocron_lambda_authorizer

* Introduce environment variable that contains random id which will change whenever an API key value changes. This will force lambda to publish new version of the function.

* Remove unnecessary hash function

* Update to SSM parameter API key storage and null_resource environment variable

* Update Terraform and AWS provider

* Update API key documentation

* Set source_code_hash to force deployment of new image

* Downgrade AWS provider to 4.0 to remove inline policy errors

* Update docs/timeseries.md

---------

Co-authored-by: Frank Greguska <[email protected]>

* /version 1.4.0a9

* /version 1.4.0a10

* changelog for 1.4.0 release

* update dependencies for 1.4.0 release

* /version 1.5.0a0

* fix CMR query in UAT

* /version 1.4.0rc1

* fix typo in load_data lambda

* /version 1.4.0rc2

* Initial track ingest table query

* Fix linting and code style

* Implement feature count operations

* Enable S3 permissions and set environment variable for track lambda

* Fix trailing white spaces and code format

* Update docstrings for class methods

* Implement run time storage in SSM

* Query track table unit tests

* Update CHANGELOG with issue

* Update SSM run time parameter

* Fix trailing whitespace

* Fix reference to IAM policy

* Enable specification of temporal range to search revision date by

* Fix SSM put parameter policy

* Update IAM permissions for reading track ingest

* Enable full temporal search on CMR granules

* Add capability to download shapefile granule to count features

* Update granule UR to include .zip

* Count features via Hydrocron table query

* Remove unnecessary s3 permissions

* Remove whitespace from blank line

* Update cryptography to 43.0.1

* Update track ingest table operations

* Update changelog with issue

* update dependencies

* upgrade geopandas

* update dependencies

* fix index on rev date in load data lambda

* update dependencies

* lint readme

* /version 1.4.0rc3

* /version 1.4.0rc4

* Implement operations to publish CNM messages for granules requiring ingest

* Implement unit test of publication operations

* Fix linting

* Add issue to changelog and fix linting

* Add EventBridge schedules with appropriate Lambda permissions

* Set initial schedule expressions and fix assume policy

* fix cmr env search by venue

* /version 1.4.0rc5

* Disable eventbridge schedules by default

* Update schedule to run weekly

* Define 1 hour latency to search by revision_date in CMR

* Allow CMR UAT query based on HYDROCRON_ENV environment variable

* Update unit tests to accommodate UAT CMR query

* Add earthdata login credentials to Lambda

* Add issue to changelog

* Fix linting white space

---------

Co-authored-by: nikki-t <[email protected]>
Co-authored-by: Frank Greguska <[email protected]>
Co-authored-by: frankinspace <[email protected]>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: Cassie Nickles <[email protected]>
Co-authored-by: cassienickles <[email protected]>
Co-authored-by: podaac-cicd[bot] <podaac-cicd[bot]@users.noreply.github.com>
Co-authored-by: Victoria McDonald <[email protected]>
Co-authored-by: torimcd <[email protected]>
… that aren't loaded into Hydrocron (#245)

* Raise an error if collection shortname does not match Hydrocron table names

* Raise an error for unsupported lake data in load granule operations

* Remove trailing whitespace

* Fix code formatting

* Update CHANGELOG with issue

* Feature/issue 248 - Track ingest operations need to query UAT for granule files (#249)

* Query to return granule files should query UAT when running in SIT or UAT environments

* SIT execution should return UAT files for load granule operations

* Set venue environment variable before running test of query_cmr

* Add issue to CHANGELOG
* Handle overlapping times with unique CRIDs

* Define unit test to test when reprocessed granule arrives

* Add issue to CHANGELOG

* Add reprocessed CRID to eventbridge schedule input

* Handle cases with empty reprocessed_crid
… loaded (#259)

* change assemble attrs function to avoid for loop

* change how attributes are concatenated during shp unpack to avoid slow looping

* remove unused import

* Update API test data with less precise data coordinates

* remove logging every item in batch writer

* lint

---------

Co-authored-by: Nikki <[email protected]>
…ons (#260)

* Add version d db tables, update load_data module to handle changing constants

* linting

* rework constants to sustain version changes, update tests

* lint

* fix changelog
Collaborator

@nikki-t nikki-t left a comment

I think we are in pretty good shape to handle the version D collection, but I had a few pending items:

terraform/hydrocron-lambda.tf (resolved)
hydrocron/utils/constants.py (resolved)
@nikki-t
Collaborator

nikki-t commented Jan 29, 2025

Test results recorded here: #270

@nikki-t
Collaborator

nikki-t commented Feb 4, 2025

@torimcd and @frankinspace - I think we are ready for the 1.6.0 release. I updated this PR and the test issue (#270). Feel free to add your testing too.

To summarize all of the decisions and modifications we made:

  1. We will release version 1.6.0 into OPS, but we will leave the EventBridge schedules that execute the track ingest operations disabled. This means that we will not reconcile Hydrocron granule ingests while we wait for the public release of the version D data.
    1. EventBridge schedules: https://github.com/podaac/hydrocron/blob/release/1.6.0/terraform/hydrocron-eventbridge.tf
    2. Track ingest constants:
      FEATURE_ID = {
          "SWOT_L2_HR_RiverSP_reach_D": "reach_id",
          "SWOT_L2_HR_RiverSP_node_D": "node_id",
          "SWOT_L2_HR_LakeSP_prior_D": "lake_id"
      }
      SHORTNAME = {
          "SWOT_L2_HR_RiverSP_reach_D": "SWOT_L2_HR_RiverSP_D",
          "SWOT_L2_HR_RiverSP_node_D": "SWOT_L2_HR_RiverSP_D",
          "SWOT_L2_HR_LakeSP_prior_D": "SWOT_L2_HR_LakeSP_D"
      }
  2. We will also disable the CNM message that is sent to Hydrocron and triggers ingest, so that version D data is not ingested until it is publicly available. The DEFAULT_COLLECTION_VERSION for the timeseries Lambda has been set to 2.0 so that Hydrocron will continue to serve version 2.0 data by default. We will change this in a patch release once the version D data is released.
    1. Timeseries Lambda:
      DEFAULT_COLLECTION_VERSION = "2.0"
  3. The forward stream CRID is PID0. We don't yet know the reprocessed CRID, so it has been set to an empty string.
    1. EventBridge schedules: https://github.com/podaac/hydrocron/blob/release/1.6.0/terraform/hydrocron-eventbridge.tf
    2. Confirmed by test that the EventBridge schedule accepts an empty string.
  4. Update track ingest SSM parameter name to include version D.
    1. SSM parameters:
      resource "aws_ssm_parameter" "hydrocron-reach-track-ingest-runtime" {
        name        = "/service/${var.app_name}/track-ingest-runtime/SWOT_L2_HR_RiverSP_reach_D"
        description = "Hydrocron track ingest last time executed on reaches"
        type        = "String"
        value       = "no_data"
      }
      resource "aws_ssm_parameter" "hydrocron-node-track-ingest-runtime" {
        name        = "/service/${var.app_name}/track-ingest-runtime/SWOT_L2_HR_RiverSP_node_D"
        description = "Hydrocron track ingest last time executed on nodes"
        type        = "String"
        value       = "no_data"
      }
      resource "aws_ssm_parameter" "hydrocron-priorlake-track-ingest-runtime" {
        name        = "/service/${var.app_name}/track-ingest-runtime/SWOT_L2_HR_LakeSP_prior_D"
        description = "Hydrocron track ingest last time executed on lakes"
        type        = "String"
        value       = "no_data"
      }
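
For readers following the summary above, here is a minimal Python sketch of how the version D constants, the SSM parameter naming pattern, and the empty reprocessed CRID might fit together. The two dictionaries are copied from the comment; the helper functions `runtime_parameter_name` and `crids_to_search`, and the `"hydrocron"` value used for `var.app_name`, are hypothetical illustrations and not the actual Hydrocron implementation.

```python
# Constants as listed in the summary above.
FEATURE_ID = {
    "SWOT_L2_HR_RiverSP_reach_D": "reach_id",
    "SWOT_L2_HR_RiverSP_node_D": "node_id",
    "SWOT_L2_HR_LakeSP_prior_D": "lake_id",
}
SHORTNAME = {
    "SWOT_L2_HR_RiverSP_reach_D": "SWOT_L2_HR_RiverSP_D",
    "SWOT_L2_HR_RiverSP_node_D": "SWOT_L2_HR_RiverSP_D",
    "SWOT_L2_HR_LakeSP_prior_D": "SWOT_L2_HR_LakeSP_D",
}


def runtime_parameter_name(app_name, table_key):
    """Hypothetical helper: build the SSM parameter name for a feature table.

    Mirrors the Terraform naming pattern
    /service/${var.app_name}/track-ingest-runtime/<table key>.
    """
    if table_key not in FEATURE_ID:
        raise KeyError(f"Unsupported feature table: {table_key}")
    return f"/service/{app_name}/track-ingest-runtime/{table_key}"


def crids_to_search(forward_crid, reprocessed_crid=""):
    """Hypothetical helper: drop an empty reprocessed CRID from the search list."""
    return [crid for crid in (forward_crid, reprocessed_crid) if crid]


if __name__ == "__main__":
    # "hydrocron" stands in for var.app_name and is an assumption.
    for key in FEATURE_ID:
        print(key, FEATURE_ID[key], SHORTNAME[key],
              runtime_parameter_name("hydrocron", key))
    # Only the forward stream CRID is known for now.
    print(crids_to_search("PID0", ""))
```

Keying every lookup on the same table name means that adding a future collection version would, under this sketch, only require new dictionary entries and a matching SSM parameter.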

Member

@frankinspace frankinspace left a comment

This looks good to me. Great work @torimcd and @nikki-t!

@torimcd
Collaborator Author

torimcd commented Feb 4, 2025

Confirmed with IA that CNM messages are disabled on the version D RiverSP and LakeSP collections.

I also want to note here that the changes in this release have only been tested with Rivers, not Lakes. Due to an upstream issue with the Lake processing, we do not have version D lake data to test with, and we expect that version D lake data will be released some months after the version D river data. We decided to proceed with this release anyway and will do further testing with the version D lake data once we have an estimate for when it will be available.

@torimcd torimcd merged commit 3b130dd into main Feb 4, 2025
1 check passed
@torimcd torimcd deleted the release/1.6.0 branch February 4, 2025 18:52
3 participants