Provide a way to delete records associated with a given granule #140

Open
torimcd opened this issue Apr 2, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@torimcd
Collaborator

torimcd commented Apr 2, 2024

We currently have no easy way to query DynamoDB for all of the records associated with a granule. If a published granule were removed from Cumulus, we would have no easy way to remove the associated features from Hydrocron.

We can delete individual records manually, but since some granules contain 20,000+ features, that isn't a workable solution.

We could add the granuleUR as a field during ingest and then create a secondary index to allow querying on it. Alternatively, we could get creative with a mapping between features and the cycle/pass, but we would still need a way to query on a field other than feature id.
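To make the secondary-index idea concrete, here is a minimal sketch of what the lookup could look like once granuleUR is indexed. It assumes `table` behaves like a boto3 DynamoDB `Table` resource; the index name `GranuleURIndex` and the key attribute names `reach_id` and `range_start_time` are placeholders for illustration, not Hydrocron's actual schema.

```python
def keys_for_granule(table, granule_ur, index_name="GranuleURIndex",
                     key_attrs=("reach_id", "range_start_time")):
    """Return the primary keys of every item whose granuleUR matches.

    The index name and key attribute names are hypothetical placeholders.
    """
    # Alias every attribute name so reserved words cannot break the expressions.
    names = {"#g": "granuleUR"}
    names.update({f"#k{i}": attr for i, attr in enumerate(key_attrs)})
    params = {
        "IndexName": index_name,
        "KeyConditionExpression": "#g = :g",
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": {":g": granule_ur},
        # Only the table's key attributes are needed to delete items later.
        "ProjectionExpression": ", ".join(f"#k{i}" for i in range(len(key_attrs))),
    }
    keys = []
    while True:
        page = table.query(**params)
        keys.extend({a: item[a] for a in key_attrs} for item in page.get("Items", []))
        last = page.get("LastEvaluatedKey")
        if not last:
            return keys
        # Query results are paginated, so continue from where the last page ended.
        params["ExclusiveStartKey"] = last
```

The pagination loop matters here: with granules holding tens of thousands of features, a single query response will likely not return everything.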

@torimcd torimcd added the enhancement New feature or request label Apr 2, 2024
@nikki-t
Collaborator

nikki-t commented Apr 2, 2024

I like the idea of adding the granuleUR to the database and creating a secondary index, as I think that would allow efficient retrieval of the records that then need to be removed. It looks like you can batch delete items the same way you can batch write items.
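As a rough illustration of the batch delete, the sketch below uses the `batch_writer()` context manager of a boto3 `Table` resource, which buffers requests, sends them in batches, and resends unprocessed items. The function name `delete_keys` is made up for this example, and `keys` is assumed to be an iterable of full primary-key dicts for the table.

```python
def delete_keys(table, keys):
    """Queue a delete for every key in `keys` and return how many were queued.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    """
    count = 0
    # The batch writer flushes buffered requests automatically, including
    # whatever is still pending when the `with` block exits.
    with table.batch_writer() as batch:
        for key in keys:
            batch.delete_item(Key=key)
            count += 1
    return count
```

Each `Key` has to contain the table's own primary key (partition key plus sort key, if there is one), so an index on granuleUR would only be used to find the keys, not to delete by it.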

Adding the granuleUR may also support the work in issue #71 Track granule ingest.

@torimcd
Collaborator Author

torimcd commented Apr 3, 2024

#150 adds granuleUR as a field in the databases and sets up the secondary index to query on it. Could this ticket then be for implementing the batch writer with the delete option described in the link above? We could potentially set it up as another Lambda that performs the query and delete given the granuleUR as input?
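A possible shape for such a Lambda, as a sketch only: it takes `{"granuleUR": "..."}` as the event, queries the secondary index page by page, and queues deletes through the batch writer. The environment variable names (`TABLE_NAME`, `GRANULE_INDEX`, `KEY_ATTRS`), the default index name, and the default key attributes are all assumptions for illustration. The optional `table` argument is not part of the Lambda handler contract; it is there so the handler can be exercised with a stub.

```python
import os


def lambda_handler(event, context, table=None):
    """Delete every record tied to event["granuleUR"] and report the count."""
    granule_ur = event.get("granuleUR")
    if not granule_ur:
        raise ValueError("event must include a non-empty 'granuleUR'")
    if table is None:
        import boto3  # imported lazily so a stub table can be used in tests
        table = boto3.resource("dynamodb").Table(os.environ["TABLE_NAME"])
    # Hypothetical defaults; the real index and key names may differ.
    index_name = os.environ.get("GRANULE_INDEX", "GranuleURIndex")
    key_attrs = os.environ.get("KEY_ATTRS", "reach_id,range_start_time").split(",")

    params = {
        "IndexName": index_name,
        "KeyConditionExpression": "#g = :g",
        "ExpressionAttributeNames": {"#g": "granuleUR"},
        "ExpressionAttributeValues": {":g": granule_ur},
    }
    deleted = 0
    with table.batch_writer() as batch:
        while True:
            page = table.query(**params)
            for item in page.get("Items", []):
                # Deletes need the table's primary key, not the index key.
                batch.delete_item(Key={a: item[a] for a in key_attrs})
                deleted += 1
            if "LastEvaluatedKey" not in page:
                break
            params["ExclusiveStartKey"] = page["LastEvaluatedKey"]
    return {"granuleUR": granule_ur, "deleted": deleted}
```

For very large granules the Lambda timeout would need to be considered; that is not handled in this sketch.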

@nikki-t
Collaborator

nikki-t commented Apr 4, 2024

I like the idea of creating a separate "delete" Lambda that can either be used by the track ingest architecture, if we want to automate things, or be invoked manually via the AWS CLI or a Python script when we want to delete data from the database. It would be good to prioritize some of this work during our tag up.
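For the manual case, a Python script could invoke the Lambda synchronously along these lines. This is a sketch: `lambda_client` is assumed to be a boto3 Lambda client, and both the function name passed in and the `{"granuleUR": ...}` payload shape are hypothetical.

```python
import json


def invoke_delete(lambda_client, function_name, granule_ur):
    """Invoke the delete Lambda synchronously and return its decoded result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function to finish
        Payload=json.dumps({"granuleUR": granule_ur}).encode("utf-8"),
    )
    body = response["Payload"].read()
    # FunctionError is only present when the function itself raised an error.
    if response.get("FunctionError"):
        raise RuntimeError(f"delete Lambda failed: {body.decode('utf-8')}")
    return json.loads(body)
```

In a real script the client would come from `boto3.client("lambda")`, and the same function could be called by the track ingest workflow if the deletion is automated.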
