[New Feature]: Users need an easy way to catalog data that exists in S3 but failed to catalog #539

Open
ngachung opened this issue Feb 25, 2025 · 0 comments
Labels
enhancement New feature or request U-DS

Comments

@ngachung
Collaborator

When DS is overwhelmed by too many catalog requests, cataloging can fail because of ES timeouts, DB timeouts, or Lambda errors. The result is data that exists in S3 but is not cataloged and is therefore unavailable for search.

DS will work on fixing the root causes, e.g. setting up a DLQ, retrying failed DB writes, and processing a batch of messages with a single Lambda instead of invoking hundreds of concurrent Lambdas. Even with those fixes, providing users with an API endpoint to catalog data that is already in S3 is still useful.
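As a rough illustration of what such an endpoint might do behind the scenes, here is a minimal sketch in Python. It assumes the caller can supply the list of keys present in S3 and the list of keys already in the catalog. Every name in it (`find_uncataloged`, `batched`, `recatalog`, `submit_batch`) is hypothetical and not part of any existing DS API. It finds the keys that are missing from the catalog and resubmits them in small batches, so the recovery path does not itself flood the service.

```python
from typing import Callable, Iterable, Iterator, List


def find_uncataloged(s3_keys: Iterable[str], cataloged_keys: Iterable[str]) -> List[str]:
    """Return the S3 keys that have no catalog entry, sorted for stable output."""
    return sorted(set(s3_keys) - set(cataloged_keys))


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def recatalog(
    s3_keys: Iterable[str],
    cataloged_keys: Iterable[str],
    submit_batch: Callable[[List[str]], None],
    batch_size: int = 100,
) -> int:
    """Submit every uncataloged key in batches and return how many were submitted.

    `submit_batch` is a hypothetical callback standing in for whatever
    actually performs the catalog request (an HTTP call, a queue message, etc.).
    """
    missing = find_uncataloged(s3_keys, cataloged_keys)
    for batch in batched(missing, batch_size):
        submit_batch(batch)
    return len(missing)


if __name__ == "__main__":
    # In-memory stand-ins for an S3 listing and a catalog query.
    in_s3 = ["granules/a.nc", "granules/b.nc", "granules/c.nc"]
    in_catalog = ["granules/b.nc"]
    sent: List[List[str]] = []
    total = recatalog(in_s3, in_catalog, sent.append, batch_size=2)
    print(total, sent)
```

Batching keeps the number of catalog requests proportional to the number of batches rather than the number of files, which is in the same spirit as the single-Lambda batch processing mentioned above.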

@ngachung ngachung added the enhancement New feature or request label Feb 25, 2025
@ngachung ngachung added the U-DS label Apr 8, 2025