Error Handling and dead letter queues for targets #133

Open
MeltyBot opened this issue May 26, 2021 · 7 comments

Comments

@MeltyBot
Contributor

Migrated from GitLab: https://gitlab.com/meltano/sdk/-/issues/134

Originally created by @vischous on 2021-05-26 17:40:34


Following up on our office hours today. Not sure whether we want this to be target-only or not; your call, @aaronsteers.

Error handling, especially with SaaS-style targets, gets pretty interesting. Here are errors you'll hit at some point (the ones I can think of off the top of my head; there are tons more, everything you can imagine when you run this stuff at scale):

Connection issues

  1. For HTTP requests: 500 responses and timeouts in every way you can imagine (hopefully your libraries have sane defaults for connection timeouts and read timeouts; targets will need to change these at times), "Server Busy", "Internal Error", etc.
  2. Data issues: for HTTP you'll get response codes all over the place depending on the API, but generally something like 406, 403, 404, 400, etc., with messages such as "User already exists", "Name is invalid (over char limit)", "Unknown Error occurred", "Cannot disable user due to them having xyz permissions"

Each of these errors needs to be handled slightly differently. For some, a simple retry with exponential backoff fixes the problem.
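To make the retry case concrete, here is a minimal sketch of retry with exponential backoff. The names `TransientError` and `retry_with_backoff` are hypothetical, made up for illustration; nothing here is an existing SDK API.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (timeouts, "Server Busy")."""


def retry_with_backoff(func, max_attempts=5, base_delay=0.5, max_delay=30.0,
                       sleep=time.sleep):
    """Call func() until it succeeds or max_attempts calls have failed."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each time (base, 2*base, 4*base, ...), capped at max_delay.
            sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
```

Only `TransientError` is retried here; anything else (for example a data error) propagates immediately, since retrying it would not help.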

Data issues are something you can't get away from, and for a lot of SaaS APIs (lots are not HTTP-based, by the way; see Active Directory and more) you'll get data errors that are masked as things like 500 errors.
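One way a target might separate the two kinds of failure, including data errors hiding behind a 500, is sketched below. The status set and the hint phrases are assumptions taken from the example messages above, and `classify_error` is a hypothetical helper; a real target would need rules specific to its API.

```python
# Assumption: statuses that usually mean "this record is bad", per the examples above.
DATA_ERROR_STATUSES = {400, 403, 404, 406}
# Assumption: phrases that suggest a data problem even when the status is 5xx.
DATA_ERROR_HINTS = ("already exists", "is invalid")


def classify_error(status, body=""):
    """Return "data", "transient", or "unknown" for a failed HTTP response."""
    if status in DATA_ERROR_STATUSES:
        return "data"
    if status >= 500:
        # Some APIs report bad records as server errors, so check the body first.
        if any(hint in body.lower() for hint in DATA_ERROR_HINTS):
            return "data"
        return "transient"
    return "unknown"
```

Errors classified as "transient" are candidates for retry; "data" errors are not, and are the ones a threshold or dead letter queue would need to deal with.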

Functionality that's probably needed:

  1. Error handling strategy for "hard" or "soft" errors. One record failing out of 1000 should still output something to stderr / stdout, and the target process should return an exit code other than 0, but it's nowhere near as critical as all 1000 records failing, which would need an exit code of 1.
  2. Configuration that lets users of targets change the thresholds. Everyone has different use cases. Thresholds could be percentage-based or a fixed number of rows, e.g. >10 failed rows is a "hard" failure.
  3. Retry logic
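The first two items could be sketched roughly as follows. The setting names (`max_failed_records`, `max_failed_ratio`) and the choice of 2 for a "soft" failure are assumptions for illustration only; the issue only says a soft failure should be non-zero and a total failure should be 1.

```python
from dataclasses import dataclass
from typing import Optional

EXIT_OK = 0
EXIT_HARD_FAILURE = 1
EXIT_SOFT_FAILURE = 2  # assumption: any non-zero code other than 1 would do


@dataclass
class FailureThresholds:
    """Hypothetical user-facing settings; None means "no limit configured"."""
    max_failed_records: Optional[int] = None  # e.g. 10: more than 10 failed rows is "hard"
    max_failed_ratio: Optional[float] = None  # e.g. 0.05: more than 5% failed is "hard"


def classify_exit_code(total, failed, thresholds):
    """Map record counts to an exit code using the configured thresholds."""
    if failed == 0:
        return EXIT_OK
    over_count = (thresholds.max_failed_records is not None
                  and failed > thresholds.max_failed_records)
    over_ratio = (thresholds.max_failed_ratio is not None and total > 0
                  and failed / total > thresholds.max_failed_ratio)
    # Every record failing is always "hard", whatever the thresholds say.
    if failed >= total or over_count or over_ratio:
        return EXIT_HARD_FAILURE
    return EXIT_SOFT_FAILURE
```

With `FailureThresholds(max_failed_records=10)`, one failed record out of 1000 maps to the soft code, while 11 failures or 1000 failures map to 1.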

Some of this (maybe all?) could be handled by a dead letter queue of some sort.
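As a rough illustration of the dead letter queue idea, the sketch below diverts failed records to a JSON Lines file instead of aborting the run. `DeadLetterQueue`, `load_records`, and the file format are all hypothetical, and `send` stands in for whatever call writes one record to the destination.

```python
import json


class DeadLetterQueue:
    """Hypothetical file-backed queue: one JSON object per failed record."""

    def __init__(self, path):
        self.path = path
        self.count = 0

    def put(self, record, error):
        # Append so earlier failures from the same run are preserved.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"record": record, "error": str(error)}) + "\n")
        self.count += 1


def load_records(records, send, dlq):
    """Send each record; divert failures to the DLQ. Returns (total, failed)."""
    total = 0
    for record in records:
        total += 1
        try:
            send(record)
        except Exception as exc:  # broad on purpose in this sketch; narrow it in real code
            dlq.put(record, exc)
    return total, dlq.count
```

The `(total, failed)` pair returned at the end is what a threshold check would consume to decide between a "soft" and a "hard" failure, and the file gives the user something to inspect and replay.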

Use cases that I know about today:

@MeltyBot
Contributor Author

@labelsync-manager labelsync-manager bot added the kind/Feature New feature or request label Jun 23, 2022
@aaronsteers aaronsteers changed the title Error Handling Error Handling and dead letter queues for targets Nov 29, 2022
@louis-vines

louis-vines commented Mar 24, 2023

What is the status of this feature? Seems like a pretty useful use case.

@WillDaSilva
Member

CC @visch @tayloramurphy @aaronsteers

@tayloramurphy
Collaborator

@louis-vines a first pass for us would likely be this issue:

With better exit codes for SDK-based connectors we can start to handle each error better overall. Likely we need to break this issue up into specific proposals and make progress on those. cc @aaronsteers

@stale

stale bot commented Jul 23, 2023

This has been marked as stale because it is unassigned, and has not had recent activity. It will be closed after 21 days if no further activity occurs. If this should never go stale, please add the evergreen label, or request that it be added.

@stale stale bot added the stale label Jul 23, 2023
@tayloramurphy
Collaborator

Still relevant

@stale stale bot removed the stale label Jul 24, 2023

stale bot commented Jul 23, 2024

This has been marked as stale because it is unassigned, and has not had recent activity. It will be closed after 21 days if no further activity occurs. If this should never go stale, please add the evergreen label, or request that it be added.

@stale stale bot added the stale label Jul 23, 2024