Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sending captured events should honor sending fail. #932

Open
1 of 3 tasks
stima opened this issue Jan 3, 2024 · 5 comments
Open
1 of 3 tasks

Sending captured events should honor sending fail. #932

stima opened this issue Jan 3, 2024 · 5 comments

Comments

@stima
Copy link
Contributor

stima commented Jan 3, 2024

Description

Sending captured events should honor sending fail.

When does the problem happen

  • During build
  • During run-time
  • When capturing a hard crash

Environment

  • OS: Unrelated
  • Compiler: Unrelated
  • CMake version and config: Unrelated

Steps To Reproduce

  1. Generate crash under Sentry SDK. That should trigger unsent event inside sentry database folder
  2. Turn off networking
  3. Run Sentry SDK again.

Expected: The event is not send and event metadata is not removed, appropriate log statement raised.
Actual: Event is not send, event metadata is removed.

Log output

@supervacuus
Copy link
Collaborator

There is currently no transport transactionality of events. If an event ends up in the send queue and the transport cannot send for whatever reason, the event is lost. If your use case expects regular network unavailability and requires delivery guarantees, the transport implementation can be customized.

Send errors should be visible in the logs.

@stima
Copy link
Contributor Author

stima commented Jan 8, 2024

Thanks I would consider to look to custom transport impelementation.

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 2 Jan 8, 2024
@supervacuus
Copy link
Collaborator

Btw since you use crashpad, implementing a custom transport will not help you with adding delivery guarantees for crashes since the crashpad_handler uses its own transport.

@daniel-falk
Copy link

Do I understand it correct that there is no way to cache messages if the network is currently not reachable? We have issues that our edge applications sometimes fails to handle loss of internet connectivity in a good way. This is exactly when we need the Sentry messages, but we don't get them since we don't have internet. I see that there is some dumping of the queue going on, but it is only done when we call the sentry_close method. Our problems sometimes leads to the system restarting, then we can't see any errors at all in Sentry.

What would be the options to handle this? Can we make a solution using callbacks, or do we need to modify the source code to add our own transportation backend? The latter seems hard to maintain..

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 Oct 7, 2024
@supervacuus
Copy link
Collaborator

Do I understand it correct that there is no way to cache messages if the network is currently not reachable?

Yes.

We have issues that our edge applications sometimes fails to handle loss of internet connectivity in a good way. This is exactly when we need the Sentry messages, but we don't get them since we don't have internet.

I see. The current transport implementation doesn't really help with this use case.

. I see that there is some dumping of the queue going on, but it is only done when we call the sentry_close method. Our problems sometimes leads to the system restarting, then we can't see any errors at all in Sentry.

To be clear:

  • the transport currently does not have retry mechanisms beyond rate-limiting responses from the server
  • even the rate-limiting delays are not persistent, that means once an envelope was dequeued for sending it would be gone if the application shuts down while attempting send the envelope
  • all items in the queue will be persisted to disk for the next run either when closing or when the process crashed and our handler could run
  • the transport will attempt to send these items right after sentry_init started it, rinse and repeat.

What currently does not exist:

  • any persistence of envelopes or retry-/transmission-states once the transport thread took over

What would be the options to handle this? Can we make a solution using callbacks, or do we need to modify the source code to add our own transportation backend? The latter seems hard to maintain

Ideally sentry would provide this, but it is currently not a priority.

If you look into the transport implementations then you will see that these implementations use the same callback interface that is also available to users of the Native SDK:

/**
* This represents an interface for user-defined transports.
*
* Transports are responsible for sending envelopes to sentry and are the last
* step in the event pipeline.
*
* Envelopes will be submitted to the transport in a _fire and forget_ fashion,
* and the transport must send those envelopes _in order_.
*
* A transport has the following hooks, all of which
* take the user provided `state` as last parameter. The transport state needs
* to be set with `sentry_transport_set_state` and typically holds handles and
* other information that can be reused across requests.
*
* * `send_func`: This function will take ownership of an envelope, and is
* responsible for freeing it via `sentry_envelope_free`.
* * `startup_func`: This hook will be called by sentry inside of `sentry_init`
* and instructs the transport to initialize itself. Failures will bubble up
* to `sentry_init`.
* * `flush_func`: Instructs the transport to flush its queue.
* This hook receives a millisecond-resolution `timeout` parameter and should
* return `0` if the transport queue is flushed within the timeout.
* * `shutdown_func`: Instructs the transport to flush its queue and shut down.
* This hook receives a millisecond-resolution `timeout` parameter and should
* return `0` if the transport is flushed and shut down successfully.
* In case of a non-zero return value, sentry will log an error, but continue
* with freeing the transport.
* * `free_func`: Frees the transports `state`. This hook might be called even
* though `shutdown_func` returned a failure code previously.
*
* The transport interface might be extended in the future with hooks to flush
* its internal queue without shutting down, and to dump its internal queue to
* disk in case of a hard crash.
*/
struct sentry_transport_s;
typedef struct sentry_transport_s sentry_transport_t;
/**
* Creates a new transport with an initial `send_func`.
*/
SENTRY_API sentry_transport_t *sentry_transport_new(
void (*send_func)(sentry_envelope_t *envelope, void *state));
/**
* Sets the transport `state`.
*
* If the state is owned by the transport and needs to be freed, use
* `sentry_transport_set_free_func` to set an appropriate hook.
*/
SENTRY_API void sentry_transport_set_state(
sentry_transport_t *transport, void *state);
/**
* Sets the transport hook to free the transport `state`.
*/
SENTRY_API void sentry_transport_set_free_func(
sentry_transport_t *transport, void (*free_func)(void *state));
/**
* Sets the transport startup hook.
*
* This hook is called from within `sentry_init` and will get a reference to the
* options which can be used to initialize a transports internal state.
* It should return `0` on success. A failure will bubble up to `sentry_init`.
*/
SENTRY_API void sentry_transport_set_startup_func(sentry_transport_t *transport,
int (*startup_func)(const sentry_options_t *options, void *state));
/**
* Sets the transport flush hook.
*
* This hook will receive a millisecond-resolution timeout.
* It should return `0` if all the pending envelopes are
* sent within the timeout, or `1` if the timeout is hit.
*/
SENTRY_API void sentry_transport_set_flush_func(sentry_transport_t *transport,
int (*flush_func)(uint64_t timeout, void *state));
/**
* Sets the transport shutdown hook.
*
* This hook will receive a millisecond-resolution timeout.
* It should return `0` on success in case all the pending envelopes have been
* sent within the timeout, or `1` if the timeout was hit.
*/
SENTRY_API void sentry_transport_set_shutdown_func(
sentry_transport_t *transport,
int (*shutdown_func)(uint64_t timeout, void *state));
/**
* Generic way to free a transport.
*/
SENTRY_API void sentry_transport_free(sentry_transport_t *transport);
/**
* Create a new function transport.
*
* It is a convenience function which works with a borrowed `data`, and will
* automatically free the envelope, so the user provided function does not need
* to do that.
*
* This function is *deprecated* and will be removed in a future version.
* It is here for backwards compatibility. Users should migrate to the
* `sentry_transport_new` API.
*/
SENTRY_API sentry_transport_t *sentry_new_function_transport(
void (*func)(const sentry_envelope_t *envelope, void *data), void *data);

sentry_transport_t *transport
= sentry_transport_new(sentry__curl_transport_send_envelope);
if (!transport) {
sentry__bgworker_decref(bgworker);
return NULL;
}
sentry_transport_set_state(transport, bgworker);
sentry_transport_set_free_func(
transport, (void (*)(void *))sentry__bgworker_decref);
sentry_transport_set_startup_func(transport, sentry__curl_transport_start);
sentry_transport_set_flush_func(transport, sentry__curl_transport_flush);
sentry_transport_set_shutdown_func(
transport, sentry__curl_transport_shutdown);
sentry__transport_set_dump_func(transport, sentry__curl_dump_queue);

As you can see, beyond the concrete implementation, this is the same very stable and minimal interface where any changes inside the SDK shouldn't affect your transport implementation down the road. You're also not bound to any technology choice although we recommend to take inspiration and adapt to your needs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Status: No status
Status: Needs Discussion
Development

No branches or pull requests

4 participants