how to control connection retry attempts #1247
When trying to connect to an input service (say Kafka) that isn't immediately reachable, I find that Benthos retries the connection every 1-2 seconds with no sign of stopping. It's possible that the service has hit a fatal error and is permanently unavailable. Ideally I want to be able to control both the time between retry attempts and the number of retry attempts. I have read the documentation in many places but cannot locate how to do this. Is there a way I am missing?
Replies: 1 comment 1 reply
Indeed, we don't seem to have anything in the configuration to control this and, for Kafka specifically, the timeout is hardcoded here to one second. If you'd like to be able to configure this, we can expose it via a configuration parameter if you open an issue.
Regarding the number of connection attempts, I believe Benthos deliberately tries to reconnect forever unless it's explicitly told to shut down. While there are some input-specific errors that indicate a permanent failure, it's probably a bad idea to assume that Benthos will always behave properly and shut down. A coding error could, for example, end up preventing it from shutting down. Also, when running it as a microservice in a cluster, you likely want to be able to keep scraping metrics from each and every instance and take some action based on those (restart, alert, etc). If it shuts down, that could make failures harder to detect, and it also requires thinking about exit codes.
This is just my novice opinion, though. I'll defer to @Jeffail to chime in when he's back.