Dramatiq retry sends more retries instead of max 3 that are configured #605
Comments
@bvidovic1, hi, have you found a solution to this issue?
I rewrote the retry handler logic, added custom intervals and changed a bit how/when the retry is triggered. If you need it, I can share a code snippet when I am at my desk.
@bvidovic1, that would be very kind of you, thank you!
This is my retry method and how I use it in the actor. It is customized to check a certain job ID and some other stuff, but you will get the idea. Heads up that the version I am using is a custom retry:

```python
from typing import Dict

import dramatiq
import dramatiq.middleware

# logger, save_status_to_db, JobProcessingStatus and ServiceRequestHandlingError
# are project-specific and defined elsewhere.


def handle_retry(
    job_id: str,
    e: Exception,
    backoff_values: Dict[int, int],
    error_msg: str,
) -> None:
    """
    Handle exceptions raised during a request to the ML service by managing retries and delays.

    Args:
        job_id (str): The ID of the job being processed.
        e (Exception): The exception raised during the request.
        backoff_values (Dict[int, int]): A dictionary mapping retry counts to corresponding backoff delays.
        error_msg (str): A descriptive error message associated with the exception.

    Raises:
        ServiceRequestHandlingError: If the number of retries so far is not in the predefined backoff values.
        dramatiq.errors.Retry: If the exception should trigger a retry with a specified delay.

    Notes:
        This method logs information about the message, retries, and backoff.

    Example:
        try:
            ...  # code that may raise requests.exceptions.RequestException
        except requests.exceptions.RequestException as e:
            backoff_values = {
                0: 5_000,   # backoff before first retry - 5 seconds
                1: 10_000,  # backoff before second retry - 10 seconds
                2: 60_000,  # backoff before third retry - 60 seconds
            }
            handle_retry(job_id, e, backoff_values, "Error in ML service request")
    """
    msg: dramatiq.Message = dramatiq.middleware.CurrentMessage.get_current_message()
    logger.debug(f"Message info:\n{msg.options}")
    retries_so_far: int = msg.options.get("retries", 0)
    logger.debug(f"Retries so far: {retries_so_far}")

    if retries_so_far not in backoff_values:
        logger.error(f"Exceeded maximum number of retries: {retries_so_far}")
        save_status_to_db(job_id, JobProcessingStatus.ERROR, f"{error_msg}")
        raise ServiceRequestHandlingError(error_msg) from e

    # Log the retry and backoff delay, then ask dramatiq to retry with that delay.
    backoff_delay = backoff_values[retries_so_far]
    logger.warning(f"Retrying after exception: {str(e)} | Backoff delay: {backoff_delay}ms")
    raise dramatiq.errors.Retry(str(e), delay=backoff_delay)
```

How it is used in the actor:

```python
"""
Important:
    * `broker_priority` refers to global message prioritisation by RMQ, where
      priority is in range [0, 255], `255` being the highest;
      the local actor `priority` parameter refers to worker-local prioritisation of *prefetched*
      messages, where priority is in range [255, 0], `0` being the highest (opposite of `broker_priority`)
    * the `throws` argument of the actor means that a retry will not be started if an exception
      from the given tuple is thrown
"""

@dramatiq.actor(
    **config.actors.handle_actor.params,
    time_limit=1000 * TIMEOUT,
    max_retries=MAX_RETRIES,
    throws=(
        ConfigurationError,
        DocumentsMismatchInResponse,
        InvalidProcessingRequestError,
        InvalidMessagePayloadError,
        ServiceRequestHandlingError,
        ResponseValidationError,
        NoDocumentsInBackofficeError,
    ),
)
def handle_actor(message: Mapping[str, Any]):
    ...
```
So, I do not use the default retry from dramatiq but this one, and for some exceptions I do not want to do the retry at all (those stated in `throws`).
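Note that `handle_retry` depends on the `CurrentMessage` middleware being registered on the broker; a minimal sketch of that setup (the broker type and URL are assumptions, not taken from this thread):

```python
import dramatiq
from dramatiq.brokers.rabbitmq import RabbitmqBroker
from dramatiq.middleware import CurrentMessage

# Without this middleware, CurrentMessage.get_current_message() returns None
# inside the actor, so the retry bookkeeping above would not work.
broker = RabbitmqBroker(url="amqp://guest:guest@localhost:5672")  # placeholder URL
broker.add_middleware(CurrentMessage())
dramatiq.set_broker(broker)
```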
What OS are you using?
Running on python docker:
python:3.8-slim
What version of Dramatiq are you using?
v1.15.0
What did you do?
I have dramatiq set up to take a message from the queue and send it to the relevant endpoint that performs some ML processing.
I start dramatiq with 1 process and 4 threads.
Dramatiq actor is set up like this:
Implementation of retry handler:
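The snippets that belonged here are not reproduced in the thread; purely as an illustration of the setup being described (every name and value below is an assumption, not the reporter's actual code), an actor relying on the built-in retries might look like:

```python
import dramatiq

MAX_RETRIES = 3  # assumed constant, matching the "max 3" from the title

@dramatiq.actor(
    max_retries=MAX_RETRIES,
    min_backoff=15_000,   # assumed value, in milliseconds
    max_backoff=300_000,  # assumed value, in milliseconds
)
def handle_actor(message):
    # Forward the message to the ML endpoint; any raised exception
    # lets the Retries middleware reschedule the message, up to max_retries times.
    ...
```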
What did you expect would happen?
I expect the retry to be tried 3 times and, if unsuccessful, it should exit.
What happened?
Firstly, what happens is I receive a message that 2 retries will be tried in a specific amount of milliseconds based on the `min_backoff` and `max_backoff` variables. Why 2 right away? Why not 1 by 1, sequentially?

Secondly, the request retry happens more times (6, 7, 8 times) even though the max is 3. This is causing a problem for my service, because it sends multiple requests towards the endpoint and clogs the workers there with the same request. Is this happening because threads are used? Should I add some kind of threading lock mechanism?
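For context on how the delays relate to `min_backoff` and `max_backoff`: dramatiq's Retries middleware computes a per-attempt delay that grows roughly exponentially between the two bounds and adds jitter. The helper below is only a rough illustration of that behaviour, not dramatiq's actual implementation:

```python
import random

def approx_backoff(retries_so_far: int, min_backoff: int, max_backoff: int) -> int:
    """Rough illustration: exponential backoff with jitter, capped at max_backoff."""
    delay = min(min_backoff * 2 ** retries_so_far, max_backoff)
    # Jitter: pick a random point in the upper half of the window so
    # concurrent retries do not all fire at the same instant.
    return delay // 2 + random.randint(0, delay // 2)

# Example with min_backoff=15s and max_backoff=5min (values in milliseconds)
for attempt in range(3):
    print(f"attempt {attempt}: delay ~ {approx_backoff(attempt, 15_000, 300_000)} ms")
```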
Thanks in advance for any guidance.