What happened?
While using LiteLLM, I encountered an issue where the cooldown mechanism isn't functioning correctly. The debug logs show an error being raised while the router attempts to add a model to the cooldown list; the model was being cooled down because of a timeout in the stream response.
Relevant log output
10:48:42 - LiteLLM Router:DEBUG: cooldown_handlers.py:182 - Attempting to add ef8a6269c22774c736474b5562f40781bd9b91f7432c2d804c38a7f205c36208 to cooldown list
10:48:42 - LiteLLM Proxy:DEBUG: proxy_server.py:2882 - An error occurred:
Debug this by setting `--debug`, e.g. `litellm --model gpt-3.5-turbo --debug`
10:48:42 - LiteLLM Router:DEBUG: cooldown_handlers.py:117 - percent fails for deployment = ef8a6269c22774c736474b5562f40781bd9b91f7432c2d804c38a7f205c36208, percent fails = 1.0, num successes = 0, num fails = 8
10:48:42 - LiteLLM Router:DEBUG: cooldown_handlers.py:342 - Unable to cast exception status to int. Defaulting to status=500.
10:48:42 - LiteLLM:DEBUG: utils.py:289 - Custom Logger Error - Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/integrations/custom_logger.py", line 287, in log_event
    callback_func(
        kwargs,  # kwargs to func
        end_time,
    )
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3523, in deployment_callback_on_failure
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3510, in deployment_callback_on_failure
    result = _set_cooldown_deployments(
        litellm_router_instance=self,
  File "/usr/lib/python3.13/site-packages/litellm/router_utils/cooldown_handlers.py", line 198, in _set_cooldown_deployments
    asyncio.create_task(
        router_cooldown_event_callback(
  File "/usr/lib/python3.13/asyncio/tasks.py", line 407, in create_task
    loop = events.get_running_loop()
RuntimeError: no running event loop
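For context, the `RuntimeError` above is reproducible outside LiteLLM: `asyncio.create_task()` requires a running event loop, and the failure callback appears to be running in a synchronous context here. A minimal sketch of the failure mode (the function names are illustrative, not LiteLLM's actual internals):

```python
import asyncio

async def cooldown_event_callback(deployment_id: str) -> None:
    # Stand-in for the async cooldown event callback (illustrative only).
    await asyncio.sleep(0)

def on_failure(deployment_id: str) -> None:
    # Synchronous failure path: create_task() requires a running event
    # loop, so calling it from plain sync code raises the error above.
    asyncio.create_task(cooldown_event_callback(deployment_id))

try:
    on_failure("ef8a6269...")
except RuntimeError as err:
    print(err)  # -> no running event loop
```

A defensive pattern would be to probe for a loop first (e.g. `asyncio.get_running_loop()` inside a try/except) and fall back to running or scheduling the coroutine another way, but I'll leave the right fix to the maintainers.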
The model was added to the cooldown list because of a timeout in the stream response:
10:48:43 - LiteLLM Proxy:ERROR: proxy_server.py:2872 - litellm.proxy.proxy_server.async_data_generator(): Exception occured -
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/usr/lib/python3.13/site-packages/httpx/_transports/default.py", line 254, in __aiter__
    async for part in self._httpcore_stream:
        yield part
  File "/usr/lib/python3.13/site-packages/httpcore/_async/connection_pool.py", line 407, in __aiter__
    raise exc from None
  File "/usr/lib/python3.13/site-packages/httpcore/_async/connection_pool.py", line 403, in __aiter__
    async for part in self._stream:
        yield part
  File "/usr/lib/python3.13/site-packages/httpcore/_async/http11.py", line 342, in __aiter__
    raise exc
  File "/usr/lib/python3.13/site-packages/httpcore/_async/http11.py", line 334, in __aiter__
    async for chunk in self._connection._receive_response_body(**kwargs):
        yield chunk
  File "/usr/lib/python3.13/site-packages/httpcore/_async/http11.py", line 203, in _receive_response_body
    event = await self._receive_event(timeout=timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/httpcore/_async/http11.py", line 217, in _receive_event
    data = await self._network_stream.read(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        self.READ_NUM_BYTES, timeout=timeout
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/httpcore/_backends/anyio.py", line 32, in read
    with map_exceptions(exc_map):
         ~~~~~~~~~~~~~~^^^^^^^^^
  File "/usr/lib/python3.13/contextlib.py", line 162, in __exit__
    self.gen.throw(value)
    ~~~~~~~~~~~~~~^^^^^^^
  File "/usr/lib/python3.13/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ReadTimeout
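As a side note, if the backend genuinely streams slowly, raising the request timeout may stop the cooldown path from being triggered at all. A hedged sketch using the SDK's `timeout` parameter (model name, prompt, and timeout value are placeholders; the chunk shape follows the OpenAI-style streaming format):

```python
import litellm

# Placeholders throughout: adjust the model and timeout for your deployment.
response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say hello."}],
    stream=True,
    timeout=600,  # seconds; a generous read timeout for slow streams
)

# Drain the stream; a ReadTimeout like the one above would surface here.
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
```

For the proxy, I believe the equivalent is a `timeout` under each deployment's `litellm_params` in the config, though that doesn't address the event-loop error itself.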
Are you an ML Ops Team?
No
What LiteLLM version are you on?
v1.58.2
Twitter / LinkedIn details
No response
I wanted to follow up on this issue to see if there has been any progress, or if there is any additional information I can provide to help with the investigation. The issue is disrupting our use of the cooldown functionality, and any updates or guidance would be greatly appreciated.