
[Bug]: Cooldown Not Working in LiteLLM #7779

Open
ZPerling opened this issue Jan 15, 2025 · 1 comment
Labels
bug Something isn't working

Comments

ZPerling commented Jan 15, 2025

What happened?

While using LiteLLM, I encountered an issue where the cooldown mechanism isn't functioning correctly. The debug logs show an error occurring when the router attempts to add a model to the cooldown list; the model was being placed in cooldown because of a timeout in the stream response.

Relevant log output

10:48:42 - LiteLLM Router:DEBUG: cooldown_handlers.py:182 - Attempting to add ef8a6269c22774c736474b5562f40781bd9b91f7432c2d804c38a7f205c36208 to cooldown list
10:48:42 - LiteLLM Proxy:DEBUG: proxy_server.py:2882 - An error occurred:

Debug this by setting `--debug`, e.g. `litellm --model gpt-3.5-turbo --debug`
10:48:42 - LiteLLM Router:DEBUG: cooldown_handlers.py:117 - percent fails for deployment = ef8a6269c22774c736474b5562f40781bd9b91f7432c2d804c38a7f205c36208, percent fails = 1.0, num successes = 0, num fails = 8
10:48:42 - LiteLLM Router:DEBUG: cooldown_handlers.py:342 - Unable to cast exception status to int. Defaulting to status=500.
10:48:42 - LiteLLM:DEBUG: utils.py:289 - Custom Logger Error - Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/integrations/custom_logger.py", line 287, in log_event
    callback_func(
        kwargs,  # kwargs to func
        end_time,
    )
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3523, in deployment_callback_on_failure
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3510, in deployment_callback_on_failure
    result = _set_cooldown_deployments(
        litellm_router_instance=self,
  File "/usr/lib/python3.13/site-packages/litellm/router_utils/cooldown_handlers.py", line 198, in _set_cooldown_deployments
    asyncio.create_task(
        router_cooldown_event_callback(
  File "/usr/lib/python3.13/asyncio/tasks.py", line 407, in create_task
    loop = events.get_running_loop()
RuntimeError: no running event loop
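
The traceback pins the failure down: asyncio.create_task() can only schedule a coroutine onto an already-running event loop, and here _set_cooldown_deployments is reached from a synchronous failure callback where no loop is running, so the cooldown event callback never gets scheduled. A minimal sketch of the failure mode and of a common defensive pattern (the function names below are illustrative stand-ins, not LiteLLM's actual code):

import asyncio

async def cooldown_event_callback() -> None:
    # Stand-in for litellm's router_cooldown_event_callback coroutine.
    print("deployment placed in cooldown")

def on_failure_broken() -> None:
    # Mirrors the failing call in _set_cooldown_deployments: create_task()
    # needs a *running* loop, so calling this from plain synchronous code
    # raises "RuntimeError: no running event loop".
    asyncio.create_task(cooldown_event_callback())

def on_failure_safe() -> None:
    # Defensive alternative: schedule on the running loop if one exists,
    # otherwise run the coroutine to completion in a fresh loop.
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        asyncio.run(cooldown_event_callback())
    else:
        loop.create_task(cooldown_event_callback())

if __name__ == "__main__":
    on_failure_safe()      # works: no loop running, falls back to asyncio.run()
    # on_failure_broken()  # would raise RuntimeError: no running event loop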

The model was added to the cooldown due to a timeout in the stream response:

10:48:43 - LiteLLM Proxy:ERROR: proxy_server.py:2872 - litellm.proxy.proxy_server.async_data_generator(): Exception occured -
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/usr/lib/python3.13/site-packages/httpx/_transports/default.py", line 254, in __aiter__
    async for part in self._httpcore_stream:
        yield part
  File "/usr/lib/python3.13/site-packages/httpcore/_async/connection_pool.py", line 407, in __aiter__
    raise exc from None
  File "/usr/lib/python3.13/site-packages/httpcore/_async/connection_pool.py", line 403, in __aiter__
    async for part in self._stream:
        yield part
  File "/usr/lib/python3.13/site-packages/httpcore/_async/http11.py", line 342, in __aiter__
    raise exc
  File "/usr/lib/python3.13/site-packages/httpcore/_async/http11.py", line 334, in __aiter__
    async for chunk in self._connection._receive_response_body(**kwargs):
        yield chunk
  File "/usr/lib/python3.13/site-packages/httpcore/_async/http11.py", line 203, in _receive_response_body
    event = await self._receive_event(timeout=timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/httpcore/_async/http11.py", line 217, in _receive_event
    data = await self._network_stream.read(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        self.READ_NUM_BYTES, timeout=timeout
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/httpcore/_backends/anyio.py", line 32, in read
    with map_exceptions(exc_map):
         ~~~~~~~~~~~~~~^^^^^^^^^
  File "/usr/lib/python3.13/contextlib.py", line 162, in __exit__
    self.gen.throw(value)
    ~~~~~~~~~~~~~~^^^^^^^
  File "/usr/lib/python3.13/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ReadTimeout
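
For completeness, the cooldown itself was triggered by this httpcore.ReadTimeout while reading the streamed response body. If the timeouts are too aggressive for your deployment, LiteLLM's Router exposes knobs for both the request timeouts and the cooldown thresholds; a hedged sketch (parameter names follow LiteLLM's Router docs, model and key values are placeholders):

from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {
                "model": "openai/gpt-3.5-turbo",
                "api_key": "sk-...",    # placeholder
                "timeout": 600,         # overall request timeout, in seconds
                "stream_timeout": 60,   # timeout applied to streamed responses
            },
        }
    ],
    allowed_fails=3,   # failures tolerated before a deployment is cooled down
    cooldown_time=60,  # seconds a deployment stays in cooldown
)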

Are you an ML Ops Team?

No

What LiteLLM version are you on?

v1.58.2

Twitter / LinkedIn details

No response

ZPerling added the bug label on Jan 15, 2025
ZPerling (Author) commented

I wanted to follow up on this issue to see if there has been any progress, or if there is any additional information I can provide to help with the investigation. This issue is disrupting our use of the cooldown functionality, and any updates or guidance would be greatly appreciated.
