handle check_provider_endpoint:True with multiple wildcard models via openai like provider #10358


Open · mkhludnev wants to merge 8 commits into main

Conversation

@mkhludnev (Contributor) commented Apr 27, 2025

Title

handle custom/* models via openai/* or litellm_proxy/*

Relevant issues

Fixes #10357

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

TBC

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement - see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit - https://docs.litellm.ai/docs/extras/contributing_code
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🐛 Bug Fix

Changes

This allows configuring LiteLLM as a relay/proxy for OpenAI-compatible providers via wildcard models, e.g.

model_list:
  - model_name: "foo/*"
    litellm_params:
      model: openai/*
      api_base: "https://thirpar.ty/openai_api"
      api_key: CAFEBABE
  - model_name: "bar/*"
    litellm_params:
      model: openai/*
      api_base: https://my.host/vllm/v1
      api_key: DEADBEEF

general_settings: 
  master_key: sk-6789
#.....
litellm_settings:
    check_provider_endpoint: true

This fix makes /models pull the actual model names from these providers and prepend them with the configured wildcard prefixes, so /chat/completions can then be called with "model": "foo/my_llama_etc".
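
For illustration, a minimal client-side sketch of the intended behaviour, assuming the proxy from the config above is running locally on port 4000 (the address and the model id foo/my_llama_etc are assumptions for the example, not part of the PR):

from openai import OpenAI

# Point the standard OpenAI SDK at the LiteLLM proxy (assumed local address and master key).
client = OpenAI(api_key="sk-6789", base_url="http://localhost:4000")

# /models now returns the providers' actual models, prefixed with the wildcard aliases,
# e.g. "foo/my_llama_etc" for models discovered behind the "foo/*" entry.
for m in client.models.list().data:
    print(m.id)

# /chat/completions then routes "foo/..." to the api_base configured for "foo/*".
resp = client.chat.completions.create(
    model="foo/my_llama_etc",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)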

@CLAassistant commented Apr 27, 2025

CLA assistant check: all committers have signed the CLA.

@mkhludnev mkhludnev marked this pull request as ready for review April 27, 2025 11:30
@mkhludnev (Contributor Author)

Please suggest a test approach for this. I used to test such cases with vcrpy, but I don't see that library used here.

@mkhludnev mkhludnev changed the title handle custom/* models via openai/* handle multiple wildcard models via openai like provider Apr 27, 2025
@krrishdholakia (Contributor)

@mkhludnev please add corresponding unit tests inside tests/litellm - https://docs.litellm.ai/docs/extras/contributing_code#2-adding-testing-to-your-pr

@mkhludnev mkhludnev changed the title handle multiple wildcard models via openai like provider handle check_provider_endpoint:True with multiple wildcard models via openai like provider Apr 27, 2025
@mkhludnev (Contributor Author)

It turns out only litellm_params.model: "openai/*" is supported, but litellm_proxy is not, because it prepends models with its own name here: https://github.com/BerriAI/litellm/blob/f08a4e3c067d9f5a04e67ccefb97533652d1cea0/litellm/llms/litellm_proxy/chat/transformation.py#L50. That seems like a slight disagreement with #7538.

@mkhludnev (Contributor Author)

> @mkhludnev please add corresponding unit tests inside tests/litellm - https://docs.litellm.ai/docs/extras/contributing_code#2-adding-testing-to-your-pr

done

@mkhludnev (Contributor Author)

[screenshot: new test passing locally]

mkhludnev referenced this pull request Apr 28, 2025
…models based on key (#7538)

* test(test_utils.py): initial test for valid models

Addresses #7525

* fix: test

* feat(fireworks_ai/transformation.py): support retrieving valid models from fireworks ai endpoint

* refactor(fireworks_ai/): support checking model info on `/v1/models` route

* docs(set_keys.md): update docs to clarify check llm provider api usage

* fix(watsonx/common_utils.py): support 'WATSONX_ZENAPIKEY' for iam auth

* fix(watsonx): read in watsonx token from env var

* fix: fix linting errors

* fix(utils.py): fix provider config check

* style: cleanup unused imports
@krrishdholakia (Contributor)

Hi @mkhludnev, trying to understand your PR.

> This fix makes /models pull the actual model names from these providers and prepend them with the configured wildcard prefixes.

Are you just trying to return the public wildcard route prefix in /models?

@@ -167,12 +171,12 @@ def get_known_models_from_wildcard(
         return []
     # get all known provider models
     wildcard_models = get_provider_models(
-        provider=provider, litellm_params=litellm_params
+        provider=provider, model=model, litellm_params=litellm_params

Contributor:

why would you pass in the model to get provider models?

Contributor Author:

ok. got it

@@ -347,6 +347,15 @@ def validate_environment(

         return headers

+    @staticmethod
+    def strip_v1(api_base:str):

Contributor:

why is this necessary?

Contributor Author:

Some inference servers serve at a .../v1 URL. So, to make /chat/completions work, I need to specify api_base: .../v1, like:

model_list:
  - model_name: "foo/*"
    litellm_params:
      model: openai/*
      api_base: https://my.host/proxy/v1
      api_key: os.environ/AI_PROXY_KEY

general_settings: 
  master_key: sk-foo

litellm_settings:
    check_provider_endpoint: true

In this case, get_models() fails because it requests
https://my.host/proxy/v1/v1/models
see here:

url=f"{api_base}/v1/models",

Contributor Author:

Here's the test asserting that, regardless of a /v1 tail in api_base, it requests /v1/models as expected:

@pytest.mark.parametrize("api_base, v1_models_url", [("http://foo.bar/baz", "http://foo.bar/baz/v1/models"),
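
For readers following along, a minimal self-contained sketch of the idea under test. The helper name strip_v1 comes from the diff above, but its one-line body and the second parameter pair are assumptions, not necessarily the PR's exact code:

import pytest

def strip_v1(api_base: str) -> str:
    # Drop a trailing "/v1" (and trailing slash) so that appending "/v1/models"
    # later never produces ".../v1/v1/models".
    return api_base.rstrip("/").removesuffix("/v1")

@pytest.mark.parametrize(
    "api_base, v1_models_url",
    [
        ("http://foo.bar/baz", "http://foo.bar/baz/v1/models"),     # no trailing /v1
        ("http://foo.bar/baz/v1", "http://foo.bar/baz/v1/models"),  # trailing /v1 stripped
    ],
)
def test_models_url_has_single_v1(api_base, v1_models_url):
    assert f"{strip_v1(api_base)}/v1/models" == v1_models_url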

@mkhludnev (Contributor Author)

@krrishdholakia may I ask if this PR is considered for merging?

@zkck commented May 7, 2025

Hi everyone,

Thanks for the PR, we need this fix as well. Some points:

api_base with /v1 or not?

It seems we don't handle api_base very consistently, in terms of when a /v1 is needed at the end and when not. In this PR, we tweak get_models to support api_bases with or without /v1 at the end. IMO we should define that api_base, when working with an OpenAI-like provider, should be an OpenAI v1-compatible endpoint, and in the LiteLLM code we work with suffixes like /models and /chat/completions (without the /v1).

This would mean that this fix should instead tweak the suffix that gets appended, replacing /v1/models with /models.
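
A tiny sketch of that convention (build_url is a hypothetical helper, not LiteLLM's current code): api_base is taken to already be a v1-compatible endpoint, and route suffixes never repeat the /v1.

def build_url(api_base: str, route: str) -> str:
    # api_base is assumed to be an OpenAI v1-compatible endpoint, e.g. "https://my.host/vllm/v1"
    return f"{api_base.rstrip('/')}/{route.lstrip('/')}"

assert build_url("https://my.host/vllm/v1", "models") == "https://my.host/vllm/v1/models"
assert build_url("https://my.host/vllm/v1", "chat/completions") == "https://my.host/vllm/v1/chat/completions"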

Get provider from litellm_params.model instead of model_name

In litellm, it seems that in general the provider is determined based on the prefix in litellm_params.model. But for model discovery, it checks if the model_name starts with a supported provider. I don't really understand why; is there a use case? Because ideally, I would like to be able to write the following config, without needing a prefix in model_name:

litellm_settings:
  check_provider_endpoint: true
model_list:
  - litellm_params:
      api_base: https://my-url.com/v1
      api_key: secret
      model: openai/*
    model_name: "*"

As I understand this PR, it extends the existing functionality by having a fallback in case a prefix in model_name is not found in the existing providers. But IMO, as explained above, we should not be looking at model_name for determining the provider, as it is inconsistent.

What do you guys think?

@mkhludnev (Contributor Author)

Hi @zkck

> IMO we should define that api_base, when working with an OpenAI-like provider, should be an OpenAI v1-compatible endpoint, and in the LiteLLM code we work with suffixes like /models and /chat/completions (without the /v1).

Generally I agree, but I worry about backward compatibility. What about colleagues with existing configs without /v1? That's why I decided to resort to this ugly strip_v1() hack.

> In litellm, it seems that in general the provider is determined based on the prefix in litellm_params.model. But for model discovery, it checks if the model_name starts with a supported provider.

I'm blind guessing: it seems like legacy to me. It looks like initially just model_name was used, and then litellm_params.model was introduced for more control.

litellm_settings:
  check_provider_endpoint: true
model_list:
  - litellm_params:
      api_base: https://my-url.com/v1
      api_key: secret
      model: openai/*
    model_name: "*"

Regarding this config: how would you handle two OpenAI endpoints and distinguish between them?

> As I understand this PR, it extends the existing functionality by having a fallback in case a prefix in model_name is not found in the existing providers.

Right. It's fallback logic that minimizes changes and supports existing configs as much as possible. Pardon for continuing this awkwardness.
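
A rough sketch of that fallback, with illustrative names only (KNOWN_PROVIDERS and resolve_provider are hypothetical, not the actual LiteLLM functions):

KNOWN_PROVIDERS = {"openai", "anthropic", "fireworks_ai"}  # illustrative subset

def resolve_provider(model_name: str, litellm_model: str) -> str | None:
    # Existing behaviour: take the provider from the model_name prefix if it is known.
    name_prefix = model_name.split("/", 1)[0]
    if name_prefix in KNOWN_PROVIDERS:
        return name_prefix
    # Fallback: "foo/*" is not a provider, so use the prefix of litellm_params.model instead.
    model_prefix = litellm_model.split("/", 1)[0]
    return model_prefix if model_prefix in KNOWN_PROVIDERS else None

print(resolve_provider("openai/*", "openai/*"))  # existing path -> "openai"
print(resolve_provider("foo/*", "openai/*"))     # fallback path -> "openai"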

@mkhludnev (Contributor Author)

Hi,
Gently asking for feedback; much appreciated.


Successfully merging this pull request may close these issues.

Handling mycustom/* by {provider}/*
4 participants