[Model] Support Gemma 3 backend for Ultravox #15728

farzadab · 2025-03-28T22:06:53Z

This PR enables using Ultravox with a Gemma 3 backend.
It also adds a new <|audio|> token to the tokenizer itself instead of relying on the Llama-specific <|reserved_special_token_0|> token.

Signed-off-by: Farzad Abdolhosseini <[email protected]>

github-actions · 2025-03-28T22:07:02Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

vllm/model_executor/models/ultravox.py

Signed-off-by: Farzad Abdolhosseini <[email protected]>

farzadab · 2025-04-02T20:04:21Z

@ywang96 @DarkLight1337 PTAL.
I have not made the model public yet. If required, I can open source the 1B model for testing purposes.

DarkLight1337

Looks reasonable to me. The tests should be able to catch if this breaks older models.

DarkLight1337 · 2025-04-03T09:29:07Z

Can you merge from main to fix docker build?

DarkLight1337 · 2025-04-04T02:06:11Z

Can you fix the failing test?

farzadab · 2025-04-04T23:14:34Z

It turns out this approach does not work when using V1 on Llama-based backends.

I never noticed this because the embedding size of Gemma-3-4B is larger than its actual text vocab_size, so the extra token id still falls inside the embedding table.
With V0, I can capture the extra dummy audio_token_id and replace it with 0 before computing inputs_embeds.

I can't do that in V1, however, which leads to the error: Token id 128256 is out of vocabulary.
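For readers following along, the failure mode described above can be sketched in plain Python. This is only an illustration under stated assumptions, not vLLM's actual code: `validate_token_ids` and `mask_audio_placeholders` are hypothetical helpers, and the vocabulary size of 128256 is an assumption chosen to match the id in the error message. The idea is that a placeholder id appended past the end of the vocabulary fails a strict range check unless it is swapped for a valid id (0 here) before the embedding lookup.

```python
from typing import List

# Assumption: a text vocabulary of 128256 entries (valid ids 0..128255),
# chosen so the appended placeholder gets the id seen in the error message.
VOCAB_SIZE = 128256
AUDIO_TOKEN_ID = VOCAB_SIZE  # first id past the end of the vocabulary


def validate_token_ids(token_ids: List[int], vocab_size: int) -> None:
    """Hypothetical strict check: reject any id outside [0, vocab_size)."""
    for tid in token_ids:
        if not 0 <= tid < vocab_size:
            raise ValueError(f"Token id {tid} is out of vocabulary")


def mask_audio_placeholders(
    token_ids: List[int], audio_token_id: int, safe_id: int = 0
) -> List[int]:
    """Hypothetical helper: swap placeholder ids for a valid id before lookup."""
    return [safe_id if tid == audio_token_id else tid for tid in token_ids]


prompt_ids = [1, 42, AUDIO_TOKEN_ID, AUDIO_TOKEN_ID, 7]

# Masking first (the V0-style workaround) keeps every id in range.
safe_ids = mask_audio_placeholders(prompt_ids, AUDIO_TOKEN_ID)
validate_token_ids(safe_ids, VOCAB_SIZE)

# Without the masking step, the strict check rejects the placeholder id.
try:
    validate_token_ids(prompt_ids, VOCAB_SIZE)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(safe_ids)
print(error_message)
```

In this sketch the positions that held the placeholder would later be overwritten with audio embeddings, so the value substituted at those positions does not matter as long as it is a valid id.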

farzadab · 2025-05-08T05:45:33Z

Closing in favour of #17818

ultravox: support gemma and use new audio placeholder token

9d3407d

Signed-off-by: Farzad Abdolhosseini <[email protected]>

farzadab commented on vllm/model_executor/models/ultravox.py Mar 28, 2025 (outdated, resolved)

farzadab added 4 commits March 28, 2025 15:32

revert loader changes as WeightsMapper can handle it already

e17eaf9

Signed-off-by: Farzad Abdolhosseini <[email protected]>

better type handling

c8f37ff

Signed-off-by: Farzad Abdolhosseini <[email protected]>

guess audio_token_id based on model name

cbcc691

Signed-off-by: Farzad Abdolhosseini <[email protected]>

make sure audio_token matches across processor and config

8551cc4

Signed-off-by: Farzad Abdolhosseini <[email protected]>

farzadab changed the title ~~[WIP][Model] Support Gemma 3 backend for Ultravox~~ [Model] Support Gemma 3 backend for Ultravox Apr 2, 2025

farzadab marked this pull request as ready for review April 2, 2025 18:23

DarkLight1337 approved these changes Apr 3, 2025

DarkLight1337 enabled auto-merge (squash) April 3, 2025 09:28

github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Apr 3, 2025

Merge branch 'main-vllm' into farzad-gemma3

2418ff8

farzadab marked this pull request as draft April 18, 2025 22:03

auto-merge was automatically disabled April 18, 2025 22:03
Pull request was converted to draft

farzadab closed this May 8, 2025
