Real Time Demo that allows natural conversations #91

Open
wants to merge 1 commit into
base: main


freddyaboulton

Overview

This PR adds an interactive demo that enables natural, continuous conversations with Qwen2-Audio. Users speak to the model through their microphone, and a response is generated automatically once they finish speaking. This makes the model more accessible and more natural to interact with.

Key Features

  • Real-time audio streaming using WebRTC
  • Automatic speech detection and processing
  • Support for both local and cloud deployment
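To illustrate the "automatic speech detection" feature, here is a minimal sketch of pause-based turn taking: audio frames are buffered while the user is speaking, and the buffered utterance is handed off once a run of quiet frames is seen. This is not the detection logic used by gradio-webrtc or by this PR; the energy threshold, the frame representation, and the names `rms` and `utterances` are all assumptions made for the example.

```python
import math
from typing import Iterable, Iterator, List

# Assumption: a frame is a short chunk of mono samples scaled to [-1.0, 1.0].
Frame = List[float]


def rms(frame: Frame) -> float:
    """Root-mean-square energy of one frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def utterances(
    frames: Iterable[Frame],
    threshold: float = 0.02,      # hypothetical energy level separating speech from silence
    max_silent_frames: int = 3,   # hypothetical pause length that ends a turn
) -> Iterator[List[float]]:
    """Yield one flat list of samples per detected utterance."""
    buffered: List[Frame] = []
    silent_run = 0
    heard_speech = False

    for frame in frames:
        if rms(frame) >= threshold:
            # Speech frame: start or continue the current utterance.
            heard_speech = True
            silent_run = 0
            buffered.append(frame)
        elif heard_speech:
            # Quiet frame after speech: keep it in case the user resumes.
            silent_run += 1
            buffered.append(frame)
            if silent_run >= max_silent_frames:
                # The pause is long enough: emit the utterance without trailing silence.
                kept = buffered[: len(buffered) - silent_run]
                yield [s for f in kept for s in f]
                buffered, silent_run, heard_speech = [], 0, False

    if heard_speech:
        # The stream ended mid-utterance: flush what was captured.
        kept = buffered[: len(buffered) - silent_run]
        yield [s for f in kept for s in f]
```

In a demo like this one, each yielded utterance would be the audio passed to the model for a reply. A production detector would more likely use a trained voice-activity model than a fixed energy threshold.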

Dependencies

Added requirements:

  • gradio-webrtc (a Gradio custom component that enables real-time audio/video streaming). Disclaimer: I am the author of this extension.
  • twilio (optional, for cloud deployment)
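As a rough illustration of why a cloud deployment may need an extra dependency: WebRTC peers behind NAT or a firewall often cannot connect directly and need a TURN relay, which a provider such as Twilio can supply. The sketch below only shows the shape of a standard WebRTC `RTCConfiguration` built from a list of ICE servers. The helper name `build_rtc_configuration` is hypothetical, fetching the servers from Twilio is deliberately left out, and forcing the `relay` policy is an assumption that may not match what this PR does.

```python
from typing import Any, Dict, List, Optional


def build_rtc_configuration(
    ice_servers: Optional[List[Dict[str, Any]]],
) -> Optional[Dict[str, Any]]:
    """Return a WebRTC RTCConfiguration dict, or None when no relay is needed.

    Assumption: `ice_servers` was already obtained from a TURN provider
    when running in the cloud, and is empty or None when running locally.
    """
    if not ice_servers:
        # Local deployment: peers can usually connect without a relay.
        return None
    return {
        "iceServers": list(ice_servers),
        # Assumption: route all media through the TURN relay.
        "iceTransportPolicy": "relay",
    }


# Placeholder values for illustration only, not real credentials.
example_servers = [
    {
        "urls": "turn:turn.example.com:3478",
        "username": "user",
        "credential": "secret",
    }
]
cloud_config = build_rtc_configuration(example_servers)
local_config = build_rtc_configuration(None)
```

Returning `None` for the local case keeps a single code path for both deployments: the caller passes whatever it gets to the streaming component.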

Demo

qwen2-audio.mp4

There is some delay in processing the response because the demo has to acquire the shared GPU on Hugging Face Spaces. On dedicated hardware it should be much faster, but I don't have the GPUs to verify this myself.
