[vad] Previous response is returned by server if no voice activity is detected in the sample #3250

Hi, I'm using whisper.cpp in conjunction with `whisper-api`. I've tried the new VAD feature that @danbev added to the server component of whisper.cpp. It works well with samples that contain voice, but there is an issue with audio that contains no voice.
It seems that if no voice is detected in the sample, the server returns the transcript of the last sample that was successfully transcribed instead of an empty result.
For example:
```
whisper-cpp-1  | whisper_vad_segments_from_probs: detecting speech timestamps using 145 probabilities
whisper-cpp-1  | whisper_vad_segments_from_probs: Final speech segments after filtering: 0
wyoming-api-1  | INFO:httpx:HTTP Request: POST http://whispercpp:8910/inference?temperature=0.0&temperature_inc=0.2&response_format=json "HTTP/1.1 200 OK"
wyoming-api-1  | INFO:wyoming_whisper_api_client.handler: set a timer for 30 minutes
wyoming-api-1  | 
```
The behavior I would expect instead is for the transcript to be empty, since no voice was detected.
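
For anyone who wants to reproduce this without a microphone, here is a minimal client-side sketch. It builds a WAV file that contains only silence and posts it to the `/inference` endpoint with the same parameters that appear in the log above. The helper names (`make_silent_wav`, `build_multipart`, `transcribe`) are made up for illustration. The multipart field name `file` and the `text` key in the JSON response are assumptions based on how the server is usually called, so adjust them if your build differs.

```python
import io
import json
import urllib.request
import uuid
import wave


def make_silent_wav(seconds: float = 3.0, rate: int = 16000) -> bytes:
    """Return a mono 16-bit PCM WAV that contains only zero samples."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


def build_multipart(wav: bytes, fields: dict) -> tuple[bytes, str]:
    """Encode the WAV and plain form fields as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    The upload field is named "file" (assumption, see lead-in).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    parts.append(
        (f"--{boundary}\r\n"
         'Content-Disposition: form-data; name="file"; filename="silence.wav"\r\n'
         "Content-Type: audio/wav\r\n\r\n").encode() + wav + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(url: str, wav: bytes) -> str:
    """POST the audio to the server and return the transcript text."""
    body, content_type = build_multipart(
        wav,
        {"temperature": "0.0", "temperature_inc": "0.2", "response_format": "json"},
    )
    req = urllib.request.Request(url, data=body, headers={"Content-Type": content_type})
    with urllib.request.urlopen(req) as resp:
        # Assumption: the JSON response carries the transcript under "text".
        return json.load(resp).get("text", "")
```

To trigger the issue, first send a sample that contains speech, then call `transcribe("http://whispercpp:8910/inference", make_silent_wav())` (the host and port are taken from the log above, so change them to match your setup). With the behavior described here, the second call returns the transcript of the first sample, whereas I would expect an empty string.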
