Hi @juanmc2005!
I want to save the segments that the model predicts as containing speech. The model detects in real time when someone is talking, and I specifically want to save the audio of those segments the model labels as speech. Please save these detected sounds in WAV format.
The SpeakerDiarization pipeline already provides the waveform aligned with the current diarization output.
The StreamingInference class lets you execute some code whenever a new "output-audio" pair is available.
You can achieve this with the attach_hooks() method by passing a function to execute whenever a new tuple[Annotation, SlidingWindowFeature] is available. Then it's a matter of cropping the audio according to the speech regions in the annotation.
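To make this concrete, here is a minimal sketch of such a hook. It assumes diart's default 16 kHz sample rate, and the output directory name `speech_segments/` plus the helper names (`write_wav`, `save_speech`) are purely illustrative, not part of diart's API. The constructor arguments of `MicrophoneAudioSource` have changed between diart versions, so check them against the version you have installed:

```python
"""Sketch: save speech segments detected by diart as WAV files.

Assumes diart's default 16 kHz sample rate; directory and helper
names are illustrative, not part of the diart API.
"""
import wave
from pathlib import Path

import numpy as np

SAMPLE_RATE = 16000  # diart's default; adjust if you configured another rate
OUT_DIR = Path("speech_segments")  # illustrative output location
_counter = 0


def write_wav(path: Path, samples: np.ndarray, sample_rate: int = SAMPLE_RATE) -> None:
    """Write a mono float waveform in [-1, 1] as a 16-bit PCM WAV file."""
    pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm.tobytes())


def save_speech(prediction) -> None:
    """Hook for attach_hooks(): crop and save each speech segment of a chunk.

    `prediction` is a tuple[Annotation, SlidingWindowFeature], i.e. the
    diarization of the current chunk plus the aligned waveform.
    """
    global _counter
    annotation, waveform = prediction
    OUT_DIR.mkdir(exist_ok=True)
    chunk_start = waveform.extent.start  # chunk start time in seconds
    audio = waveform.data[:, 0]  # mono samples of this chunk
    # Merge all speakers into plain "speech" regions, then crop each one
    for segment in annotation.get_timeline().support():
        lo = int(round((segment.start - chunk_start) * SAMPLE_RATE))
        hi = int(round((segment.end - chunk_start) * SAMPLE_RATE))
        lo, hi = max(lo, 0), min(hi, len(audio))
        if hi > lo:
            write_wav(OUT_DIR / f"speech_{_counter:05d}.wav", audio[lo:hi])
            _counter += 1


if __name__ == "__main__":
    from diart import SpeakerDiarization
    from diart.inference import StreamingInference
    from diart.sources import MicrophoneAudioSource

    pipeline = SpeakerDiarization()
    source = MicrophoneAudioSource()  # arguments depend on your diart version
    inference = StreamingInference(pipeline, source)
    inference.attach_hooks(save_speech)
    inference()  # runs until the source is exhausted or interrupted
```

One caveat with this approach: consecutive streaming chunks overlap, so the same speech region can appear in several predictions and be written more than once. Deduplicating (e.g. by only keeping the non-overlapping tail of each chunk, or by buffering and merging segments before writing) is left to the reader.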