save a segment of that model is predicted #246

Open
m15kh opened this issue Oct 24, 2024 · 2 comments
Labels
question Further information is requested

Comments

m15kh commented Oct 24, 2024

Hi @juanmc2005,
I want to save the segments that the model predicts as containing speech. The model detects in real time the segments where someone is talking, and I want to save only the audio for which the model outputs a speech label. How can I save these detected segments as WAV files?

m15kh commented Oct 26, 2024

@juanmc2005
Can you help me with this issue?

@juanmc2005 juanmc2005 added the question Further information is requested label Nov 6, 2024
juanmc2005 (Owner) commented

Hi @m15kh,

The SpeakerDiarization pipeline does provide the waveform aligned to the current diarization output (see here).

The StreamingInference class lets you run your own code whenever a new pair of diarization output and audio is available.
You can do this with the attach_hooks() method (see here) by passing a function that is called on every new tuple[Annotation, SlidingWindowFeature]. From there, it is a matter of cropping the audio according to the speech in the annotation.
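
As a rough illustration of the cropping step, here is a minimal sketch of such a hook. The names `save_speech_segments` and `save_speech_hook` are hypothetical and not part of diart. It assumes the waveform's `data` has shape (samples, channels), that its `sliding_window.step` equals one sample period (so the sample rate is `1 / step`), and that `sliding_window.start` is the chunk's start time in seconds. It uses only the standard library to write 16-bit mono WAV files.

```python
import wave
from array import array
from pathlib import Path


def save_speech_segments(samples, sample_rate, chunk_start, speech, out_dir):
    """Write each (start, end) speech interval, in seconds, to its own WAV file.

    `samples` is a 1-D sequence of floats in [-1, 1] whose first sample sits at
    time `chunk_start`. Returns the list of files written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for start, end in speech:
        # Map absolute times to sample indices inside this chunk, clamped to its bounds.
        first = max(0, round((start - chunk_start) * sample_rate))
        last = min(len(samples), round((end - chunk_start) * sample_rate))
        if last <= first:
            continue  # interval lies outside the chunk
        # Clip to [-1, 1] and convert to 16-bit PCM.
        pcm = array(
            "h",
            (int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples[first:last]),
        )
        path = out_dir / f"speech_{start:.3f}_{end:.3f}.wav"
        with wave.open(str(path), "wb") as f:
            f.setnchannels(1)
            f.setsampwidth(2)
            f.setframerate(sample_rate)
            f.writeframes(pcm.tobytes())
        written.append(path)
    return written


def save_speech_hook(out_dir="speech_segments"):
    """Build a hook that receives an (annotation, waveform) tuple and saves the speech."""

    def hook(prediction):
        annotation, waveform = prediction
        window = waveform.sliding_window
        sample_rate = round(1.0 / window.step)  # assumption: step is one sample period
        samples = [float(row[0]) for row in waveform.data]  # first channel only
        # Merge overlapping speaker turns so each file covers one contiguous speech region.
        speech = [(seg.start, seg.end) for seg in annotation.get_timeline().support()]
        return save_speech_segments(samples, sample_rate, window.start, speech, out_dir)

    return hook
```

With a StreamingInference instance named `inference`, this would be registered as `inference.attach_hooks(save_speech_hook("speech_segments"))` before starting the stream. Because the hook runs once per emitted chunk, a speech region that spans several chunks ends up split across several files; concatenating them is left out of this sketch.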
