
Returning real time waveform #250

Open
m15kh opened this issue Nov 3, 2024 · 5 comments
Labels
duplicate This issue or pull request already exists

Comments

@m15kh

m15kh commented Nov 3, 2024

Hi,

Regarding this issue: link. It returns annotations in real time.

How can I achieve the same for sound? I want to save the exact waveform in WAV format in real time.

@yeahphil

yeahphil commented Nov 4, 2024

That audio is what you're feeding into the model -- you'll want to tee it beforehand, not try to get the audio back out alongside the predictions.
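
For anyone looking for a concrete way to do that: one option is to subscribe a second listener to the microphone source before starting inference, so every raw chunk is copied off to the side. This is only a sketch, assuming a diart version where MicrophoneAudioSource exposes its chunks through an rx-style stream attribute and emits NumPy arrays; the chunk shape, output path, and 16 kHz sample rate are assumptions, not details from this thread.

import numpy as np
import soundfile as sf
from diart.sources import MicrophoneAudioSource

mic = MicrophoneAudioSource()
raw_chunks = []

# Side listener: keep a copy of every incoming chunk before the pipeline consumes it
mic.stream.subscribe(on_next=lambda chunk: raw_chunks.append(np.asarray(chunk)))

# ... create the pipeline and run StreamingInference(pipeline, mic) as usual ...

# Afterwards, concatenate the chunks and write the full input recording
audio = np.concatenate(raw_chunks, axis=-1)  # assumed chunk shape: (channels, samples)
sf.write("output/full_input.wav", audio.T, samplerate=16000)  # assumed 16 kHz input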

@m15kh
Author

m15kh commented Nov 4, 2024

@yeahphil
Please see the image below:
[image: screenshot of real-time predictions from my custom model]
In the image above, my custom model detects what I intended. However, I would like to save the exact audio corresponding to the predicted label (highlighted in blue) as a WAV or MP3 file whenever the model makes a prediction.

How can I achieve this?

@juanmc2005
Owner

Is this a duplicate of #246 ?
I just posted an answer there that should help you get the bits of audio alongside the predictions.
Let me know how it works.

@juanmc2005 juanmc2005 added the duplicate This issue or pull request already exists label Nov 6, 2024
@m15kh
Author

m15kh commented Nov 8, 2024

@juanmc2005
Yes, it's a duplicate question, because I thought I hadn't explained it well in issue #246.

First of all, thanks for your answer.

I did this, and now I can save the audio segment that the model predicted:

import soundfile as sf

import diart.models as m
from diart import VoiceActivityDetection, VoiceActivityDetectionConfig
from diart.inference import StreamingInference
from diart.sources import MicrophoneAudioSource

count = 0

def use_real_time_prediction(results):
    # Hook receives a (prediction, waveform) pair for every processed chunk
    global count
    prediction, waveform = results

    if prediction:
        # Save the audio chunk that triggered this prediction
        # (the output/ directory must already exist)
        filename = f'output/waveform{count}.wav'
        sf.write(filename, waveform.data, samplerate=16000)
        print(f"Waveform saved to {filename}")
        count += 1


path_model = 'checkpoint.ckpt'
config = VoiceActivityDetectionConfig(segmentation=m.SegmentationModel.from_pyannote(path_model))
pipeline = VoiceActivityDetection(config=config)
mic = MicrophoneAudioSource()
inference = StreamingInference(pipeline, mic, do_plot=False, do_profile=True)

inference.attach_hooks(use_real_time_prediction)

total_prediction = inference()
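
If the goal is to keep only the audio inside the predicted (blue) regions rather than the whole chunk, the waveform can be cropped with the prediction's timeline before writing. This is a minimal sketch under the same assumptions as the code above (the hook receives (prediction, waveform) pairs, waveform is a pyannote SlidingWindowFeature, the output/ directory exists, and audio is 16 kHz); save_predicted_segments is a hypothetical name, not part of the library.

import soundfile as sf

def save_predicted_segments(results, samplerate=16000):
    # Write one WAV file per predicted segment instead of one per chunk (sketch)
    prediction, waveform = results
    for segment in prediction.get_timeline():
        # Keep only the samples that fall inside this predicted segment
        cropped = waveform.crop(segment, mode="loose")
        if cropped.size == 0:
            continue
        filename = f"output/segment_{segment.start:.2f}_{segment.end:.2f}.wav"
        sf.write(filename, cropped, samplerate)
        print(f"Saved {filename}")

# Attach this hook instead of (or in addition to) use_real_time_prediction
inference.attach_hooks(save_predicted_segments)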

@m15kh
Author

m15kh commented Nov 8, 2024

@juanmc2005
I also added this feature and opened a pull request: link
