Diarization parameter updates (prefer current speaker) #120

Open
wants to merge 1 commit into
base: master
from

Conversation

chrisw77

Added a new boolean parameter, prefer_current_speaker, for diarization. It defaults to false. When enabled, it instructs the clustering algorithm to ask an extra question, "Am I close enough to the current speaker?", before going on to the usual "Which speaker is closest?". If the current word-level speaker embedding is similar enough to the currently active speaker (i.e. the one output for the previous word), the word stays with that speaker, even if another speaker is now closer. This reduces the occasional flip-flopping between similar speakers when only one person is speaking, which typically happens on long, noisy audio.
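
For illustration only, the decision described above could be sketched roughly as follows. This is not the code from this PR: the function names, the use of cosine distance, the per-speaker centroid dictionary, and the `stay_threshold` value are all assumptions made for the sketch; only `prefer_current_speaker` is a name taken from the description.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def assign_speaker(
    embedding: Sequence[float],
    centroids: Dict[str, Sequence[float]],  # hypothetical: one centroid per known speaker
    current_speaker: Optional[str],
    prefer_current_speaker: bool = False,
    stay_threshold: float = 0.1,  # hypothetical value, not taken from the PR
) -> str:
    """Pick a speaker label for one word-level embedding."""
    # Extra question: "Am I close enough to the current speaker?"
    if prefer_current_speaker and current_speaker in centroids:
        if cosine_distance(embedding, centroids[current_speaker]) <= stay_threshold:
            return current_speaker
    # Usual question: "Which speaker is closest?"
    return min(centroids, key=lambda s: cosine_distance(embedding, centroids[s]))


# Two similar speakers; the new word is slightly closer to S2 than to S1.
centroids = {"S1": [1.0, 0.0], "S2": [0.9, 0.45]}
word = [0.95, 0.3]

without_flag = assign_speaker(word, centroids, current_speaker="S1")
with_flag = assign_speaker(word, centroids, current_speaker="S1",
                           prefer_current_speaker=True)
```

With the flag off, the word goes to whichever centroid is nearest. With the flag on, it stays with the previous word's speaker as long as it falls within the threshold, which is the behaviour intended to suppress flip-flopping between similar speakers.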

Also exposed speaker_sensitivity for real-time (RT) use; it was previously batch only.

chrisw77 force-pushed the Feature/speaker_diarization_prefer_current_speaker branch 2 times, most recently from 38ce8c3 to 7f94d0a, April 17, 2025 12:05
…posed [speaker_sensitivity] for RT (was batch only).
chrisw77 force-pushed the Feature/speaker_diarization_prefer_current_speaker branch from 7f94d0a to e62b963, April 17, 2025 13:53
chrisw77 requested a review from dumitrugutu, April 17, 2025 14:03
@dumitrugutu
Contributor

LGTM

2 participants