GitHub - GnafiY/TPDT-SS-KWS: This speech separation based framework is for multi-talker keyword spotting tasks and is implemented in the ESPnet2 toolkit.

Overview

This is the official repository for the Interspeech 2024 paper "Text-aware Speech Separation for Multi-talker Keyword Spotting". The implementation of the front-end model is based on ESPnet. All unused examples in egs and egs2 have been removed. For the KWS backend, we directly apply the default setup of MDTC from the WeKws recipe examples/hey_snips/s0.

I apologize that the email address of the primary author is wrong; it should be [email protected] instead of [email protected]. Feel free to email me if you have any questions!

Setup

1. Clone this repository.
2. Install the ESPnet dependencies; refer to the official ESPnet repository for instructions.
3. Change directory to espnet/egs2/librimix/enh1.
4. Generate the Libri2Mix scp data by running bash run.sh --stage 1 --stop_stage 4.
5. Generate the Snips2Mix data by following the instructions in local/Snips2Mix.
6. Train with bash run.sh --stage 5 --stop_stage 6, then run inference with bash run.sh --stage 7 --stop_stage 8.
7. To run KWS inference, refer to the snips recipe in WeKws.
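
The stage ranges above can be collected into one small driver script. This is a minimal sketch, not part of the repository: the helper names stage_cmd and run_stages and the DRY_RUN switch are hypothetical, and it assumes you run it from the root of the cloned repository with the ESPnet dependencies already installed. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the Setup steps. Prints the commands by default (DRY_RUN=1).
# stage_cmd, run_stages and DRY_RUN are illustrative names, not repository code.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"                  # set DRY_RUN=0 to actually execute
RECIPE_DIR="espnet/egs2/librimix/enh1"   # recipe path from the Setup steps

# Build the run.sh invocation for an inclusive stage range.
stage_cmd() {
    echo "bash run.sh --stage $1 --stop_stage $2"
}

# Print the command, and execute it inside the recipe directory
# only when DRY_RUN=0.
run_stages() {
    local cmd
    cmd="$(stage_cmd "$1" "$2")"
    echo "+ ${cmd}"
    if [ "${DRY_RUN}" = "0" ]; then
        (cd "${RECIPE_DIR}" && ${cmd})
    fi
}

run_stages 1 4   # generate Libri2Mix scp data
# Snips2Mix generation is a separate manual step: see local/Snips2Mix.
run_stages 5 6   # training
run_stages 7 8   # inference
```

Keeping the dry run as the default makes it easy to check the stage ranges before starting a long job. Once the Snips2Mix data has been generated, setting DRY_RUN=0 runs the same commands for real.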
