
This repository provides Chinese/Mandarin TTS (text-to-speech).


joan126/tacotron2-mandarin-griffin-lim

 
 


tacotron-2-mandarin-griffin-lim

TensorFlow implementation of Google's Tacotron 2, a deep neural network architecture described in the paper "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions".

Repo Structure

tacotron-2-mandarin-griffin-lim
|--- datasets
|--- logs-Tacotron
     |--- eval-dir
     |--- plots
     |--- taco_pretrained
     |--- wavs
|--- papers
|--- prepare
|--- tacotron
     |--- models
     |--- utils
|--- tacotron_output
     |--- eval
     |--- logs-eval
          |--- plots
          |--- wavs
|--- training_data
     |--- audio
     |--- linear
     |--- mels

Samples

There are some synthesis samples here.

Pretrained

You can get the pretrained model here.

Quick Start

OS: Ubuntu 16.04

Step (0) - Clone the repository

git clone https://github.com/Joee1995/tacotron-2-mandarin-griffin-lim.git
cd tacotron-2-mandarin-griffin-lim/

Step (1) - Install dependencies

  1. Install Python 3 (the author used Python 3.5.5)

  2. Install TensorFlow (the author used TensorFlow 1.10.0)

  3. Install other dependencies

    pip install -r requirements.txt
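
Before moving on, it can help to confirm that the interpreter and TensorFlow match what the steps above describe. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `check_environment` are hypothetical, and the rule that any TensorFlow 1.x release is acceptable is an assumption based only on the 1.10.0 version mentioned above.

```python
import importlib
import itertools
import sys


def parse_version(text):
    """Turn a version string such as '1.10.0rc1' into a tuple like (1, 10, 0)."""
    parts = []
    for piece in text.split(".")[:3]:
        # Keep only the leading digits of each component.
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def check_environment():
    """Report the Python and TensorFlow versions found in this environment."""
    report = {
        "python": tuple(sys.version_info[:3]),
        "python_ok": sys.version_info[0] == 3,
    }
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        report["tensorflow"] = None
        report["tensorflow_ok"] = False
    else:
        report["tensorflow"] = parse_version(tf.__version__)
        # Assumption: the code targets the TensorFlow 1.x API.
        report["tensorflow_ok"] = report["tensorflow"][0] == 1
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(key, value)
```

If `tensorflow_ok` is reported as false, installing a 1.x release in a fresh virtual environment is probably the simplest fix.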
    

Step (2) - Prepare dataset

  1. Download the BIAOBEI or THCHS-30 dataset

    After that, your directory tree should be:

    tacotron-2-mandarin-griffin-lim
    |--- ...
    |--- BZNSYP
         |--- ProsodyLabeling
              |--- 000001-010000.txt
         |--- Wave
    |--- ...
    
  2. Prepare dataset (default is BIAOBEI)

    python prepare_dataset.py
    

    To prepare THCHS-30 instead, pass the parameter --dataset=THCHS-30.

    After that, you will get a BIAOBEI folder as follows:

    tacotron-2-mandarin-griffin-lim
    |--- ...
    |--- BIAOBEI
         |--- biaobei_48000
    |--- ...
    
  3. Preprocess dataset (default is BIAOBEI)

    python preprocess.py
    

    To preprocess THCHS-30 instead, pass the parameter --dataset=THCHS-30.

    After that, you will get a training_data folder as follows:

    tacotron-2-mandarin-griffin-lim
    |--- ...
    |--- training_data
         |--- audio
         |--- linear
         |--- mels
         |--- train.txt
    |--- ...
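
The directory trees shown in this step can be checked mechanically before training. The sketch below is an illustration rather than repository code: `EXPECTED` and `missing_paths` are hypothetical names, the stage labels are made up for this example, and the path lists cover only the default BIAOBEI layout described above.

```python
from pathlib import Path

# Paths taken from the directory trees in this README (BIAOBEI defaults only).
EXPECTED = {
    "download": [
        "BZNSYP/ProsodyLabeling/000001-010000.txt",
        "BZNSYP/Wave",
    ],
    "prepare": [
        "BIAOBEI/biaobei_48000",
    ],
    "preprocess": [
        "training_data/audio",
        "training_data/linear",
        "training_data/mels",
        "training_data/train.txt",
    ],
}


def missing_paths(repo_root, stage):
    """Return the expected paths for `stage` that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED[stage] if not (root / rel).exists()]


if __name__ == "__main__":
    for stage in EXPECTED:
        missing = missing_paths(".", stage)
        print(stage, "ok" if not missing else "missing: " + ", ".join(missing))
```

Run it from the repository root after each sub-step; an empty list for a stage means the layout matches the tree shown for that stage.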
    

Step (3) - Train the Tacotron model

python train.py

For more parameters, see train.py.

After that, you will get a logs-Tacotron folder as follows:

tacotron-2-mandarin-griffin-lim
|--- ...
|--- logs-Tacotron
     |--- eval-dir
     |--- plots
     |--- taco_pretrained
     |--- wavs
|--- ...

Step (4) - Synthesize audio

python synthesize.py

For more parameters, see synthesize.py.

After that, you will get a tacotron_output folder as follows:

tacotron-2-mandarin-griffin-lim
|--- ...
|--- tacotron_output
     |--- eval
     |--- logs-eval
          |--- plots
          |--- wavs
|--- ...
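
A quick way to sanity-check the synthesized audio is to list the generated files with their durations. This is a minimal sketch under stated assumptions: the function name `wav_durations` is hypothetical, and it assumes the outputs are PCM files with a `.wav` extension that Python's standard `wave` module can read. Files that cannot be parsed are reported with a duration of `None` rather than raising.

```python
import wave
from pathlib import Path


def wav_durations(wav_dir):
    """Map each *.wav file name in `wav_dir` to its duration in seconds."""
    durations = {}
    for path in sorted(Path(wav_dir).glob("*.wav")):
        try:
            with wave.open(str(path), "rb") as wav_file:
                frames = wav_file.getnframes()
                rate = wav_file.getframerate()
                durations[path.name] = frames / rate
        except (wave.Error, EOFError):
            # Not a PCM WAV file the standard library can parse, or empty.
            durations[path.name] = None
    return durations


if __name__ == "__main__":
    # Directory name taken from the tree above.
    for name, seconds in wav_durations("tacotron_output/logs-eval/wavs").items():
        print(name, seconds)
```

Very short or `None` durations usually point to a failed synthesis run and are worth inspecting by ear.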

References & Resources

Rayhane-mamah/Tacotron-2
