A single-speaker, multi-style corpus of German speech, with a large neutral subset and subsets acting out four different expressive speaking styles, named for virtual characters in the SEMAINE and IDEAS4GAMES projects (quoting the original director's instructions):
- Poppy ist fröhlich, optimistisch und sieht das Gute an allen Dingen! (Poppy is cheerful and optimistic.)
- Obadiah ist von Natur aus niedergeschlagen und blickt pessimistisch in die Zukunft... (Obadiah is gloomy and pessimistic.)
- Spike ist aggressiv und geht keinem Streit aus dem Weg! (Spike is aggressive and confrontational.)
- Max ist ein ausgekochter Pokerspieler. Er ist cool, ihn bringt nichts aus der Ruhe. (Max is a hard-boiled poker player. He is cool and laid-back.)
The speaker is Stefan Röttig, a male native speaker of German trained as a professional actor and baritone opera singer.
The audio data is provided in the losslessly compressed FLAC format, which can be played by a wide range of software, including Praat. The speaker was recorded at a 44.1 kHz sampling rate, 24 bits per sample, in mono. No filters of any sort have been applied to this raw data, but high-pass filtering at 50 Hz is recommended to remove low-frequency noise.
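Removing content below 50 Hz (rumble and similar low-frequency noise) while leaving speech intact calls for a high-pass filter. The following is a minimal sketch of such a filter, assuming the samples have already been decoded from FLAC into a sequence of floats. It uses a simple first-order design in pure Python for illustration only; the function name `high_pass` is hypothetical, and a steeper filter from a DSP library would usually be preferable for real processing.

```python
import math
from typing import List, Sequence


def high_pass(samples: Sequence[float],
              sample_rate: float = 44100.0,
              cutoff_hz: float = 50.0) -> List[float]:
    """Apply a first-order (RC-style) high-pass filter to decoded samples."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # time constant for the cutoff
    dt = 1.0 / sample_rate                  # sampling interval in seconds
    alpha = rc / (rc + dt)
    filtered = [float(samples[0])]
    for i in range(1, len(samples)):
        # Each output tracks changes in the input and lets slow drift decay.
        filtered.append(alpha * (filtered[-1] + samples[i] - samples[i - 1]))
    return filtered
```

With the defaults, a constant (DC) offset decays towards zero, while a 1 kHz tone passes through almost unchanged.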
The manually corrected phonetic labels are provided as Xwaves .lab files; each line (after the header) has the fields
[ENDTIME] [NUMBER] [LABEL]
where ENDTIME is in seconds, NUMBER has no significance, and LABEL uses a variant of the SAMPA phonetic alphabet.
(These files can also be opened in Praat using the command Read IntervalTier from Xwaves...)
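The label format described above can be read with a few lines of Python. This is a minimal sketch, not part of the corpus tooling: it assumes the header ends at the first line consisting solely of `#` (the usual Xwaves convention, not stated in this README), and it derives each segment's start time from the previous segment's end time. The names `Segment` and `parse_xwaves_lab`, and the sample content, are hypothetical.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    start: float  # seconds; taken from the previous segment's end time
    end: float    # seconds; the ENDTIME field
    label: str    # the LABEL field (SAMPA variant)


def parse_xwaves_lab(text: str) -> List[Segment]:
    """Parse the contents of an Xwaves .lab file into a list of segments."""
    lines = text.splitlines()
    # Assumption: the header is terminated by a line containing only "#".
    body_start = 0
    for index, line in enumerate(lines):
        if line.strip() == "#":
            body_start = index + 1
            break
    segments: List[Segment] = []
    previous_end = 0.0
    for line in lines[body_start:]:
        fields = line.split(None, 2)  # ENDTIME, NUMBER, LABEL
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        end = float(fields[0])
        label = fields[2].strip() if len(fields) == 3 else ""
        segments.append(Segment(previous_end, end, label))
        previous_end = end
    return segments


# Made-up sample content, for illustration only.
sample = "#\n0.250000 125 _\n0.310000 125 h\n0.420000 125 a\n"
for segment in parse_xwaves_lab(sample):
    print(segment)
```

The NUMBER field is read but discarded, since it carries no significance.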
The best way of obtaining the PAVOQUE data is to use Git. To make the most of the features offered by GitHub, you should fork this repository to your own GitHub account, then use your favorite Git client to clone that repo to your local machine. Please refer to GitHub Help and the Web as required.
If all of that stuff about Git is just too technical, you can still download the PAVOQUE corpus in a few simple steps.
- Download the most recent version (the master branch) of this repository as a .zip file by clicking on the "Download ZIP" button, or from the following URL:
- Unpack the .zip file.
Because Git is not designed to efficiently handle large amounts of binary data, specifically the audio files in the PAVOQUE corpus, those files are retrieved from the internet using Gradle.
For performance reasons, the .flac files have been packaged into tarballs, one per speaking style.
(Use tar or your favorite file archive utility to unpack.)
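As an alternative to the tar command, a tarball can also be unpacked from Python's standard library. This is a minimal sketch; the function name `unpack_style_tarball` is hypothetical, and the tarball file names are not specified here, so the path you pass in is whatever name the downloaded file actually has.

```python
import tarfile
from pathlib import Path
from typing import List


def unpack_style_tarball(tarball: Path, destination: Path) -> List[Path]:
    """Extract one speaking-style tarball and list the .flac files found."""
    destination.mkdir(parents=True, exist_ok=True)
    # tarfile.open detects compression (if any) automatically in read mode.
    with tarfile.open(tarball) as archive:
        archive.extractall(destination)
    return sorted(destination.rglob("*.flac"))
```

The returned list makes it easy to confirm that the expected audio files arrived before further processing.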
You will need Java to use Gradle.
Once you've cloned (or manually downloaded and unpacked) the repository, you can use the command ./gradlew download to download the audio data into the Recordings subdirectory.
To get the data for only one of the speaking styles, add its name as a prefix, i.e., use ./gradlew Neutral:download or ./gradlew Spike:download to download only the neutral or Spike (aggressive) subsets, respectively.
On Windows, type gradlew.bat instead of ./gradlew.
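If you want to drive the download from a script, the wrapper invocations described above can be assembled programmatically. This is a minimal sketch; the helper name `gradle_download_command` is hypothetical, and it only builds the argument list without running anything.

```python
import platform
from typing import List, Optional


def gradle_download_command(style: Optional[str] = None,
                            windows: Optional[bool] = None) -> List[str]:
    """Build the Gradle wrapper command for downloading audio data."""
    if windows is None:
        windows = platform.system() == "Windows"
    # gradlew.bat on Windows, ./gradlew elsewhere.
    wrapper = "gradlew.bat" if windows else "./gradlew"
    # Prefix the task with the style name to restrict the download.
    task = f"{style}:download" if style else "download"
    return [wrapper, task]
```

The result can be passed to `subprocess.run(..., check=True)` with the repository root as the working directory; for example, `gradle_download_command("Spike")` targets only the Spike subset.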
Note that this will cache the downloaded audio files in your home directory. (You can customize this using the -g command-line option; see the Gradle User Guide for details.)
For Linux users, simply run the download-all.sh script.
You can also download the audio data directly from one of the web remotes, such as http://mary.dfki.de/download/pavoque-data/Recordings/
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
In case of issues, please open a new issue right here in this repository on GitHub.
If this does not resolve your issue, you may send an email to ingmar.steiner(at)dfki.de.