14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

TUNDRA: A Multilingual Corpus of Found Data for TTS Research Created with Light Supervision

Adriana Stan (1), O. Watts (2), Y. Mamiya (2), M. Giurgiu (1), Robert A. J. Clark (2), Junichi Yamagishi (2), Simon King (2)

(1) Universitatea Tehnică din Cluj-Napoca, Romania
(2) University of Edinburgh, UK

Simple4All Tundra (version 1.0) is the first release of a standardised multilingual corpus designed for text-to-speech research with imperfect or found data. The corpus consists of approximately 60 hours of speech data from audiobooks in 14 languages, as well as utterance-level alignments obtained with a lightly supervised process. Future versions of the corpus will include finer-grained alignment and prosodic annotation, all of which will be made freely available. This paper gives a general outline of the data collected so far and a detailed description of how it was collected, emphasizing the minimal language-specific knowledge and manual intervention used to compile the corpus. To demonstrate the corpus's potential use, text-to-speech systems have been built for all 14 languages using unsupervised or lightly supervised methods; these systems are also briefly presented in the paper.

Bibliographic reference. Stan, Adriana / Watts, O. / Mamiya, Y. / Giurgiu, M. / Clark, Robert A. J. / Yamagishi, Junichi / King, Simon (2013): "TUNDRA: a multilingual corpus of found data for TTS research created with light supervision", In INTERSPEECH-2013, 2331-2335.