8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Automatic Building of Synthetic Voices from Large Multi-Paragraph Speech Databases

Kishore Prahallad, Arthur R. Toth, Alan W. Black

Carnegie Mellon University, USA

Large multi-paragraph speech databases encapsulate prosodic and contextual information beyond the sentence level, which could be exploited to build natural-sounding voices. This paper discusses our efforts to automatically build synthetic voices from such databases. We show that the primary issue, the segmentation of a large speech file, can be addressed with modifications to the forced-alignment technique, and that the proposed technique is independent of the duration of the audio file. We also discuss how this framework could be extended to build a large number of voices from public-domain large multi-paragraph recordings.

Bibliographic reference. Prahallad, Kishore / Toth, Arthur R. / Black, Alan W. (2007): "Automatic building of synthetic voices from large multi-paragraph speech databases", in INTERSPEECH-2007, 2901-2904.