Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Diphone Collection and Synthesis

Kevin A. Lenzo (1), Alan W. Black (2)

(1) International Software Research Institute; (2) Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA

In this paper, we describe the design and collection of corpora for diphone synthesis, the voice building process, and our experience in creating a new, publicly available database of ten diphone sets from one American English speaker for the Festival Speech Synthesis System, using the FestVox document and tools. In support of our goal of making the tools and techniques available so that anyone can build their own synthetic voices, we have generalized and streamlined the tasks involved, turning what were once arcane anecdotes, half-written one-off scripts, and partial descriptions into detailed, complete instructions that others have followed with good results.


Bibliographic reference. Lenzo, Kevin A. / Black, Alan W. (2000): "Diphone collection and synthesis", in ICSLP-2000, vol. 3, 306-309.