Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Speech Synthesis Using HMM-Based Acoustic Unit Inventory

Jindrich Matousek

University of West Bohemia, Department of Cybernetics, Plzen, Czech Republic

This paper presents the use of multiple hidden Markov models (HMMs) to build a Czech acoustic unit inventory, together with speech synthesis based on that inventory. Triphone HMMs are trained on a speech corpus recorded by a single speaker. The states of the triphone HMMs are clustered automatically using binary decision trees. The clustered states are then used to segment the speech corpus automatically and to create a speech segment database. The acoustic unit inventory constructed in this way is assumed to enable more precise context modeling than was previously possible. A concatenation-based speech synthesizer can be built on the speech segment database, and several speech synthesis techniques are discussed for this purpose. Finally, a Czech text-to-speech (TTS) system is presented.
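
The triphone units mentioned in the abstract model each phone together with its left and right neighbours. The following minimal sketch shows how a phone sequence could be expanded into such context-dependent labels. The function name `to_triphones`, the `left-centre+right` label notation, and the `sil` boundary symbol are illustrative assumptions and are not taken from the paper.

```python
from typing import List


def to_triphones(phones: List[str], boundary: str = "sil") -> List[str]:
    """Expand a phone sequence into context-dependent triphone labels.

    Each phone is labelled as "left-centre+right". The utterance is padded
    with a boundary symbol (hypothetical choice: "sil") so that the first
    and last phones also receive a full context.
    """
    padded = [boundary] + list(phones) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


if __name__ == "__main__":
    # Phones of the Czech greeting "ahoj", written in a simplified notation.
    print(to_triphones(["a", "h", "o", "j"]))
```

Because the number of distinct triphones grows quickly with the phone set, many of them occur rarely in a single-speaker corpus, which is one reason states of similar triphones are commonly clustered and shared.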

Bibliographic reference.  Matousek, Jindrich (1999): "Speech synthesis using HMM-based acoustic unit inventory", In EUROSPEECH'99, 2323-2326.