European Conference on Speech Technology
Edinburgh, Scotland, UK
We are developing a text-to-speech synthesis system for German that uses the same type of speech element, the 'cluster', at both the linguistic-phonetic transcription level and the phonemization level. Clusters are defined as sequences of graphemes or phonemes of the same type (vowel clusters or consonant clusters). Compared to other speech elements (e.g. diphones or demisyllables), the number of clusters is considerably lower. Another advantage is that cluster boundaries usually correlate with stress boundaries, which reduces concatenation problems and improves the naturalness of the synthesized speech. This paper describes our transcription and phonemization techniques and some hardware aspects.
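To illustrate the cluster definition on the grapheme side, the following is a minimal sketch, not the authors' implementation. It assumes a simple vowel set for German (including umlauts and, as a simplification, "y") and treats every other character as a consonant; the names `VOWELS` and `split_into_clusters` are hypothetical and do not come from the paper.

```python
from itertools import groupby

# Assumption: vowel graphemes for German, with umlauts; "y" is counted
# as a vowel here for simplicity. All other characters count as consonants.
VOWELS = set("aeiouäöüy")


def split_into_clusters(word):
    """Split a word into maximal runs of graphemes of the same type.

    Returns a list of (type, cluster) pairs, where type is "V" for a
    vowel cluster and "C" for a consonant cluster.
    """
    clusters = []
    # groupby collects consecutive characters that share the same key,
    # i.e. consecutive vowels or consecutive consonants.
    for is_vowel, run in groupby(word.lower(), key=lambda ch: ch in VOWELS):
        clusters.append(("V" if is_vowel else "C", "".join(run)))
    return clusters


if __name__ == "__main__":
    for word in ["Sprache", "Strumpf", "Bauer"]:
        print(word, split_into_clusters(word))
```

Under these assumptions, "Sprache" splits into the consonant cluster "spr", the vowel cluster "a", the consonant cluster "ch", and the vowel cluster "e". The same run-splitting idea would apply to a phoneme string, given a phoneme inventory marked as vowel or consonant.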
Bibliographic reference. Fellbaum, Klaus / Rook, J. (1987): "Text-to-speech synthesis based on grapheme and phoneme clusters", In ECST-1987, 1067-1070.