The ESCA Workshop on Speech Synthesis

September 25-28, 1990
Autrans, France

Spectral Transitions in Rule-Based and Diphone Synthesis

Douglas O'Shaughnessy

INRS-Telecommunications, Université du Québec, Nuns Island, Quebec, Canada

The problem of adequately modeling the dynamics of the speech spectrum is explored for general text-to-speech applications. Based on an analysis of formant patterns in English speech, natural formant trajectories are compared with those produced by the MITalk system, noting where the system has difficulty modeling spectral transitions. Phonetic contexts where a diphone approach would have the most difficulty, i.e., where the diphone coarticulation assumption is invalid, are also noted. Improving phoneme-concatenation synthesis requires better rules for modeling coarticulation. To improve diphone synthesis, I enumerate contexts where triphones would better model natural speech.

