PHRITTS is a text-to-speech synthesizer consisting of a grapheme-to-phoneme converter, an accentuation module, a duration control, an intonation contour generator, and a diphone-based speech synthesizer. After some text preprocessing, the grapheme-to-phoneme conversion performs a morph decomposition. A special compiler is used for the rule-based conversion of the resulting morphs. Word classes are assigned for the subsequent focus-accent-based accentuation, which provides the locations of pitch accents and phrase boundaries and is followed by a simple context-dependent calculation of the phoneme durations. The intonation contour is calculated from the accent pattern using rules derived from natural German speech. It consists of triangular or trapezium-shaped pitch movements placed on a simple pitch declination line; the movements are calculated by linear interpolation in the logarithmic frequency domain. The diphone synthesis is based on 1573 German diphones extracted from nonsense words spoken with a constant, monotonous pitch. This makes it possible to concatenate the diphones directly by interpolation in the time domain, after calculating the optimal phase at the diphone boundaries. Pitch and duration are manipulated using a TD-PSOLA technique. Compared to an existing LPC-based version of the synthesizer, TD-PSOLA results in much more natural speech. The TTS synthesizer is written in C and runs in real time on a standard 486 PC with a DSP board for AD conversion.
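The intonation step described above (pitch movements placed on a declination line and interpolated linearly in the logarithmic frequency domain) might look roughly like the following Python sketch. It is an illustration only, not the PHRITTS implementation: the function names, the semitone-based excursion parameter, and the example values are all assumptions, and only the triangular movement shape is shown.

```python
import math


def log_interp(f_a, f_b, frac):
    """Interpolate between two frequencies (Hz) linearly in the log domain."""
    return math.exp((1.0 - frac) * math.log(f_a) + frac * math.log(f_b))


def pitch_contour(times, duration, f_start, f_end, accents):
    """Return F0 values (Hz) at the given times (s).

    The baseline declines from f_start to f_end over `duration`.
    `accents` is a list of hypothetical (t_begin, t_peak, t_end, excursion)
    tuples, with the excursion given in semitones above the baseline.
    Each accent is a triangular movement: a linear rise and a linear fall
    in the logarithmic (semitone) domain.
    """
    contour = []
    for t in times:
        baseline = log_interp(f_start, f_end, t / duration)
        shift = 0.0  # semitones above the declination line
        for t_begin, t_peak, t_end, excursion in accents:
            if t_begin < t <= t_peak:
                shift += excursion * (t - t_begin) / (t_peak - t_begin)
            elif t_peak < t < t_end:
                shift += excursion * (t_end - t) / (t_end - t_peak)
        contour.append(baseline * 2.0 ** (shift / 12.0))
    return contour


# Assumed example: a 2 s utterance declining from 120 Hz to 80 Hz,
# with one accent peaking at 1 s, one octave above the baseline.
example = pitch_contour([0.0, 1.0, 2.0], 2.0, 120.0, 80.0,
                        [(0.5, 1.0, 1.5, 12.0)])
```

A trapezium-shaped movement could be obtained in the same way by holding the excursion constant between the end of the rise and the start of the fall.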
Keywords: German text-to-speech synthesis, accentuation, intonation, diphone synthesis
Bibliographic reference. Meyer, P. / Rühl, Hans-Wilhelm / Krüger, R. / Kugler, M. / Vogten, L. L. M. / Dirksen, A. / Belhoula, Karim (1993): "PHRITTS - a text-to-speech synthesizer for the German language", In EUROSPEECH'93, 877-880.