Sixth International Conference on Spoken Language Processing
A high quality TTS system needs speech synthesis algorithms capable of flexible prosody modification, especially of F0. TD-PSOLA is the most popular synthesis algorithm, and it produces very high quality speech when the F0 modification is limited. However, the quality degradation caused by pitch epoch detection errors becomes severe as the F0 modification factor becomes large. The vocoder framework, on the other hand, is very flexible in F0 manipulation, but its synthesized speech is far from natural human speech and suffers from buzziness. To remedy the buzzy quality of the vocoder and produce more natural synthetic speech, we propose a mixed phase vocoder.
Bibliographic reference. Kwon, Chul H. / Lee, Minkyu / Olive, Joseph P. (2000): "A new synthesis algorithm using phase information for TTS systems", in ICSLP-2000, vol. 3, 298-301.