ISCA Archive Eurospeech 1999

A v-CV waveform based speech synthesis using global minimization of pitch conversion and concatenation distortion in v-CV unit sequence

Takao Koyama, Jun-ichi Takahashi

This paper proposes a new waveform-based speech synthesis method for high-quality Japanese TTS (text-to-speech). The method uses V-CV as the basic synthesis unit to preserve the intelligibility of consonants. An efficient unit reconstruction method is newly adopted to minimize both pitch conversion and concatenation distortion when selecting waveforms. This minimization improves the fluency of the synthesized speech. Furthermore, the proposed method makes it possible to build a compact waveform dictionary while maintaining the high quality of the synthesized speech. Using the waveform generation function of the method, the size of the waveform dictionary can be drastically reduced, to 1/40. An experimental evaluation with 32 ordinary listeners showed that the proposed V-CV speech synthesis method attained a high intelligibility of 97%.
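The abstract does not say how the global minimization over the V-CV unit sequence is carried out. As a rough illustration only, the sketch below assumes a dynamic-programming (Viterbi-style) search, a common choice for this kind of unit selection. Everything in it is hypothetical and not taken from the paper: the function name `select_units`, the candidate layout `(pitch_hz, left_edge, right_edge)`, the log-ratio pitch cost, the absolute-difference concatenation cost, and the weight `w_concat`.

```python
import math

def select_units(candidates, target_pitch, w_concat=1.0):
    """Pick one candidate per position so the summed cost is minimal.

    candidates[i] is a list of hypothetical (pitch_hz, left_edge, right_edge)
    tuples for the i-th V-CV unit; target_pitch[i] is the desired pitch in Hz.
    Returns (path, total_cost), where path[i] indexes into candidates[i].
    """
    def pitch_cost(cand, target):
        # Assumed cost: how far the stored pitch must be converted.
        return abs(math.log(cand[0] / target))

    def concat_cost(prev, cur):
        # Assumed cost: mismatch between adjoining unit edges.
        return abs(prev[2] - cur[1])

    # Best accumulated cost for each candidate at the first position.
    best = [pitch_cost(c, target_pitch[0]) for c in candidates[0]]
    back = []  # back[i-1][k] = best predecessor of candidate k at position i

    for i in range(1, len(candidates)):
        new_best, ptr = [], []
        for cur in candidates[i]:
            costs = [best[j] + w_concat * concat_cost(prev, cur)
                     for j, prev in enumerate(candidates[i - 1])]
            j = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[j] + pitch_cost(cur, target_pitch[i]))
            ptr.append(j)
        best = new_best
        back.append(ptr)

    # Trace the cheapest path back from the last position.
    k = min(range(len(best)), key=best.__getitem__)
    total = best[k]
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return path, total

# Toy data: matching pitch alone at each position would pick [0, 1],
# but the edge mismatch between those two units is large.
candidates = [
    [(100.0, 0.0, 1.0), (120.0, 0.0, 5.0)],
    [(100.0, 1.0, 0.0), (120.0, 4.9, 0.0)],
]
path, total = select_units(candidates, target_pitch=[100.0, 120.0])
```

In this toy case the search keeps the first candidate at both positions: accepting a small pitch conversion on the second unit costs less overall than a large discontinuity at the join, which is the trade-off a global minimization is meant to capture.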


doi: 10.21437/Eurospeech.1999-504

Cite as: Koyama, T., Takahashi, J.-i. (1999) A v-CV waveform based speech synthesis using global minimization of pitch conversion and concatenation distortion in v-CV unit sequence. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 2311-2314, doi: 10.21437/Eurospeech.1999-504

@inproceedings{koyama99_eurospeech,
  author={Takao Koyama and Jun-ichi Takahashi},
  title={{A v-CV waveform based speech synthesis using global minimization of pitch conversion and concatenation distortion in v-CV unit sequence}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={2311--2314},
  doi={10.21437/Eurospeech.1999-504}
}