To synthesize natural-sounding Japanese phonetic words, a novel VCV-concatenation synthesis method with an advanced word database is proposed. The word database consists of VCV-balanced phonetic words which are uttered forcibly in type-0 and type-1 pitch accents. The advantage of this database is that a variety of VCV-segments with the same phonetic chains but different pitch patterns can be collected efficiently at the same time. The following pitch modification techniques are used to achieve high sound quality: (1) The optimal VCV-segment set, the one that minimizes the pitch modification rate, is selected. (2) Pitch waveforms are extracted by referring to excitation points. (3) Wavelengths of pitch waveforms are adjusted depending on the pitch modification rates. (4) Natural prosody in the VCV-segments in the database is used effectively. The superiority of the proposed database is confirmed by pitch pattern matching measurements and a subjective quality evaluation.
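As a rough illustration of technique (1), the following Python sketch picks, for each VCV slot, the database candidate whose pitch needs the least modification to reach the target. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not define the pitch modification rate, so it is assumed here to be the ratio of target F0 to the segment's mean F0, with the cost taken as the absolute log of that ratio. The function names, the VCV labels, and all F0 values are hypothetical.

```python
import math


def modification_rate(target_f0, source_f0):
    """Assumed definition: factor by which the segment's pitch must be scaled."""
    return target_f0 / source_f0


def modification_cost(target_f0, source_f0):
    """Absolute log ratio, so raising and lowering by the same factor cost the same."""
    return abs(math.log(modification_rate(target_f0, source_f0)))


def select_segments(targets, candidates):
    """Choose one candidate per VCV slot with the smallest modification cost.

    targets:    list of (vcv_label, target_f0_hz)
    candidates: dict mapping vcv_label -> list of (accent_type, mean_f0_hz)
    Slots are treated as independent, so the per-slot minimum also
    minimizes the summed cost over the whole word.
    """
    chosen = []
    for label, target_f0 in targets:
        accent, f0 = min(
            candidates[label],
            key=lambda c: modification_cost(target_f0, c[1]),
        )
        chosen.append((label, accent, f0))
    return chosen


# Hypothetical database: each VCV chain recorded in both accent types.
candidates = {
    "aka": [("type-0", 120.0), ("type-1", 160.0)],
    "asa": [("type-0", 118.0), ("type-1", 95.0)],
}
# Hypothetical target pitch (Hz) for each VCV slot of the word to synthesize.
targets = [("aka", 150.0), ("asa", 115.0)]

selected = select_segments(targets, candidates)
for label, accent, f0 in selected:
    print(label, accent, f0)
```

Having both accent types available for the same phonetic chain is what gives the selection step a choice: a slot with a high target pitch can draw on the high-pitched variant and a slot with a low target on the low-pitched one, which keeps the required modification small.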
Cite as: Mochizuki, R., Arai, Y., Honda, T. (1998) A study on the natural-sounding Japanese phonetic word synthesis by using the VCV-balanced word database that consists of the words uttered forcibly in two types of pitch accent. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0247, doi: 10.21437/ICSLP.1998-38
@inproceedings{mochizuki98_icslp,
  author={Ryo Mochizuki and Yasuhiko Arai and Takashi Honda},
  title={{A study on the natural-sounding Japanese phonetic word synthesis by using the VCV-balanced word database that consists of the words uttered forcibly in two types of pitch accent}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0247},
  doi={10.21437/ICSLP.1998-38}
}