ISCA Archive Interspeech 2013

HMM-based TTS for Hanoi Vietnamese: issues in design and evaluation

Thi Thu Trang Nguyen, Christophe D'Alessandro, Albert Rilliard, Do Dat Tran

This paper presents the development and evaluation of an HMM-based TTS system for the modern Hanoi dialect of Northern Vietnamese, a tonal language. Specific phonetic and prosodic features of Hanoi Vietnamese are studied, and their implications for the design of an HMM-based TTS system are derived. Using this knowledge, a TTS system called VTed is then developed on the Mary TTS platform. The second part of the paper is devoted to perceptual evaluations of Vietnamese speech synthesis. Three kinds of evaluation are considered necessary for assessing quality in this tonal language: a general MOS assessment, an utterance-level intelligibility test, and a tone-level intelligibility test. All three are conducted on the VTed system under a "natural speech reference" condition. The results show a 1.21-point difference between natural and synthetic speech for the MOS test, a 0.2% to 0.9% difference for the utterance-level intelligibility test, and, for the tone-level intelligibility test, a difference of 23% on average, ranging from 0% to 37% depending on the tone type. These results demonstrate the need for more specific work at the tonal/prosodic level to improve automatic synthesis of Vietnamese and other tonal languages.


doi: 10.21437/Interspeech.2013-541

Cite as: Nguyen, T.T.T., D'Alessandro, C., Rilliard, A., Tran, D.D. (2013) HMM-based TTS for hanoi vietnamese: issues in design and evaluation. Proc. Interspeech 2013, 2311-2315, doi: 10.21437/Interspeech.2013-541

@inproceedings{nguyen13_interspeech,
  author={Thi Thu Trang Nguyen and Christophe D'Alessandro and Albert Rilliard and Do Dat Tran},
  title={{HMM-based TTS for hanoi vietnamese: issues in design and evaluation}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={2311--2315},
  doi={10.21437/Interspeech.2013-541}
}