ISCA Archive Interspeech 2009

Speaker adaptation using a parallel phone set pronunciation dictionary for Thai-English bilingual TTS

Anocha Rugchatjaroen, Nattanun Thatphithakkul, Ananlada Chotimongkol, Ausdang Thangthai, Chai Wutiwiwatchai

This paper develops a bilingual Thai-English TTS system from two monolingual HMM-based TTS systems. An English Nagoya HMM-based TTS system (HTS) provides correct pronunciations of English words, but its voice differs from the voice of the Thai HTS system. We apply the CSMAPLR adaptation technique to make the English voice sound more similar to the Thai voice. To overcome the phone mapping problem that normally occurs with a pair of languages that have dissimilar phone sets, we use cross-language pronunciation mapping through a parallel phone set pronunciation dictionary. The results of a subjective listening test show that English words synthesized by the proposed system are more intelligible (with a 0.61 higher MOS) than those synthesized by the existing bilingual Thai-English TTS system. Moreover, with the proposed adaptation method, the synthesized English words sound more similar to the synthesized Thai words.
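As a rough illustration of the idea of a parallel phone set pronunciation dictionary, the sketch below stores, for each word, one pronunciation per phone set, so that a word sequence can be transcribed in either inventory by word-level lookup rather than by phone-to-phone mapping. This is only one plausible reading of the abstract, not the authors' implementation: the names `ParallelEntry`, `PARALLEL_DICT`, and `transcribe`, the two entries, and all phone symbols are hypothetical placeholders.

```python
from typing import Dict, List, NamedTuple


class ParallelEntry(NamedTuple):
    """One word with a pronunciation in each of the two phone sets."""
    english_phones: List[str]
    thai_phones: List[str]


# Hypothetical entries. The phone symbols are illustrative placeholders,
# not the phone inventories used in the paper.
PARALLEL_DICT: Dict[str, ParallelEntry] = {
    "bus": ParallelEntry(["b", "ah", "s"], ["b", "a", "t"]),
    "taxi": ParallelEntry(["t", "ae", "k", "s", "iy"], ["th", "x", "k", "s", "ii"]),
}


def transcribe(words: List[str], phone_set: str) -> List[str]:
    """Return the phone sequence for `words` in the requested phone set.

    Raises ValueError for an unknown phone set and KeyError for a word
    that is missing from the dictionary.
    """
    if phone_set not in ("english", "thai"):
        raise ValueError(f"unknown phone set: {phone_set!r}")
    phones: List[str] = []
    for word in words:
        entry = PARALLEL_DICT[word.lower()]
        if phone_set == "english":
            phones.extend(entry.english_phones)
        else:
            phones.extend(entry.thai_phones)
    return phones
```

Under these assumptions, calling `transcribe(["bus", "taxi"], "english")` and `transcribe(["bus", "taxi"], "thai")` on the same word list would give two phone-level labelings of one utterance, one per inventory, which is the kind of pairing that adaptation across dissimilar phone sets would need.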


doi: 10.21437/Interspeech.2009-152

Cite as: Rugchatjaroen, A., Thatphithakkul, N., Chotimongkol, A., Thangthai, A., Wutiwiwatchai, C. (2009) Speaker adaptation using a parallel phone set pronunciation dictionary for Thai-English bilingual TTS. Proc. Interspeech 2009, 1795-1798, doi: 10.21437/Interspeech.2009-152

@inproceedings{rugchatjaroen09_interspeech,
  author={Anocha Rugchatjaroen and Nattanun Thatphithakkul and Ananlada Chotimongkol and Ausdang Thangthai and Chai Wutiwiwatchai},
  title={{Speaker adaptation using a parallel phone set pronunciation dictionary for Thai-English bilingual TTS}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1795--1798},
  doi={10.21437/Interspeech.2009-152}
}