ISCA Archive ISCSLP 2004

An Acoustic and Articulatory Knowledge Integrated Method for Improving Synthetic Mandarin Speech's Fluency

HungYan Gu, KuoHsian Wang

In synthetic Mandarin speech, discontinuity of formant traces at syllable boundaries is a key factor that lowers the perceived fluency. We therefore study a method that integrates acoustic and articulatory knowledge to solve this discontinuity problem. First, representative trisyllable contexts are selected and their signals are recorded. The middle syllable’s signal is then extracted from each trisyllable pronunciation to make a synthesis unit. To select one synthesis unit among multiple candidates, a distance function is defined that measures the spectral similarity between two synthesis units to be concatenated. In addition, several linking-restriction rules are derived from articulatory knowledge to prevent certain synthesis units from being linked into a sequence. A globally best synthesis-unit sequence is then found with a dynamic-programming-based search algorithm. When this method is applied, the formant traces at syllable boundaries become smoother. A subjective evaluation also shows that the fluency of synthetic Mandarin speech is indeed improved considerably.
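
The search step can be illustrated with a minimal sketch. The abstract does not give the actual distance function or the linking-restriction rules, so everything below is an assumption made for illustration: the function name `select_units`, the `distance` and `allowed` callbacks, and the toy representation of a unit as a pair of (start, end) boundary features. The sketch only shows the general shape of a dynamic programming search over candidate units; it is not the authors' implementation.

```python
import math


def select_units(candidates, distance, allowed=lambda prev, cur: True):
    """Pick one candidate unit per syllable with minimum total join cost.

    candidates: list (one entry per syllable) of lists of candidate units.
    distance:   hypothetical callback, cost of concatenating prev -> cur.
    allowed:    hypothetical callback standing in for linking-restriction
                rules; returns False when prev must not be linked to cur.
    Returns (list of chosen candidate indices, total cost), or
    (None, math.inf) when the restrictions rule out every sequence.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every syllable needs at least one candidate")

    # cost[j]: best total cost of any sequence ending at candidate j
    # of the syllable processed so far.
    cost = [0.0] * len(candidates[0])
    back = []  # back[i-1][j]: best predecessor index for candidate j of syllable i

    for i in range(1, len(candidates)):
        prev_units, cur_units = candidates[i - 1], candidates[i]
        new_cost, pointers = [], []
        for cur in cur_units:
            best, best_k = math.inf, None
            for k, prev in enumerate(prev_units):
                # Skip unreachable predecessors and forbidden links.
                if cost[k] == math.inf or not allowed(prev, cur):
                    continue
                total = cost[k] + distance(prev, cur)
                if total < best:
                    best, best_k = total, k
            new_cost.append(best)
            pointers.append(best_k)
        cost = new_cost
        back.append(pointers)

    end = min(range(len(cost)), key=cost.__getitem__)
    if cost[end] == math.inf:
        return None, math.inf

    # Trace the back-pointers from the last syllable to the first.
    path = [end]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, cost[end]
```

As a usage sketch, each toy unit could be a `(start, end)` pair of boundary values with `distance = lambda p, u: abs(p[1] - u[0])`, and `allowed` could return `False` for any pair that a rule forbids. Because the restrictions are applied inside the search rather than as a filter afterwards, a forbidden link simply removes one transition and the globally best remaining sequence is still found.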


Cite as: Gu, H., Wang, K. (2004) An Acoustic and Articulatory Knowledge Integrated Method for Improving Synthetic Mandarin Speech's Fluency. Proc. International Symposium on Chinese Spoken Language Processing, 205-208

@inproceedings{gu04c_iscslp,
  author={HungYan Gu and KuoHsian Wang},
  title={{An Acoustic and Articulatory Knowledge Integrated Method for Improving Synthetic Mandarin Speech's Fluency}},
  year=2004,
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},
  pages={205--208}
}