ISCA Archive ICSLP 2000

Multistage coarticulation model combining articulatory, formant and cepstral features

Yuqing Gao, Raimo Bakis, Jing Huang, Bing Xiang

We describe a multi-stage speech production model containing a linear, phoneme-independent coarticulation filter, followed by a nonlinear component. The latter generates two cepstra which are then additively combined: one corresponding to a relatively smooth background spectrum, and the other representing three formant-like spectral peaks. A neural net is used for both parts, but the second part also utilizes a hard-coded function that generates exactly three spectral peaks. A unified model of training, adaptation, and decoding is developed, each operation differing only with respect to prior probability distributions. Prior probabilities can be introduced at each stage of the model, providing a flexible framework for utilizing both specific and general prior knowledge. We demonstrate the use of this model for speech synthesis as well as recognition.
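The data flow described in the abstract can be pictured with a small sketch. It is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the filter shape, network sizes, or the form of the three-peak function, so the smoothing kernel, the single tanh layers, the Gaussian-shaped peaks, the DCT-based cepstrum, and every name and constant below (`coarticulation_filter`, `placeholder_weights`, `N_CEP`, `N_BINS`, and so on) are hypothetical stand-ins.

```python
import math

N_CEP = 8    # number of cepstral coefficients (assumed)
N_BINS = 32  # frequency bins used by the formant branch (assumed)

def coarticulation_filter(targets, kernel=(0.25, 0.5, 0.25)):
    """Stage 1: linear smoothing over time, same kernel for every phoneme."""
    half = len(kernel) // 2
    last = len(targets) - 1
    smoothed = []
    for t in range(len(targets)):
        frame = [0.0] * len(targets[0])
        for k, w in enumerate(kernel):
            src = targets[min(max(t + k - half, 0), last)]  # clamp at the edges
            frame = [f + w * s for f, s in zip(frame, src)]
        smoothed.append(frame)
    return smoothed

def placeholder_weights(n_out, n_in, seed):
    """Deterministic stand-in for trained network weights (not learned values)."""
    return [[0.5 * math.sin(seed + 3 * i + 7 * j) for j in range(n_in)]
            for i in range(n_out)]

def dense_tanh(x, weights):
    """One tanh layer, standing in for the neural net in each branch."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def three_peak_log_spectrum(centers, width=0.05):
    """Hard-coded function: exactly three bumps on a normalized 0..1 frequency axis.
    The Gaussian bump shape is an assumption made for illustration."""
    assert len(centers) == 3
    return [sum(math.exp(-0.5 * (((i + 0.5) / N_BINS - c) / width) ** 2)
                for c in centers)
            for i in range(N_BINS)]

def log_spectrum_to_cepstrum(log_spec):
    """Truncated DCT-II of a log spectrum, giving N_CEP cepstral coefficients."""
    n = len(log_spec)
    return [sum(v * math.cos(math.pi * q * (i + 0.5) / n)
                for i, v in enumerate(log_spec)) / n
            for q in range(N_CEP)]

def synthesize_frame(artic):
    """Stage 2: nonlinear component producing two cepstra that are added."""
    n_in = len(artic)
    # Branch A: smooth background cepstrum straight from a net.
    background = dense_tanh(artic, placeholder_weights(N_CEP, n_in, 1))
    # Branch B: a net proposes three peak positions, one per frequency band,
    # and the hard-coded function turns them into a three-peak spectrum.
    raw = dense_tanh(artic, placeholder_weights(3, n_in, 2))
    centers = [(band + 0.5 + 0.4 * r) / 3 for band, r in enumerate(raw)]
    formant = log_spectrum_to_cepstrum(three_peak_log_spectrum(centers))
    return [b + f for b, f in zip(background, formant)]

def synthesize(targets):
    return [synthesize_frame(a) for a in coarticulation_filter(targets)]

# Two hypothetical phonemes, three frames each, with 2-D articulatory targets.
targets = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
smoothed = coarticulation_filter(targets)
cepstra = synthesize(targets)
```

In this toy setup the filter blends the two targets only near the phoneme boundary, so the first two frames yield identical cepstra while frames on either side of the boundary interpolate between the two phonemes. The prior-based training, adaptation, and decoding mentioned in the abstract are not modeled here.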


doi: 10.21437/ICSLP.2000-7

Cite as: Gao, Y., Bakis, R., Huang, J., Xiang, B. (2000) Multistage coarticulation model combining articulatory, formant and cepstral features. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 25-28, doi: 10.21437/ICSLP.2000-7

@inproceedings{gao00_icslp,
  author={Yuqing Gao and Raimo Bakis and Jing Huang and Bing Xiang},
  title={{Multistage coarticulation model combining articulatory, formant and cepstral features}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 1, 25-28},
  doi={10.21437/ICSLP.2000-7}
}