ISCA Archive ICSLP 2000

An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces

Joe Frankel, Korin Richmond, Simon King, Paul Taylor

We describe a speech recognition system that uses articulatory parameters as its basic features and models them with phone-dependent linear dynamic models. The system first estimates articulatory trajectories from the speech signal. A recurrent neural network, trained on real articulatory data, produces estimates of the x and y coordinates of 7 actual articulator positions in the midsagittal plane every 2 milliseconds. The output of this network is then passed to a set of linear dynamic models, which perform phone recognition.
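
The second stage described above can be illustrated with a minimal sketch. The abstract does not give the form of the linear dynamic models, so this sketch assumes the standard linear-Gaussian state-space form (x_{t+1} = F x_t + w, y_t = H x_t + v), scores a segment of recovered articulatory frames under each phone's model with a Kalman filter, and picks the highest-scoring phone. The function names, parameter names, and the choice of scoring rule are hypothetical and are not taken from the paper. Each frame is assumed to hold 14 values (x and y for 7 articulator positions).

```python
import numpy as np


def ldm_log_likelihood(Y, F, H, Q, R, x0, P0):
    """Log-likelihood of frames Y (T x d) under one linear dynamic model.

    Assumed model (not specified in the abstract):
        x[t+1] = F x[t] + w,  w ~ N(0, Q)
        y[t]   = H x[t] + v,  v ~ N(0, R)
    with prior x[0] ~ N(x0, P0). Scoring uses Kalman filter innovations.
    """
    x, P = x0.copy(), P0.copy()
    d = Y.shape[1]
    total = 0.0
    for y in Y:
        # Innovation (prediction error) and its covariance.
        e = y - H @ x
        S = H @ P @ H.T + R
        _, logdet = np.linalg.slogdet(S)
        total += -0.5 * (d * np.log(2 * np.pi) + logdet + e @ np.linalg.solve(S, e))
        # Measurement update.
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ e
        P = P - K @ H @ P
        # Time update (predict the next frame's state).
        x = F @ x
        P = F @ P @ F.T + Q
    return total


def classify_segment(Y, models):
    """Return the phone label whose model gives Y the highest log-likelihood.

    `models` maps a phone label to a dict of the parameters
    F, H, Q, R, x0, P0 (hypothetical container, for illustration only).
    """
    scores = {phone: ldm_log_likelihood(Y, **params) for phone, params in models.items()}
    return max(scores, key=scores.get), scores
```

In this sketch, `Y` would be the network output for one phone-length segment, one row per 2 ms frame. How segment boundaries are found, and how the model parameters are trained, is left out here.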


Cite as: Frankel, J., Richmond, K., King, S., Taylor, P. (2000) An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, 254-257

@inproceedings{frankel00_icslp,
  author={Joe Frankel and Korin Richmond and Simon King and Paul Taylor},
  title={{An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 4, 254-257}
}