ISCA Archive Interspeech 2008

The entropy of the articulatory phonological code: recognizing gestures from tract variables

Xiaodan Zhuang, Hosung Nam, Mark Hasegawa-Johnson, Louis M. Goldstein, Elliot Saltzman

We propose an instantaneous "gestural pattern vector" to encode the instantaneous pattern of gesture activations across tract variables in the gestural score. The design of these gestural pattern vectors is the first step towards an automatic speech recognizer motivated by articulatory phonology, which is expected to be more invariant to speech coarticulation and reduction than conventional speech recognizers built with the sequence-of-phones assumption. We use a tandem model to recover the instantaneous gestural pattern vectors from tract variable time functions in local time windows, and achieve classification accuracy up to 84.5% for synthesized data from one speaker. Recognizing all gestural pattern vectors is equivalent to recognizing the ensemble of gestures. This result suggests that the proposed gestural pattern vector might be a viable unit in statistical models for speech recognition.
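The core idea of the gestural pattern vector can be pictured as follows. This is a hypothetical sketch, not the paper's implementation: the tract-variable names come from articulatory phonology (e.g. LA = lip aperture, TBCD = tongue body constriction degree), but the `pattern_vector` function, the interval-based gestural score representation, and all timing values below are invented for illustration.

```python
# Hypothetical sketch of a "gestural pattern vector": at a given time, record
# which tract variables have an active gesture in the gestural score.
# Tract-variable names follow articulatory phonology; everything else
# (function, score format, intervals) is made up for illustration.

TRACT_VARS = ["LP", "LA", "TBCL", "TBCD", "TTCL", "TTCD", "VEL", "GLO"]

def pattern_vector(score, t):
    """Binary activation pattern across tract variables at time t (seconds).

    `score` maps a tract variable name to a list of (start, end)
    activation intervals taken from the gestural score.
    """
    return tuple(
        int(any(start <= t < end for start, end in score.get(tv, [])))
        for tv in TRACT_VARS
    )

# Toy gestural score loosely resembling a /ba/-like utterance
# (all intervals invented).
score = {
    "LA":   [(0.00, 0.10)],  # bilabial closure gesture
    "GLO":  [(0.05, 0.30)],  # glottal gesture for voicing
    "TBCD": [(0.08, 0.30)],  # tongue-body gesture for the vowel
}

print(pattern_vector(score, 0.02))  # only LA active
print(pattern_vector(score, 0.09))  # LA, TBCD, and GLO overlap
```

Recognizing the sequence of such vectors over time would recover the full ensemble of gestures, which is the equivalence the abstract appeals to.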


doi: 10.21437/Interspeech.2008-428

Cite as: Zhuang, X., Nam, H., Hasegawa-Johnson, M., Goldstein, L.M., Saltzman, E. (2008) The entropy of the articulatory phonological code: recognizing gestures from tract variables. Proc. Interspeech 2008, 1489-1492, doi: 10.21437/Interspeech.2008-428

@inproceedings{zhuang08_interspeech,
  author={Xiaodan Zhuang and Hosung Nam and Mark Hasegawa-Johnson and Louis M. Goldstein and Elliot Saltzman},
  title={{The entropy of the articulatory phonological code: recognizing gestures from tract variables}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={1489--1492},
  doi={10.21437/Interspeech.2008-428}
}