11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Robust Word Recognition Using Articulatory Trajectories and Gestures

Vikramjit Mitra (1), Hosung Nam (2), Carol Espy-Wilson (1), Elliot Saltzman (2), Louis Goldstein (3)

(1) University of Maryland, USA
(2) Haskins Laboratories, USA
(3) University of Southern California, USA

Articulatory Phonology views speech as an ensemble of constricting events along the vocal tract. This study shows that articulatory information, in the form of gestures and their output trajectories, can improve the performance of automatic speech recognition systems. Because no natural speech database contains such articulatory information, we use a synthetic speech dataset that provides gestures and their output trajectories. We propose neural-network-based models that estimate articulatory information from the speech signal and show that this estimated information improves the noise robustness of a word recognition system.

Bibliographic reference. Mitra, Vikramjit / Nam, Hosung / Espy-Wilson, Carol / Saltzman, Elliot / Goldstein, Louis (2010): "Robust word recognition using articulatory trajectories and gestures", in INTERSPEECH-2010, 2038-2041.