Articulatory Phonology views speech as an ensemble of constricting events along the vocal tract. This study shows that articulatory information, in the form of gestures and their output trajectories, can help to improve the performance of automatic speech recognition systems. The lack of any natural speech database containing such articulatory information prompted us to use a synthetic speech dataset that contains gestures and their output trajectories. We propose neural-network-based models to estimate articulatory information from the speech signal and show that the estimated articulatory information helps to improve the noise robustness of a word recognition system.
Bibliographic reference. Mitra, Vikramjit / Nam, Hosung / Espy-Wilson, Carol / Saltzman, Elliot / Goldstein, Louis (2010): "Robust word recognition using articulatory trajectories and gestures", in INTERSPEECH-2010, pp. 2038-2041.