ISCA Archive SPASR 2013

The invariant property of gestures

Carol Espy-Wilson

Variability in speech, particularly as a consequence of production rate, remains a major challenge in developing automatic speech recognition (ASR) systems that perform well with minimal constraints. Articulatory Phonology provides a unified framework for understanding the acoustic consequences of changes in speech production due to gestural overlap and gestural reduction, which are often reported as assimilations, insertions, deletions, and substitutions. In this talk, I will discuss the development of our speech inversion system and its ability to extract vocal tract constriction variables, and hence gestures, from speech spoken at different speaking rates. We have conducted several studies showing that augmenting acoustic features with such articulatory information improves the robustness of ASR systems in noise. An additional goal is to provide a framework that seamlessly models speech variability due to coarticulation and lenition.

Cite as: Espy-Wilson, C. (2013) The invariant property of gestures. Proc. Speech Production in Automatic Speech Recognition (SPASR-2013)

  author={Carol Espy-Wilson},
  title={{The invariant property of gestures}},
  booktitle={Proc. Speech Production in Automatic Speech Recognition (SPASR-2013)}