ISCA Archive Interspeech 2015

Garbage modeling for on-device speech recognition

Christophe Van Gysel, Leonid Velikovich, Ian McGraw, Françoise Beaufays

User interactions with mobile devices increasingly depend on voice as a primary input modality. Due to the disadvantages of sending audio across potentially spotty network connections for speech recognition, in recent years there has been growing attention to performing recognition on-device. The limited computational resources, however, typically require additional model constraints. In this work, we explore the task of on-device utterance verification, wherein the recognizer must transcribe an utterance if it is in a target set or reject it as being out of domain. We present a data-driven methodology for mining tens of thousands of target phrases from an existing corpus. We then compare two common garbage-modeling approaches to utterance verification: a sub-word rejection model and a white-listed n-gram model. We examine a deficiency of the sub-word modeling approach and introduce a novel modification that makes use of common prefixes between targeted phrases and non-targeted phrases. We show good performance in the trade-off between recall and word error rate using both the prefix and white-listed n-gram approaches. Finally, we evaluate the prefix-based approach in a hybrid setting where rejected instances are sent to a server-side recognizer.


doi: 10.21437/Interspeech.2015-480

Cite as: Van Gysel, C., Velikovich, L., McGraw, I., Beaufays, F. (2015) Garbage modeling for on-device speech recognition. Proc. Interspeech 2015, 2127-2131, doi: 10.21437/Interspeech.2015-480

@inproceedings{gysel15_interspeech,
  author={Christophe Van Gysel and Leonid Velikovich and Ian McGraw and Françoise Beaufays},
  title={{Garbage modeling for on-device speech recognition}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={2127--2131},
  doi={10.21437/Interspeech.2015-480}
}