16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Garbage Modeling for On-Device Speech Recognition

Christophe Van Gysel (1), Leonid Velikovich (2), Ian McGraw (2), Françoise Beaufays (2)

(1) University of Amsterdam, The Netherlands
(2) Google, USA

User interactions with mobile devices increasingly depend on voice as a primary input modality. Due to the disadvantages of sending audio across potentially spotty network connections for speech recognition, in recent years there has been growing attention to performing recognition on-device. The limited computational resources, however, typically require additional model constraints. In this work, we explore the task of on-device utterance verification, wherein the recognizer must transcribe an utterance if it is in a target set or reject it as being out of domain. We present a data-driven methodology for mining tens of thousands of target phrases from an existing corpus. We then compare two common garbage-modeling approaches to utterance verification: a sub-word rejection model and a white-listed n-gram model. We examine a deficiency of the sub-word modeling approach and introduce a novel modification that makes use of common prefixes between targeted phrases and non-targeted phrases. We show good performance in the trade-off between recall and word error rate using both the prefix and white-listed n-gram approaches. Finally, we evaluate the prefix-based approach in a hybrid setting where rejected instances are sent to a server-side recognizer.
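The accept-or-reject decision and the hybrid fallback described above can be illustrated with a small text-level sketch. This is not the paper's decoder: the actual system operates on audio with garbage models inside the recognizer, whereas this toy example works on already-transcribed word strings. The names `PhraseTrie`, `route`, and `server_recognize` are hypothetical and introduced only for illustration. The sketch shows how a word-level trie over the target phrases exposes both exact matches and the length of the prefix an utterance shares with the target set.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class PhraseTrie:
    """Word-level trie over target phrases (illustrative, text-only)."""

    def __init__(self, phrases: Iterable[str]) -> None:
        self.root: Dict = {}
        for phrase in phrases:
            node = self.root
            for word in phrase.lower().split():
                node = node.setdefault(word, {})
            node[None] = True  # None marks the end of a complete target phrase

    def match(self, words: List[str]) -> Tuple[int, bool]:
        """Return (length of longest prefix shared with a target, exact match?)."""
        node = self.root
        depth = 0
        for word in words:
            if word not in node:
                return depth, False
            node = node[word]
            depth += 1
        return depth, None in node


def route(text: str, trie: PhraseTrie,
          server_recognize: Callable[[str], str]) -> Tuple[str, str]:
    """Accept in-domain utterances locally; hand everything else to the server.

    `server_recognize` is a hypothetical stand-in for a server-side recognizer;
    a real system would send audio rather than text.
    """
    words = text.lower().split()
    _, exact = trie.match(words)
    if exact:
        return "device", " ".join(words)
    return "server", server_recognize(text)


if __name__ == "__main__":
    trie = PhraseTrie(["call mom", "call mom on speaker", "set an alarm"])
    print(trie.match("call mom at work".split()))  # shares the prefix "call mom"
    print(route("Call Mom", trie, lambda t: "<server result>"))
    print(route("play music", trie, lambda t: "<server result>"))
```

In this sketch an utterance such as "call mom at work" shares a two-word prefix with a target phrase but is still rejected, which is the kind of target/non-target prefix overlap the abstract refers to.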

Bibliographic reference. Van Gysel, Christophe / Velikovich, Leonid / McGraw, Ian / Beaufays, Françoise (2015): "Garbage modeling for on-device speech recognition", in INTERSPEECH-2015, 2127-2131.