9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Structured Models for Joint Decoding of Repeated Utterances

Geoffrey Zweig, Dan Bohus, Xiao Li, Patrick Nguyen

Microsoft Research, USA

Because of speech recognition errors, repetition can be frequent in voice-search applications. While a proper treatment of this phenomenon requires jointly modeling two or more utterances, currently deployed systems typically treat each utterance independently. In this paper, we analyze the structure of repetitions and find that, in at least one commercial directory assistance application, repetitions follow simple structural transformations more than 70% of the time. We present preliminary results suggesting that significant gains are possible by explicitly modeling this structure in a joint decoding process.

