The identification of keyword queries in speech data from low-resource languages poses a challenge for current methods, as speech recognition algorithms lack sufficient training data to produce high-accuracy transcripts. To compensate for these shortcomings, we extract signals from the data that are useful in keyword identification but are not being used by the speech recognizer. These signals take multiple forms: word burstiness, rescored confusion network posteriors, and acoustic/prosodic qualities. The first of these denotes the tendency for keywords to occur in bursts within a conversational topic. We employ three different strategies to exploit this information: 1) a four-way classification of keyword hypotheses that targets low-scoring correct hits and high-scoring false alarms, 2) ranking algorithms, and 3) a direct adjustment of keyword hit scores based on hypothesized repetition. We find that interpolating the results of these three strategies in an ensemble provides a reliable way to improve the results of keyword search.
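To make the third strategy and the final interpolation step concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the update rule, the time window, or the ensemble weights, so the `Hit` record, the `window` and `boost` parameters, the additive boost formula, and the linear weighting are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    keyword: str
    time: float   # hypothesized start time, in seconds
    score: float  # posterior-like score in [0, 1]


def burst_adjust(hits, window=60.0, boost=0.1):
    """Raise a hit's score for each other hypothesis of the same keyword
    within `window` seconds (assumed rule; the score is capped at 1.0)."""
    adjusted = []
    for h in hits:
        neighbours = sum(
            1
            for o in hits
            if o is not h
            and o.keyword == h.keyword
            and abs(o.time - h.time) <= window
        )
        # Move the score toward 1.0 in proportion to the repetition count.
        new_score = min(1.0, h.score + boost * neighbours * (1.0 - h.score))
        adjusted.append(Hit(h.keyword, h.time, new_score))
    return adjusted


def interpolate(score_lists, weights):
    """Weighted linear interpolation of per-hit scores from several
    strategies; each inner list holds one strategy's scores for the same hits."""
    total = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, scores)) / total
        for scores in zip(*score_lists)
    ]
```

In this sketch, a low-scoring hit that has a nearby repetition of the same keyword is nudged upward, while an isolated hit keeps its original score. The interpolation weights would in practice be tuned on held-out data.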
Bibliographic reference. Ma, Min / Richards, Justin / Soto, Victor / Hirschberg, Julia / Rosenberg, Andrew (2014): "Strategies for rescoring keyword search results using word-burst and acoustic features", In INTERSPEECH-2014, 2769-2773.