In this work we propose an acoustic-similarity-based technique to improve the recognition of in-grammar utterances in typical directed-dialog applications where the Automatic Speech Recognition (ASR) system consists of one or more class grammars embedded in the Language Model (LM). The proposed technique increases the transition cost of each LM path by a value proportional to the average acoustic similarity between that path and all the in-grammar utterances. The proposed modifications improve the in-grammar concept recognition rate by 0.5% absolute at lower grammar fanouts and by about 2% at higher fanouts, compared with a technique that reduces the probability of entering all the LM paths by a uniform value. The improvements become more pronounced as the fanout size of the grammar increases, especially at operating points corresponding to lower False Accept (FA) values.
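The penalty scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not say how acoustic similarity is measured, so the sketch substitutes a normalized phone-level edit distance as a stand-in. The function names, the `scale` parameter, and the toy phone strings are all hypothetical and are not taken from the paper. Costs are assumed to be additive (for example, negative log probabilities), so a larger penalty makes a path harder to enter.

```python
from typing import Dict, List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            sub = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + sub))
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Stand-in similarity in [0, 1] (assumption, not the paper's measure)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def similarity_penalties(
    lm_paths: Dict[str, List[str]],
    grammar_entries: List[List[str]],
    scale: float,
) -> Dict[str, float]:
    """Cost increase per LM path: scale * average similarity to the grammar."""
    penalties = {}
    for name, phones in lm_paths.items():
        total = sum(similarity(phones, entry) for entry in grammar_entries)
        penalties[name] = scale * total / len(grammar_entries)
    return penalties


def uniform_penalties(lm_paths: Dict[str, List[str]], penalty: float) -> Dict[str, float]:
    """Baseline from the abstract: the same cost increase for every LM path."""
    return {name: penalty for name in lm_paths}


# Toy data (hypothetical): two in-grammar city names and two competing LM paths.
GRAMMAR = [
    ["b", "ao", "s", "t", "ah", "n"],   # "boston"
    ["ao", "s", "t", "ah", "n"],        # "austin"
]
LM_PATHS = {
    "lost in": ["l", "ao", "s", "t", "ih", "n"],
    "hello": ["hh", "ah", "l", "ow"],
}

if __name__ == "__main__":
    print(similarity_penalties(LM_PATHS, GRAMMAR, scale=3.0))
    print(uniform_penalties(LM_PATHS, penalty=1.0))
```

In this toy setup the confusable path "lost in" receives a larger cost increase than "hello", whereas the uniform baseline penalizes both equally. How the penalty is actually applied inside the decoder, and how `scale` would be tuned against the False Accept rate, is outside what the abstract specifies.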
Bibliographic reference. Deshmukh, Om D. / Ikbal, Shajith / Verma, Ashish / Marcheret, Etienne (2011): "Acoustic-similarity based technique to improve concept recognition", In INTERSPEECH-2011, 1013-1016.