Using context in automatic speech recognition allows the recognition
system to adapt dynamically to the task at hand, bringing gains across a broad
variety of use cases. An important mechanism for context inclusion is on-the-fly
rescoring of hypotheses with contextual language model content that is available
only in real time.
In systems where rescoring occurs on the lattice during its construction
as part of beam search decoding, hypotheses eligible for rescoring
may be missed due to pruning. This can happen for several reasons: the
base language model and the rescoring model may assign significantly different
scores, the utterance may be noisy, or word prefixes
with a high out-degree may necessitate aggressive pruning to keep the
search tractable. This results in misrecognitions when contextually relevant
hypotheses are pruned before rescoring, even if a contextual rescoring
model favors those hypotheses by a large margin.
We present a technique
to adapt the beam search algorithm to preserve hypotheses when they
may benefit from rescoring. We show that this technique significantly
reduces the number of search pruning errors on rescorable hypotheses,
without a significant increase in the search space size. This technique
makes it feasible to use one base language model, but still achieve
high-accuracy speech recognition results in all contexts.
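The core idea, as described above, is to relax pruning for hypotheses that could later benefit from contextual rescoring. The sketch below is a hypothetical illustration of that idea, not the paper's actual implementation: it widens the pruning beam by an extra margin for hypotheses whose word sequence is a prefix of some contextual phrase. All names, the score convention (negative log-probabilities, lower is better), and the `rescore_margin` parameter are assumptions introduced for this example.

```python
def build_prefix_set(context_phrases):
    """Collect every word prefix of the contextual phrases, e.g.
    'call john smith' -> ('call',), ('call', 'john'), ('call', 'john', 'smith')."""
    prefixes = set()
    for phrase in context_phrases:
        words = tuple(phrase.split())
        for i in range(1, len(words) + 1):
            prefixes.add(words[:i])
    return prefixes


def prune(hypotheses, beam_width, rescorable_prefixes, rescore_margin=5.0):
    """Rescoring-aware pruning sketch.

    Keep hypotheses scoring within `beam_width` of the best hypothesis, as in
    standard beam search.  Additionally, keep hypotheses that match a
    contextual prefix if they fall within the wider beam
    `beam_width + rescore_margin`, so the rescoring model still gets a chance
    to promote them.  Scores are negative log-probabilities (lower is better).
    """
    best = min(score for _, score in hypotheses)
    kept = []
    for words, score in hypotheses:
        in_beam = score <= best + beam_width
        rescorable = tuple(words) in rescorable_prefixes
        if in_beam or (rescorable and score <= best + beam_width + rescore_margin):
            kept.append((words, score))
    return kept


# Illustrative use: with a standard beam of 4.0, the hypothesis
# ('call', 'john') at cost 16.0 would be pruned (best is 10.0), but it
# matches a contextual prefix and survives under the widened beam, while
# the non-rescorable ('hall', 'john') is still dropped.
prefixes = build_prefix_set(["call john smith"])
hyps = [(("call", "jon"), 10.0),
        (("call", "john"), 16.0),
        (("hall", "john"), 17.0)]
kept = prune(hyps, beam_width=4.0, rescorable_prefixes=prefixes)
```

A real decoder would apply this test per frame inside lattice construction rather than on complete word sequences, but the pruning decision has the same shape: the beam is widened only for the small set of hypotheses the contextual model could plausibly favor, which is why the search space grows only modestly.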
Cite as: Williams, I., Aleksic, P. (2017) Rescoring-Aware Beam Search for Reduced Search Errors in Contextual Automatic Speech Recognition. Proc. Interspeech 2017, 508-512, doi: 10.21437/Interspeech.2017-1671
@inproceedings{williams17_interspeech,
  author={Ian Williams and Petar Aleksic},
  title={{Rescoring-Aware Beam Search for Reduced Search Errors in Contextual Automatic Speech Recognition}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={508--512},
  doi={10.21437/Interspeech.2017-1671}
}