ISCA Archive Interspeech 2015

Real-time integration of dynamic context information for improving automatic speech recognition

Youssef Oualil, Marc Schulder, Hartmut Helmke, Anna Schmidt, Dietrich Klakow

The use of prior situational/contextual knowledge about a given task can significantly improve Automatic Speech Recognition (ASR) performance. This is typically done through adaptation of acoustic or language models if data is available, or through knowledge-based rescoring. The main adaptation techniques, however, are either domain-specific, which makes them inadequate for other tasks, or static and offline, and therefore cannot deal with dynamic knowledge. To circumvent this problem, we propose a real-time system which dynamically integrates situational context into ASR. The context integration is done either post-recognition or pre-recognition. In the post-recognition case, we propose a weighted Levenshtein distance between the ASR hypotheses and the context information, based on the ASR confidence scores, to extract the most likely sequence of spoken words. In the pre-recognition case, the search space is adjusted to the new situational knowledge through adaptation of the finite state machine modeling the spoken language. Experiments conducted on 3 hours of Air Traffic Control (ATC) data achieved a reduction of the Command Error Rate (CmdER), the evaluation metric used in the ATC domain, by a factor of 4 compared to using no contextual knowledge.
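
To make the post-recognition idea concrete, the following is a minimal sketch of a confidence-weighted Levenshtein distance. It is not the paper's formulation, which the abstract does not specify. The weighting scheme is an assumption: substituting or deleting a hypothesis word costs that word's ASR confidence, so confident words are expensive to change, and inserting a context word costs a fixed `insert_cost`. The function names, the example word sequences, and the confidence values are all hypothetical.

```python
from typing import List, Sequence


def weighted_levenshtein(
    hyp: Sequence[str],
    conf: Sequence[float],
    ref: Sequence[str],
    insert_cost: float = 1.0,
) -> float:
    """Cost of editing the ASR hypothesis `hyp` into the context sequence `ref`.

    Assumed weighting (not taken from the paper): changing or dropping a
    hypothesis word costs its confidence; adding a missing word costs
    `insert_cost`.
    """
    if len(hyp) != len(conf):
        raise ValueError("hyp and conf must have the same length")
    n, m = len(hyp), len(ref)
    # dist[i][j] = minimal cost of turning hyp[:i] into ref[:j]
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + conf[i - 1]  # delete every hypothesis word
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + insert_cost  # insert every context word
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = 0.0 if hyp[i - 1] == ref[j - 1] else conf[i - 1]
            dist[i][j] = min(
                dist[i - 1][j] + conf[i - 1],      # delete hyp[i-1]
                dist[i][j - 1] + insert_cost,      # insert ref[j-1]
                dist[i - 1][j - 1] + substitution,  # match or substitute
            )
    return dist[n][m]


def best_context_match(
    hyp: Sequence[str],
    conf: Sequence[float],
    candidates: Sequence[Sequence[str]],
) -> List[str]:
    """Return the context candidate closest to the hypothesis."""
    return list(min(candidates, key=lambda c: weighted_levenshtein(hyp, conf, c)))


# Hypothetical example: the recognizer is unsure about the third word.
hyp = ["lufthansa", "three", "two", "bravo", "descend"]
conf = [0.9, 0.8, 0.3, 0.9, 0.9]
candidates = [
    ["lufthansa", "three", "nine", "bravo", "descend"],
    ["lufthansa", "four", "two", "bravo", "descend"],
]
best = best_context_match(hyp, conf, candidates)
```

Under these assumptions, the first candidate wins because it only requires replacing the low-confidence word "two", whereas the second requires replacing the high-confidence word "three".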


doi: 10.21437/Interspeech.2015-476

Cite as: Oualil, Y., Schulder, M., Helmke, H., Schmidt, A., Klakow, D. (2015) Real-time integration of dynamic context information for improving automatic speech recognition. Proc. Interspeech 2015, 2107-2111, doi: 10.21437/Interspeech.2015-476

@inproceedings{oualil15_interspeech,
  author={Youssef Oualil and Marc Schulder and Hartmut Helmke and Anna Schmidt and Dietrich Klakow},
  title={{Real-time integration of dynamic context information for improving automatic speech recognition}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={2107--2111},
  doi={10.21437/Interspeech.2015-476}
}