ISCA Archive Interspeech 2021

Text Augmentation for Language Models in High Error Recognition Scenario

Karel Beneš, Lukáš Burget

In this paper, we explore several data augmentation strategies for training language models for speech recognition. We compare augmentation based on global error statistics with one based on unigram statistics of ASR errors, and with label smoothing and its sampled variant. Additionally, we investigate the stability and the predictive power of perplexity estimated on augmented data. Despite being trivial, augmentation driven by global substitution, deletion, and insertion rates achieves the best rescoring results. On the other hand, even though the associated perplexity measure is stable, it gives no better prediction of the final error rate than the vanilla perplexity. Our best augmentation scheme increases the WER improvement from second-pass rescoring from 1.1% to 1.9% absolute on the CHiME-6 challenge.
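To illustrate the idea of augmentation driven by global substitution, deletion, and insertion rates, here is a minimal Python sketch. It is not the authors' implementation: the function name `augment`, the default rates, and the choice to draw substituted and inserted words uniformly from a vocabulary list are all assumptions made for illustration, and the paper may define these details differently.

```python
import random


def augment(tokens, vocab, p_sub=0.1, p_del=0.05, p_ins=0.05, rng=None):
    """Corrupt a token sequence with global error rates (illustrative only).

    Assumptions, not taken from the paper: replacement and inserted words
    are sampled uniformly from `vocab`, and at most one word is inserted
    after each kept or substituted token.
    """
    rng = rng or random.Random()
    out = []
    for tok in tokens:
        r = rng.random()
        if r < p_del:
            # Deletion: drop the token entirely.
            continue
        if r < p_del + p_sub:
            # Substitution: replace the token with a random vocabulary word.
            out.append(rng.choice(vocab))
        else:
            # No error: keep the original token.
            out.append(tok)
        if rng.random() < p_ins:
            # Insertion: add a random vocabulary word after the token.
            out.append(rng.choice(vocab))
    return out


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    vocabulary = ["a", "dog", "ran", "under", "table"]
    print(augment(sentence, vocabulary, rng=random.Random(0)))
```

Applying such a function to each training sentence, possibly with fresh random draws every epoch, would expose the language model to text resembling noisy ASR output. The rates themselves would be estimated from recognizer errors rather than fixed by hand as they are here.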


doi: 10.21437/Interspeech.2021-627

Cite as: Beneš, K., Burget, L. (2021) Text Augmentation for Language Models in High Error Recognition Scenario. Proc. Interspeech 2021, 1872-1876, doi: 10.21437/Interspeech.2021-627

@inproceedings{benes21_interspeech,
  author={Karel Beneš and Lukáš Burget},
  title={{Text Augmentation for Language Models in High Error Recognition Scenario}},
  year=2021,
  booktitle={Proc. Interspeech 2021},
  pages={1872--1876},
  doi={10.21437/Interspeech.2021-627}
}