Improving the Automatic Speech Recognition through the improvement of Language Models

Andrés Piñeiro-Martín, Carmen García-Mateo, Laura Docío-Fernández


Language models are one of the pillars on which the performance of automatic speech recognition systems rests. Statistical language models based on the probability of word sequences (n-grams) are the most widely used, although deep neural networks are beginning to be applied, made possible by increased computing power and algorithmic improvements. This paper studies the impact of language models on recognition results in two situations: 1) when they are adapted to the working environment of the final application, and 2) when their complexity grows, either by increasing the order of the n-gram models or by applying deep neural networks. Specifically, an automatic speech recognition system was run with the different language models on audio recordings from three experimental frameworks: formal orality, speech in newscasts, and TED talks in Galician. The experimental results show that improving the quality of the language models improves recognition performance.


DOI: 10.21437/IberSPEECH.2018-8

Cite as: Piñeiro-Martín, A., García-Mateo, C., Docío-Fernández, L. (2018) Improving the Automatic Speech Recognition through the improvement of Language Models. Proc. IberSPEECH 2018, 35-39, DOI: 10.21437/IberSPEECH.2018-8.


@inproceedings{Piñeiro-Martín2018,
  author={Andrés Piñeiro-Martín and Carmen García-Mateo and Laura Docío-Fernández},
  title={{Improving the Automatic Speech Recognition through the improvement of Language Models}},
  year=2018,
  booktitle={Proc. IberSPEECH 2018},
  pages={35--39},
  doi={10.21437/IberSPEECH.2018-8},
  url={http://dx.doi.org/10.21437/IberSPEECH.2018-8}
}