2016 BUT Babel System: Multilingual BLSTM Acoustic Model with i-Vector Based Adaptation

Martin Karafiát, Murali Karthick Baskar, Pavel Matějka, Karel Veselý, František Grézl, Lukáš Burget, Jan Černocký


The paper provides an analysis of the BUT automatic speech recognition (ASR) systems built for the 2016 IARPA Babel evaluation. The IARPA Babel program concentrates on building ASR systems for many low-resource languages, where only a limited amount of transcribed speech is available for each language. In such a scenario, we found it essential to train the ASR systems in a multilingual fashion. In this work, we report superior results obtained with pre-trained multilingual BLSTM acoustic models, where we used multi-task training with a separate classification layer for each language. The results reported on three Babel Year 4 languages show WER reductions of over 3% absolute obtained from such multilingual pre-training. Experiments with different input features show that the multilingual BLSTM performs best with simple log-Mel filter-bank outputs, which makes our previously successful multilingual stack bottleneck features with CMLLR adaptation obsolete. Finally, we experiment with different configurations of i-vector based speaker adaptation in the mono- and multi-lingual BLSTM architectures. This results in additional WER reductions of over 1% absolute.
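
The two ideas named in the abstract, a shared acoustic model with a separate classification layer per language and i-vector based speaker adaptation, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. A single tanh layer stands in for the BLSTM stack. Appending the speaker i-vector to every input frame is only one possible adaptation configuration, chosen here for illustration. All dimensions, language names, and function names (`append_ivector`, `forward`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are assumptions made for illustration only.
FBANK_DIM = 40    # log-Mel filter-bank coefficients per frame
IVEC_DIM = 100    # speaker i-vector dimension
HIDDEN_DIM = 64   # width of the shared layer
# Hypothetical number of output targets for each language.
TARGETS = {"lang_a": 50, "lang_b": 70, "lang_c": 60}


def init(shape):
    """Small random initial weights."""
    return rng.standard_normal(shape) * 0.1


# Shared parameters, used for every language.
# A single tanh layer stands in for the BLSTM stack.
W_shared = init((FBANK_DIM + IVEC_DIM, HIDDEN_DIM))
b_shared = np.zeros(HIDDEN_DIM)

# One separate classification (softmax) layer per language.
heads = {lang: (init((HIDDEN_DIM, n)), np.zeros(n))
         for lang, n in TARGETS.items()}


def append_ivector(frames, ivector):
    """Repeat the speaker i-vector for each frame and concatenate it."""
    tiled = np.tile(ivector, (frames.shape[0], 1))
    return np.concatenate([frames, tiled], axis=1)


def forward(frames, ivector, lang):
    """Return per-frame posteriors over the targets of one language."""
    x = append_ivector(frames, ivector)
    h = np.tanh(x @ W_shared + b_shared)      # shared representation
    W_out, b_out = heads[lang]                # language-specific layer
    logits = h @ W_out + b_out
    logits = logits - logits.max(axis=1, keepdims=True)  # stable softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


# Random stand-ins for one utterance and its speaker i-vector.
frames = rng.standard_normal((200, FBANK_DIM))
ivector = rng.standard_normal(IVEC_DIM)
posteriors = forward(frames, ivector, "lang_b")
print(posteriors.shape)  # (200, 70)
```

In multi-task training of this kind, each utterance updates the shared parameters and only the output layer of its own language, so the shared part is trained on data from all languages.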


DOI: 10.21437/Interspeech.2017-1775

Cite as: Karafiát, M., Baskar, M.K., Matějka, P., Veselý, K., Grézl, F., Burget, L., Černocký, J. (2017) 2016 BUT Babel System: Multilingual BLSTM Acoustic Model with i-Vector Based Adaptation. Proc. Interspeech 2017, 719-723, DOI: 10.21437/Interspeech.2017-1775.


@inproceedings{Karafiát2017,
  author={Martin Karafiát and Murali Karthick Baskar and Pavel Matějka and Karel Veselý and František Grézl and Lukáš Burget and Jan Černocký},
  title={2016 BUT Babel System: Multilingual BLSTM Acoustic Model with i-Vector Based Adaptation},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={719--723},
  doi={10.21437/Interspeech.2017-1775},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1775}
}