15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

BUT 2014 Babel System: Analysis of Adaptation in NN Based Systems

Martin Karafiát, František Grézl, Karel Veselý, Mirko Hannemann, Igor Szőke, Jan Černocký

Brno University of Technology, Czech Republic

Features based on a hierarchy of neural networks with compressive layers, known as Stacked Bottle-Neck (SBN) features, were recently shown to provide excellent performance in LVCSR systems. This paper summarizes several techniques investigated in our work towards the Babel 2014 evaluations: (1) using several versions of fundamental frequency (F0) estimates, (2) semi-supervised training on un-transcribed data, and, most importantly, (3) adapting the NN structure at different levels. The techniques are tested on three 2014 Babel languages with full GMM- and DNN-based systems. Separately and in combination, they are shown to outperform the baselines and confirm the usefulness of bottle-neck features in current ASR systems.

Bibliographic reference. Karafiát, Martin / Grézl, František / Veselý, Karel / Hannemann, Mirko / Szőke, Igor / Černocký, Jan (2014): "BUT 2014 Babel system: analysis of adaptation in NN based systems", in INTERSPEECH-2014, 3002-3006.