Domain Adaptation Using Factorized Hidden Layer for Robust Automatic Speech Recognition

Khe Chai Sim, Arun Narayanan, Ananya Misra, Anshuman Tripathi, Golan Pundak, Tara Sainath, Parisa Haghani, Bo Li, Michiel Bacchiani


Domain robustness is a challenging problem for automatic speech recognition (ASR). In this paper, we treat speech data collected for different applications as separate domains and investigate how robust acoustic models trained on multi-domain data are on unseen domains. Specifically, we use the Factorized Hidden Layer (FHL) as a compact low-rank representation to adapt a multi-domain ASR system to unseen domains. Experimental results on two unseen domains show that FHL is a more effective adaptation method than selectively fine-tuning part of the network, without dramatically increasing the number of model parameters. Furthermore, we found that using singular value decomposition to initialize the low-rank bases of an FHL model leads to faster convergence and improved performance.
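To illustrate the idea of initializing low-rank bases with a singular value decomposition, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes an adapted weight of the form W + U diag(d) V^T and assumes that the matrix being decomposed is the difference between a fine-tuned weight and the shared multi-domain weight; the abstract does not state which matrix is decomposed. The function names `svd_init_low_rank` and `fhl_weight`, the matrix sizes, and the random data are all hypothetical.

```python
import numpy as np


def svd_init_low_rank(delta_w, rank):
    """Truncated SVD of delta_w.

    Returns (u, d, v) such that u @ diag(d) @ v.T is the best
    rank-`rank` approximation of delta_w in the least-squares sense.
    """
    u, s, vt = np.linalg.svd(delta_w, full_matrices=False)
    return u[:, :rank], s[:rank], vt[:rank, :].T


def fhl_weight(w_shared, u, d, v):
    """Assumed adapted weight: shared weight plus a low-rank term."""
    # (u * d) scales each column of u by the matching entry of d,
    # which equals u @ diag(d) without building the diagonal matrix.
    return w_shared + (u * d) @ v.T


rng = np.random.default_rng(0)

# Hypothetical shared (multi-domain) weight matrix of one hidden layer.
w_shared = rng.standard_normal((8, 6))

# Hypothetical fine-tuned weight that differs from it by a rank-2 update.
delta = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))
w_finetuned = w_shared + delta

# Initialize the low-rank bases from the SVD of the weight difference.
u, d, v = svd_init_low_rank(w_finetuned - w_shared, rank=2)
w_adapted = fhl_weight(w_shared, u, d, v)
```

In this toy setup the weight difference has rank 2 by construction, so the rank-2 initialization reproduces the fine-tuned weight up to floating-point error while storing only `u`, `d`, and `v` for the new domain. With a real weight difference the truncation would be an approximation, and the bases would then be trained further.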


DOI: 10.21437/Interspeech.2018-2246

Cite as: Sim, K.C., Narayanan, A., Misra, A., Tripathi, A., Pundak, G., Sainath, T., Haghani, P., Li, B., Bacchiani, M. (2018) Domain Adaptation Using Factorized Hidden Layer for Robust Automatic Speech Recognition. Proc. Interspeech 2018, 892-896, DOI: 10.21437/Interspeech.2018-2246.


@inproceedings{Sim2018,
  author={Khe Chai Sim and Arun Narayanan and Ananya Misra and Anshuman Tripathi and Golan Pundak and Tara Sainath and Parisa Haghani and Bo Li and Michiel Bacchiani},
  title={Domain Adaptation Using Factorized Hidden Layer for Robust Automatic Speech Recognition},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={892--896},
  doi={10.21437/Interspeech.2018-2246},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2246}
}