DNNs for Unsupervised Extraction of Pseudo FMLLR Features Without Explicit Adaptation Data

Neethu Mariam Joy, Murali Karthick Baskar, S. Umesh, Basil Abraham


In this paper, we propose using a deep neural network (DNN) as a regression model to estimate feature-space maximum likelihood linear regression (FMLLR) features from unnormalized features. During training, the network receives unnormalized features as input and the corresponding FMLLR features as targets, and it is optimized to minimize the mean-square error between its outputs and the target FMLLR features. At test time, the unnormalized features are passed through this DNN feature extractor to obtain FMLLR-like features without any supervision or first-pass decoding. Furthermore, the FMLLR-like features are generated frame by frame, so no explicit adaptation data is required to extract them, unlike FMLLR or i-vector extraction. The proposed approach is therefore suitable for scenarios where little adaptation data is available. It provides sizable improvements over basis-FMLLR and conventional FMLLR when normalization is done at the utterance level on the TIMIT and Switchboard 33-hour data sets.
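The regression setup described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the data are synthetic (a random affine map stands in for real FMLLR targets), and the 13-dimensional features, single tanh hidden layer, learning rate, and helper names `init_mlp`, `forward`, `mse`, and `train_step` are hypothetical choices. The sketch only shows the general idea of training a network with a mean-square-error objective and then applying it one frame at a time.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Create (weight, bias) pairs for a fully connected network."""
    return [(rng.normal(0.0, np.sqrt(1.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return activations of every layer; hidden layers use tanh, output is linear."""
    acts = [x]
    for i, (w, b) in enumerate(params):
        z = acts[-1] @ w + b
        acts.append(z if i == len(params) - 1 else np.tanh(z))
    return acts

def mse(params, x, y):
    """Mean-square error between network outputs and targets."""
    return float(np.mean((forward(params, x)[-1] - y) ** 2))

def train_step(params, x, y, lr):
    """One full-batch gradient-descent step on the mean-square error."""
    acts = forward(params, x)
    grad = 2.0 * (acts[-1] - y) / y.size      # gradient of MSE w.r.t. the output
    updated = []
    for i in reversed(range(len(params))):
        w, b = params[i]
        grad_w = acts[i].T @ grad             # acts[i] is the input to layer i
        grad_b = grad.sum(axis=0)
        if i > 0:
            # Back-propagate through the tanh of the previous layer.
            grad = (grad @ w.T) * (1.0 - acts[i] ** 2)
        updated.append((w - lr * grad_w, b - lr * grad_b))
    return updated[::-1]

rng = np.random.default_rng(0)
dim = 13                                      # assumed feature dimension

# Synthetic stand-ins: x plays the role of unnormalized features, and
# y = A x + b plays the role of FMLLR targets (assumption, not real data).
x = rng.normal(size=(512, dim))
A = np.eye(dim) + 0.1 * rng.normal(size=(dim, dim))
b = 0.5 * rng.normal(size=dim)
y = x @ A.T + b

params = init_mlp([dim, 32, dim], rng)
initial_loss = mse(params, x, y)
for _ in range(500):
    params = train_step(params, x, y, lr=0.1)
final_loss = mse(params, x, y)

# At test time each frame is mapped independently of every other frame.
single_frame_output = forward(params, x[:1])[-1]
print(initial_loss, final_loss, single_frame_output.shape)
```

Because the trained network maps each input frame on its own, extraction in this sketch needs no per-speaker or per-utterance statistics, which mirrors the frame-by-frame property the abstract emphasizes.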


DOI: 10.21437/Interspeech.2016-904

Cite as

Joy, N.M., Baskar, M.K., Umesh, S., Abraham, B. (2016) DNNs for Unsupervised Extraction of Pseudo FMLLR Features Without Explicit Adaptation Data. Proc. Interspeech 2016, 3479-3483.

BibTeX
@inproceedings{Joy+2016,
author={Neethu Mariam Joy and Murali Karthick Baskar and S. Umesh and Basil Abraham},
title={DNNs for Unsupervised Extraction of Pseudo FMLLR Features Without Explicit Adaptation Data},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-904},
url={http://dx.doi.org/10.21437/Interspeech.2016-904},
pages={3479--3483}
}