This paper presents the JHU HLTCOE submission to the NIST 2015 Language Recognition Evaluation, including critical and novel algorithmic components, use of limited and augmented training data, and additional post-evaluation analysis and improvements. All of our systems used i-vectors based on Deep Neural Networks (DNNs) with discriminatively-trained Gaussian classifiers, and linear fusion was performed with duration-dependent scaling. A key innovation was the use of three different kinds of i-vectors: acoustic, phonotactic, and joint. In addition, data augmentation was used to overcome the limited training data of this evaluation. Post-evaluation analysis shows the benefits of these design decisions, as well as further potential improvements.
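To illustrate the fusion step mentioned above, the sketch below shows one plausible form of linear fusion with duration-dependent scaling: per-language scores from several subsystems are combined with learned weights and a bias, and the fused scores are then scaled by a factor chosen from the utterance's duration bin. The bin boundaries, scale factors, and function names here are hypothetical placeholders, not the calibration actually used in the submission.

```python
import numpy as np

# Hypothetical duration bins (seconds) and per-bin scale factors;
# the paper's actual calibration parameters are not reproduced here.
DUR_BINS = [(0.0, 10.0), (10.0, 30.0), (30.0, np.inf)]
DUR_SCALES = [0.5, 0.8, 1.0]

def duration_scale(duration_s):
    """Look up the scaling factor for an utterance of the given duration."""
    for (lo, hi), scale in zip(DUR_BINS, DUR_SCALES):
        if lo <= duration_s < hi:
            return scale
    return DUR_SCALES[-1]

def fuse(subsystem_scores, weights, bias, duration_s):
    """Linear fusion of per-language score vectors from several subsystems,
    followed by duration-dependent scaling of the fused scores.

    subsystem_scores: list of np.ndarray, each of shape (n_languages,)
    weights:          np.ndarray of shape (n_subsystems,), learned fusion weights
    bias:             np.ndarray of shape (n_languages,), learned offset
    duration_s:       utterance duration in seconds
    """
    stacked = np.stack(subsystem_scores)      # (n_subsystems, n_languages)
    fused = weights @ stacked + bias          # linear combination across subsystems
    return duration_scale(duration_s) * fused # scale fused scores by duration factor
```

For example, fusing acoustic, phonotactic, and joint i-vector subsystem scores for a 12-second cut would call `fuse([s_ac, s_ph, s_joint], w, b, 12.0)` and apply the middle-bin scale factor.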
McCree, A., Sell, G., Garcia-Romero, D. (2016) Augmented Data Training of Joint Acoustic/Phonotactic DNN i-vectors for NIST LRE15. Proc. Odyssey 2016, 204-209.
@inproceedings{Mccree+2016,
  author={Alan McCree and Greg Sell and Daniel Garcia-Romero},
  title={Augmented Data Training of Joint Acoustic/Phonotactic DNN i-vectors for NIST LRE15},
  year=2016,
  booktitle={Odyssey 2016},
  doi={10.21437/Odyssey.2016-29},
  url={http://dx.doi.org/10.21437/Odyssey.2016-29},
  pages={204--209}
}