We propose a novel design for automatic spoken language recognizers based on acoustic features. Our design is inspired by recent advances in text-independent speaker recognition, where intraclass variability is modeled by factor analysis in Gaussian mixture model (GMM) space. We use approximations to GMM likelihoods that allow variable-length data sequences to be represented as fixed-size statistics. Our experiments on NIST LRE07 show that variability compensation of these statistics can reduce error rates by a factor of three. Finally, we show that further improvements are possible with discriminative logistic regression training.
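The idea of mapping variable-length sequences to fixed-size statistics can be illustrated with a minimal sketch. The abstract does not say which statistics the authors use, so this example assumes zeroth- and first-order GMM statistics (soft counts and posterior-weighted frame sums), a common choice in GMM-based speaker recognition. The function names, the toy two-component model, and the frame values are all hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_statistics(frames, weights, means, variances):
    """Accumulate zeroth- and first-order statistics over all frames.

    Returns (n, f): n[c] is the soft frame count of component c, and
    f[c] is the posterior-weighted sum of frames for component c.
    The output size depends only on the GMM, not on len(frames).
    """
    num_comp = len(weights)
    dim = len(means[0])
    n = [0.0] * num_comp
    f = [[0.0] * dim for _ in range(num_comp)]
    for x in frames:
        # Unnormalized log posteriors of each component for this frame.
        log_p = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Subtract the maximum before exponentiating for numerical stability.
        top = max(log_p)
        unnorm = [math.exp(lp - top) for lp in log_p]
        total = sum(unnorm)
        for c in range(num_comp):
            post = unnorm[c] / total
            n[c] += post
            for d in range(dim):
                f[c][d] += post * x[d]
    return n, f


# Toy 2-component, 2-dimensional GMM (hypothetical values, not from the paper).
WEIGHTS = [0.5, 0.5]
MEANS = [[0.0, 0.0], [3.0, 3.0]]
VARIANCES = [[1.0, 1.0], [1.0, 1.0]]

short_utt = [[0.1, -0.2], [2.9, 3.1], [0.3, 0.0]]  # 3 frames
long_utt = short_utt * 4                            # 12 frames

n_short, f_short = gmm_statistics(short_utt, WEIGHTS, MEANS, VARIANCES)
n_long, f_long = gmm_statistics(long_utt, WEIGHTS, MEANS, VARIANCES)
```

Both utterances yield statistics of the same shape (one count and one sum vector per component) even though their lengths differ. In a system of the kind the abstract describes, variability compensation and the classifier would operate on these fixed-size quantities rather than on the raw frames.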
Cite as: Brümmer, N., Strasheim, A., Hubeika, V., Matějka, P., Burget, L., Glembek, O. (2009) Discriminative acoustic language recognition via channel-compensated GMM statistics. Proc. Interspeech 2009, 2187-2190, doi: 10.21437/Interspeech.2009-623
@inproceedings{brummer09_interspeech,
  author={Niko Brümmer and Albert Strasheim and Valiantsina Hubeika and Pavel Matějka and Lukáš Burget and Ondřej Glembek},
  title={{Discriminative acoustic language recognition via channel-compensated GMM statistics}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={2187--2190},
  doi={10.21437/Interspeech.2009-623}
}