Odyssey 2012 - The Speaker and Language Recognition Workshop
Probabilistic modeling is the most successful and widely used approach in speaker recognition, either for modeling speakers in the GMM-UBM framework or as a prior in secondary-level feature extraction to form i-vectors. In this paper, we introduce exemplar-based sparse representation and sparse discrimination for closed-set speaker identification in a noisy living room from very short speech segments, each about 2 seconds long on average. Large spectro-temporal contexts in the mel-frequency band energy domain are used to build a dictionary of all speakers; by decomposing the observed noisy speech over this dictionary, sparse activations are extracted as features for the modeling stage. Sparse discriminant analysis is employed to learn sparse discriminative directions for the classification stage. Experiments on the recently developed Computational Hearing in Multisource Environments (CHiME) corpus demonstrate excellent performance of the proposed approach, especially at low SNRs. Speaker identification results are also reported for baseline text-independent GMM-UBM and text-dependent HMM systems.
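As a rough illustration of the decomposition step described above, the following minimal sketch represents an observation as a non-negative, sparse combination of labeled exemplars. The abstract does not specify the solver, so the multiplicative update for a KL-divergence cost with an L1 penalty is an assumption, as are the function names, the `lam` sparsity weight, and the toy data. The final scoring step (summing activations per speaker) is a simple stand-in and is not the sparse discriminant analysis used in the paper.

```python
def sparse_activations(y, atoms, lam=0.1, n_iter=200, eps=1e-12):
    """Non-negative sparse coding of y over a fixed exemplar dictionary.

    Assumed cost: KL(y || sum_k x[k] * atoms[k]) + lam * sum(x),
    minimized with multiplicative updates (keeps x non-negative).
    """
    n_atoms, n_feat = len(atoms), len(y)
    x = [1.0] * n_atoms                      # positive initialization
    atom_sums = [sum(a) for a in atoms]      # column sums of the dictionary
    for _ in range(n_iter):
        # Current reconstruction of the observation.
        y_hat = [sum(x[k] * atoms[k][f] for k in range(n_atoms)) + eps
                 for f in range(n_feat)]
        ratio = [y[f] / y_hat[f] for f in range(n_feat)]
        # Multiplicative update; lam in the denominator encourages sparsity.
        x = [x[k] * sum(atoms[k][f] * ratio[f] for f in range(n_feat))
             / (atom_sums[k] + lam)
             for k in range(n_atoms)]
    return x


def identify(y, atoms, labels, **kwargs):
    """Stand-in classifier: pick the speaker with the largest total activation."""
    x = sparse_activations(y, atoms, **kwargs)
    scores = {}
    for weight, label in zip(x, labels):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get), scores


# Toy dictionary (hypothetical): two exemplars per speaker, four features each.
atoms = [
    [1.0, 0.1, 0.1, 0.1],   # speaker A
    [0.1, 1.0, 0.1, 0.1],   # speaker A
    [0.1, 0.1, 1.0, 0.1],   # speaker B
    [0.1, 0.1, 0.1, 1.0],   # speaker B
]
labels = ["A", "A", "B", "B"]

# Observation built from speaker B's exemplars only.
y = [0.25, 0.25, 2.05, 0.7]
speaker, scores = identify(y, atoms, labels)
```

In this toy setup each exemplar is a short feature vector. In the setting the abstract describes, an exemplar would instead be a large spectro-temporal context of mel-band energies stacked into one vector, and the activations would be passed on as features to the modeling stage rather than summed directly.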
Bibliographic reference. Saeidi, Rahim / Hurmalainen, Antti / Virtanen, Tuomas / Leeuwen, David A. van (2012): "Exemplar-based sparse representation and sparse discrimination for noise robust speaker identification", In Odyssey-2012, 248-255.