In the Gaussian mixture model (GMM) approach to speaker recognition, maximum a posteriori (MAP) estimation has been found to be strongly affected by undesired variability arising from varying utterance duration as well as from other hidden factors related to recording devices, session environment, and phonetic content. We propose an adaptive relevance factor (RF) to compensate for this variability. On the other hand, in realistic applications, different channels are likely to correspond to different training and test conditions in terms of the quantity and quality of the speech signals. We therefore develop a hybrid model that combines multiple complementary systems, each of which focuses on specific condition(s). We show the effectiveness of the proposed method on the core task of the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) 2008.
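To make the role of the relevance factor concrete, here is a minimal sketch of classical relevance-MAP adaptation of GMM means with a fixed relevance factor. It is not the adaptive RF proposed in the paper, whose form the abstract does not give. The function name `map_adapt_means`, its arguments, and the default value of 16 are assumptions chosen for illustration only.

```python
def map_adapt_means(ubm_means, posteriors, frames, relevance_factor=16.0):
    """Classical relevance-MAP adaptation of GMM means (fixed relevance factor).

    ubm_means:  list of K prior mean vectors (each a list of D floats)
    posteriors: list of T rows, each holding K component posteriors for one frame
    frames:     list of T feature vectors (each a list of D floats)
    All names here are hypothetical; this is an illustrative sketch only.
    """
    num_components = len(ubm_means)
    dim = len(ubm_means[0])
    adapted = []
    for k in range(num_components):
        # Soft count n_k: total posterior mass assigned to component k.
        n_k = sum(row[k] for row in posteriors)
        if n_k == 0.0:
            # No data for this component: keep the prior mean unchanged.
            adapted.append(list(ubm_means[k]))
            continue
        # First-order statistic: posterior-weighted average of the frames.
        e_k = [
            sum(row[k] * x[d] for row, x in zip(posteriors, frames)) / n_k
            for d in range(dim)
        ]
        # Adaptation coefficient: grows with n_k, i.e. with utterance duration.
        alpha = n_k / (n_k + relevance_factor)
        adapted.append(
            [alpha * e_k[d] + (1.0 - alpha) * ubm_means[k][d] for d in range(dim)]
        )
    return adapted


# One component with prior mean 0.0; every frame equals 1.0.
short_utt = map_adapt_means([[0.0]], [[1.0]] * 16, [[1.0]] * 16)
long_utt = map_adapt_means([[0.0]], [[1.0]] * 48, [[1.0]] * 48)
```

With a fixed relevance factor, the same speaker data pulls the mean halfway toward the observations after 16 frames and three quarters of the way after 48, which illustrates the sensitivity to utterance duration that the abstract describes.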
Bibliographic reference. You, Chang Huai / Li, Haizhou / Lee, Kong Aik (2010): "A hybrid modeling strategy for GMM-SVM speaker recognition with adaptive relevance factor", In INTERSPEECH-2010, 2746-2749.