Feature fusion is a paradigm that has found success in a number of speech-related tasks. The primary objective of fusion is to leverage the complementary information present in different features. Conventionally, either early or late fusion is employed. Early fusion leads to high-dimensional feature vectors, and the ranges of feature values in the different streams require appropriate normalisation. Late fusion is carried out at the score level, where the contribution of each feature type is determined by a set of weights. Feature switching is another paradigm that attempts to capture the diversity in the feature types used. It is particularly significant in the context of speaker verification, where the feature type that best discriminates a speaker is used to verify claims to that speaker's identity. Feature switching was previously attempted in the conventional UBM-GMM framework. In this paper, the idea is extended to the Total Variability Space (TVS) framework. Two feature types, namely Modified Group Delay (MGD) and Mel-Frequency Cepstral Coefficients (MFCC), are explored in the proposed framework. Results are presented on the NIST 2010 male database for the speaker verification task.
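The three strategies contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the z-score normalisation, the single fusion weight, and the per-speaker lookup table `preferred` are all assumptions made for illustration. In particular, the abstract does not say how the best feature type for each speaker is chosen, so the table is simply given here.

```python
from statistics import mean, pstdev

def zscore_columns(frames):
    """Normalise each feature dimension to zero mean and unit variance.
    Z-scoring is one possible normalisation, assumed here for illustration."""
    columns = list(zip(*frames))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # guard against constant dimensions
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in frames]

def early_fusion(mfcc_frames, mgd_frames):
    """Early fusion: normalise each stream, then concatenate frame by frame.
    The fused dimension is the sum of the stream dimensions."""
    a = zscore_columns(mfcc_frames)
    b = zscore_columns(mgd_frames)
    return [row_a + row_b for row_a, row_b in zip(a, b)]

def late_fusion(score_mfcc, score_mgd, weight_mfcc=0.5):
    """Late fusion: weighted sum of the scores from the two systems."""
    return weight_mfcc * score_mfcc + (1.0 - weight_mfcc) * score_mgd

def feature_switching(scores, preferred_feature):
    """Feature switching: keep only the score of the feature type
    selected for the claimed speaker."""
    return scores[preferred_feature]

# Toy data (hypothetical): two frames per stream, with very different value ranges.
mfcc = [[1.0, 200.0], [3.0, 400.0]]
mgd = [[0.1], [0.3]]
fused = early_fusion(mfcc, mgd)

# Hypothetical trial scores and per-speaker feature preferences.
scores = {"mfcc": 1.2, "mgd": 2.0}
preferred = {"spk_a": "mgd", "spk_b": "mfcc"}

fused_score = late_fusion(scores["mfcc"], scores["mgd"], weight_mfcc=0.25)
switched_score = feature_switching(scores, preferred["spk_a"])
```

The sketch shows the trade-off the abstract describes: early fusion increases the dimensionality and depends on normalisation, late fusion blends every score using weights, and feature switching uses a single feature type per claimed speaker.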
Bibliographic reference. Asha, T. / Saranya, M. S. / Pandia, D. S. Karthik / Madikeri, Srikanth / Murthy, Hema A. (2014): "Feature Switching in the i-vector framework for speaker verification", In INTERSPEECH-2014, 1125-1129.