5th International Conference on Spoken Language Processing
We analyze the relative importance of components of the modulation spectrum for speaker verification. The aim is to remove less relevant components, reducing system sensitivity to acoustic disturbances while improving verification accuracy. Spectral components between about 0.1 Hz and 10 Hz are found to contain the most useful speaker information. We discuss this result in the context of RASTA processing and cepstral mean subtraction. Compared with cepstral mean subtraction, which retains components up to 50 Hz, lowpass filtering to 10 Hz with downsampling by 75 percent is found to significantly improve robustness in mismatched conditions. The downsampling also yields large computational savings.
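As an illustration only, the following is a minimal sketch of lowpass filtering one feature trajectory to 10 Hz and downsampling it by 75 percent. It is not the authors' implementation. It assumes a frame rate of 100 Hz (a 10 ms frame step, consistent with components extending up to 50 Hz), reads "downsampling by 75 percent" as keeping one frame in four, and uses mean subtraction as a stand-in for removing the near-DC component. The Hamming-windowed sinc filter, the tap count, and the function names `lowpass_taps` and `filter_and_downsample` are hypothetical choices made for this example.

```python
import math


def lowpass_taps(cutoff_hz, frame_rate_hz, num_taps=51):
    """Hamming-windowed sinc lowpass FIR taps, scaled to unit gain at DC."""
    fc = cutoff_hz / frame_rate_hz  # cutoff in cycles per frame
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal lowpass impulse response (sinc), with its limit at k == 0.
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def filter_and_downsample(traj, frame_rate_hz=100.0, cutoff_hz=10.0,
                          keep_every=4, num_taps=51):
    """Mean-subtract, lowpass filter, and keep every `keep_every`-th frame.

    `traj` is the time trajectory of one feature (e.g. one cepstral
    coefficient). All default values are assumptions for this sketch.
    """
    mean = sum(traj) / len(traj)
    centered = [x - mean for x in traj]  # removes the DC component
    taps = lowpass_taps(cutoff_hz, frame_rate_hz, num_taps)
    half = num_taps // 2
    out = []
    # The filter is evaluated only at the frames that are kept, so the
    # filtering cost drops along with the number of output frames.
    for i in range(0, len(centered), keep_every):
        acc = 0.0
        for j, t in enumerate(taps):
            idx = i + j - half
            if 0 <= idx < len(centered):  # zero-padding at the edges
                acc += t * centered[idx]
        out.append(acc)
    return out
```

Under these assumptions the output frame rate is 25 Hz, whose Nyquist frequency of 12.5 Hz lies above the 10 Hz cutoff, so the retained modulation band is not aliased by the decimation.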
Bibliographic reference. Vuuren, Sarel van / Hermansky, Hynek (1998): "On the importance of components of the modulation spectrum for speaker verification", In ICSLP-1998, paper 0631.