This paper presents new front-end analysis techniques for speaker recognition that use long-term and temporal information. We propose a long-term feature analysis strategy that averages short-time spectral features over a period of time in order to capture speaker traits that are manifested over a speech segment longer than a spectral frame. We also find that moving averages of temporal information are effective for speaker recognition. Experiments on the 2008 NIST Speaker Recognition Evaluation dataset show that the long-term and temporal information contribute to substantial reductions in equal error rate (EER).
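As a rough illustration of the averaging idea described above, the following sketch computes a moving average of frame-level feature vectors over a sliding window. It is not the authors' implementation: the function name `long_term_average`, the `window` parameter, the use of NumPy, and the choice of a simple unweighted mean are all assumptions made for this example, since the abstract does not specify the window length, the weighting, or the feature type.

```python
import numpy as np


def long_term_average(features, window):
    """Average short-time feature vectors over a sliding window of frames.

    features: array of shape (num_frames, num_coeffs), one row per frame
    window:   number of consecutive frames to average (hypothetical parameter)

    Returns an array of shape (num_frames - window + 1, num_coeffs), where
    row i is the mean of input rows i .. i + window - 1.
    """
    features = np.asarray(features, dtype=float)
    if features.ndim != 2:
        raise ValueError("features must be a 2-D array (frames x coefficients)")
    if not 1 <= window <= features.shape[0]:
        raise ValueError("window must be between 1 and the number of frames")

    # Prefix sums along the time axis, with a leading row of zeros so that
    # the sum over any window is a difference of two prefix sums.
    prefix = np.cumsum(features, axis=0)
    prefix = np.vstack([np.zeros((1, features.shape[1])), prefix])

    return (prefix[window:] - prefix[:-window]) / window
```

With a matrix of, say, cepstral coefficients as input, each output row summarizes `window` consecutive frames, so the result varies more slowly than the frame-level features and reflects a span longer than a single spectral frame.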
Bibliographic reference. Huang, Chien-Lin / Sun, Hanwu / Ma, Bin / Li, Haizhou (2010): "Speaker characterization using long-term and temporal information", In INTERSPEECH-2010, 370-373.