INTERSPEECH 2004 - ICSLP
While some powerful techniques have been developed for speaker recognition tasks, reliable decisions in a telephone environment can still be quite elusive. For example, on an in-house telephone database of long-distance phone conversations, a GMM-UBM system using 30 seconds of speech for both training and testing yielded a performance rate of only 56%. Part of the difficulty with speaker identification for telephone applications lies in the variations and distortions introduced by telephone networks. This paper uses a blind estimate of the audio channel to modify the mel features used in speaker identification tasks.
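The abstract does not say how the channel is estimated or how the mel features are modified, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a fixed channel and swaps in a common stand-in technique: the long-term average of the log power spectrum serves as the blind channel estimate and is subtracted before the mel filterbank is applied. All function names, frame sizes, and the 8 kHz sample rate are hypothetical choices made for illustration.

```python
import numpy as np

def frame_power_spectra(signal, frame_len=256, hop=128):
    """Return power spectra of Hann-windowed frames (frames x bins)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular mel filters over the rfft bins (filters x bins)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    fbank = np.zeros((n_filters, len(bin_freqs)))
    for i in range(n_filters):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        fbank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fbank

def blind_channel_estimate(power, floor=1e-10):
    """Assumed estimator: long-term mean of the log power spectrum.

    For a fixed channel this equals log|H(f)|^2 plus the average log
    speech spectrum, so it also absorbs some speaker information.
    """
    return np.mean(np.log(np.maximum(power, floor)), axis=0)

def channel_corrected_log_mel(power, sample_rate, n_filters=20, floor=1e-10):
    """Remove the estimated channel, then compute log mel energies."""
    log_power = np.log(np.maximum(power, floor))
    corrected = np.exp(log_power - blind_channel_estimate(power, floor))
    n_fft = 2 * (power.shape[1] - 1)
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    return np.log(np.maximum(corrected @ fbank.T, floor))

# Usage: synthetic noise stands in for speech sampled at 8 kHz.
rng = np.random.default_rng(0)
power = frame_power_spectra(rng.standard_normal(4000))
features = channel_corrected_log_mel(power, sample_rate=8000)
```

Under this assumption, any fixed per-frequency gain applied to the power spectra cancels out of the resulting features. The cost is that the long-term average also removes part of the speaker's own spectral shape, which is one reason a more careful channel estimate may be preferable.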
Bibliographic reference. Wenndt, Stanley / Floyd, Richard (2004): "Channel frequency response correction for speaker recognition", In INTERSPEECH-2004, 1765-1768.