September 22-25, 1997
The performance of speaker recognition algorithms drops significantly when the training and testing acoustic environments differ. This drop is caused by the mismatch between the statistics representing the speaker and those of the testing acoustic data. This paper reports our preliminary results on applying a novel environmental compensation algorithm to speaker recognition and identification. The new technique, called the Delta Vector Taylor Series (DVTS) approach, improves performance at signal-to-noise ratios below 20 dB. The algorithm imposes a model of how the environment modifies speaker statistics and uses Expectation-Maximization (EM) to solve a joint maximum-likelihood formulation of the speaker recognition problem over both the speakers and the environment. We report experimental results on a subset of the TIMIT and NTIMIT databases.
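The abstract does not spell out the environment model, so the following is only an illustrative sketch of the general Vector Taylor Series idea, not the paper's DVTS algorithm. It assumes the additive-noise log-spectral model commonly used in VTS compensation, y = x + log(1 + exp(n - x)), diagonal Gaussian statistics, and a first-order expansion around the means. The function name `vts_compensate` and its arguments are hypothetical. The EM estimation of the environment parameters is not shown.

```python
import math

def vts_compensate(mu_x, var_x, mu_n, var_n):
    """First-order VTS approximation of noisy-speech Gaussian statistics.

    Assumed model (not stated in the abstract): per log-spectral channel,
    y = x + log(1 + exp(n - x)), with x clean speech and n additive noise.
    All inputs are per-channel lists of means and variances.
    """
    mu_y, var_y = [], []
    for mx, vx, mn, vn in zip(mu_x, var_x, mu_n, var_n):
        # Zeroth-order term: the mismatch function evaluated at the means.
        mu_y.append(mx + math.log1p(math.exp(mn - mx)))
        # Jacobian dy/dx at the means; dy/dn equals 1 - g.
        g = 1.0 / (1.0 + math.exp(mn - mx))
        # Linearized variance, treating speech and noise as independent.
        var_y.append(g * g * vx + (1.0 - g) ** 2 * vn)
    return mu_y, var_y
```

Under these assumptions, a clean-speech Gaussian is shifted toward the noise statistics as the noise mean approaches the speech mean, and is left essentially unchanged when the noise is far below the speech level.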
Bibliographic reference. Eberman, Brian / Moreno, Pedro J. (1997): "Delta vector Taylor series environment compensation for speaker recognition", In EUROSPEECH-1997, 2335-2338.