ESCA Workshop on Automatic Speaker Recognition, Identification, and Verification
In this paper we introduce a new philosophy for extracting robust features in speech systems, based on intelligent processing of the eigenmodes of speech. Intrinsic to this philosophy is an explanation of why linear predictive (LP) cepstra of speech provide a powerful feature set for recognition systems. The poles, or eigenmodes, of a frame of speech are investigated under mismatches created by varying channel conditions for speaker identification systems.
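The link between LP cepstra and the poles can be made concrete with a standard identity: for an all-pole model 1/A(z) with A(z) = prod_i (1 - z_i z^-1), the n-th cepstral coefficient is c_n = (1/n) * sum_i z_i^n. The following minimal sketch checks this against the usual LP-to-cepstrum recursion. The pole values, the function names, and the number of coefficients are illustrative assumptions, not taken from the paper.

```python
import cmath

def poly_from_poles(poles):
    """Expand A(z) = prod(1 - z_i z^-1) into real coefficients [1, A_1, ..., A_p].

    Assumes the poles come in complex-conjugate pairs, so the result is real.
    """
    coeffs = [1.0 + 0j]
    for z in poles:
        prev = coeffs + [0j]
        # Multiply the running polynomial by (1 - z * z^-1).
        coeffs = [prev[k] - (z * prev[k - 1] if k > 0 else 0j)
                  for k in range(len(prev))]
    return [c.real for c in coeffs]

def cepstrum_from_lp(a, n_ceps):
    """LP cepstrum c_1..c_n_ceps from predictor coefficients a = [a_1..a_p],

    where A(z) = 1 - sum_k a_k z^-k, using the standard recursion
    c_n = a_n + sum_{k=1}^{n-1} (k/n) c_k a_{n-k}, with a_n = 0 for n > p.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c

def cepstrum_from_poles(poles, n_ceps):
    """Same cepstrum computed directly from the poles: c_n = (1/n) sum_i z_i^n."""
    return [sum(z ** n for z in poles).real / n for n in range(1, n_ceps + 1)]

# Hypothetical frame model: two resonances, each a conjugate pole pair.
poles = []
for radius, angle in [(0.9, 0.3 * cmath.pi), (0.7, 0.6 * cmath.pi)]:
    z = cmath.rect(radius, angle)
    poles += [z, z.conjugate()]

A = poly_from_poles(poles)
a = [-x for x in A[1:]]          # predictor coefficients a_k = -A_k
c_lp = cepstrum_from_lp(a, 8)
c_poles = cepstrum_from_poles(poles, 8)
```

Because each pole contributes its own term to every cepstral coefficient, a change in individual poles (for example, poles introduced or shifted by a channel) shows up directly in the cepstrum, which is one way to read the abstract's focus on eigenmodes.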
The study of modes of speech has led to two related processing techniques, each of which provides a measurable degree of robustness in cross-channel environments. One technique emphasizes processing of speech in the interframe domain (across many speech frames), while the other carries out an adaptive cepstral weighting of the intraframe (within a speech frame) LP spectral components. Experiments for the interframe technique are presented using speech from the TIMIT database processed through a telephone channel simulator and a part of the San Diego portion of the King database. Experiments for the intraframe technique are presented on the San Diego portion of the King database. The techniques are shown to offer improved speaker identification performance when compared to related common methods in the interframe and intraframe domains.
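As an illustration of what interframe processing means in general, the sketch below shows conventional cepstral mean subtraction, a widely used interframe baseline. It is not the pole filtering method of the paper, whose details the abstract does not give. The sketch assumes that a fixed linear channel adds roughly the same offset vector to the cepstrum of every frame; the frame values, the channel offset, and the function name are hypothetical.

```python
def subtract_cepstral_mean(frames):
    """Remove the per-coefficient mean taken across all frames of an utterance."""
    n_frames = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n_frames for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]

# Hypothetical cepstral vectors for three frames (three coefficients each).
clean = [[0.6, -0.2, 0.1], [0.4, 0.0, -0.3], [0.5, 0.3, 0.2]]

# Assumed channel effect: the same additive offset on every frame.
channel = [0.25, -0.1, 0.05]
shifted = [[x + h for x, h in zip(f, channel)] for f in clean]

norm_clean = subtract_cepstral_mean(clean)
norm_shifted = subtract_cepstral_mean(shifted)
```

Under this assumption the normalized features of the clean and channel-shifted utterances coincide, since the constant offset is absorbed into the mean. The mean also contains speech information, so subtracting it removes more than the channel.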
Bibliographic reference. Naik, Devang / Assaleh, Khaled / Mammone, Richard J. (1994): "Robust speaker identification using pole filtering", In ASRIV-1994, 225-230.