9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Speaker Identification for Whispered Speech Based on Frequency Warping and Score Competition

Xing Fan, John H. L. Hansen

University of Texas at Dallas, USA

In certain situations, talkers intentionally whisper instead of using neutral speech for the sake of privacy or confidentiality, which severely degrades the performance of speaker identification systems trained only on neutral speech. The spectral structure of whisper differs considerably from that of neutral speech because of the absence of voiced harmonic excitation. This study introduces a new feature based on frequency warping and score competition for speaker identification on whispered speech. The proposed feature is evaluated on a corpus of male speakers producing both neutral and whispered speech. Closed-set speaker ID results show an absolute 27% improvement in accuracy compared with a traditional MFCC feature-based system. The result confirms that this is a viable approach to improving speaker ID performance across neutral and whispered speech conditions.

