Existing automatic speaker verification (ASV) systems perform with high accuracy when the speech signal is collected close to the mouth of the speaker (< 1 ft). However, the performance of these systems reduces significantly when speech signals are collected at a distance from the speaker (2-6 ft). The objective of this paper is to address some issues in the processing of speech signals collected at a distance from the speaker, for text-dependent ASV system. An acoustic feature derived from short segments of speech signals is proposed for the ASV task. The key idea is to exploit the high signal-to-noise nature of short segments of speech in the vicinity of impulse-like excitations. We show that the proposed feature yields better performance of speaker verification than the mel-frequency cepstral coefficients (MFCCs). In addition, regions of high signal-to-reverberation ratio, duration and pitch information are used to improve the performance of the ASV system for distant speech.
Bibliographic reference. Avinash, B. / Guruprasad, S. / Yegnanarayana, Bayya (2010): "Exploring subsegmental and suprasegmental features for a text-dependent speaker verification in distant speech signals", In INTERSPEECH-2010, 1073-1076.