11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Exploring Subsegmental and Suprasegmental Features for a Text-Dependent Speaker Verification in Distant Speech Signals

B. Avinash (1), S. Guruprasad (2), Bayya Yegnanarayana (1)

(1) IIIT Hyderabad, India
(2) IIT Madras, India

Existing automatic speaker verification (ASV) systems achieve high accuracy when the speech signal is collected close to the speaker's mouth (< 1 ft). However, their performance degrades significantly when the speech signal is collected at a distance from the speaker (2-6 ft). The objective of this paper is to address some issues in processing speech signals collected at a distance from the speaker, for a text-dependent ASV system. An acoustic feature derived from short segments of the speech signal is proposed for the ASV task. The key idea is to exploit the high signal-to-noise ratio of short segments of speech in the vicinity of impulse-like excitations. We show that the proposed feature yields better speaker verification performance than mel-frequency cepstral coefficients (MFCCs). In addition, regions of high signal-to-reverberation ratio, together with duration and pitch information, are used to improve the performance of the ASV system for distant speech.


Bibliographic reference. Avinash, B. / Guruprasad, S. / Yegnanarayana, Bayya (2010): "Exploring subsegmental and suprasegmental features for a text-dependent speaker verification in distant speech signals", in INTERSPEECH-2010, 1073-1076.