Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

High Performance Text-Independent Speaker Recognition System Based on Voiced/Unvoiced Segmentation and Multiple Neural Nets

Nikos Fakotakis, John Sirigos, George Kokkinakis

Wire Communications Laboratory, University of Patras, Greece

This paper presents a text-independent speaker recognition system based on the voiced segments of the speech signal. The proposed system uses feedforward MLP classification with only a limited amount of training and testing data and achieves comparatively high accuracy. The techniques employed are RASTA-PLP speech analysis for parameter estimation, a feedforward MLP for voiced/unvoiced segmentation, and a set of simple MLPs, one per speaker, for classification. The system was trained and tested on the TIMIT and NTIMIT databases. The verification experiments yielded high accuracy: above 99% for clean speech (TIMIT) and 74.7% for noisy speech (NTIMIT). Additional experiments compared the proposed use of voiced segments against using only vowels and against using all phonetic categories; the results favored voiced segments.
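The decision stage described in the abstract (keep only voiced frames, then score them with one small network per speaker) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the single hidden layer, the averaging of frame scores, and the 0.5 verification threshold are all assumptions, and the toy network is untrained. Feature extraction (RASTA-PLP) and the voiced/unvoiced MLP are assumed to have already produced `frames` and `voiced_mask`.

```python
import numpy as np


def make_toy_mlp(rng, n_in, n_hidden=8):
    """Build an untrained one-hidden-layer MLP scorer (hypothetical stand-in).

    A trained per-speaker network would take its place in a real system.
    """
    w1 = rng.normal(size=(n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(size=n_hidden)
    b2 = 0.0

    def score(frames):
        hidden = np.tanh(frames @ w1 + b1)
        # Sigmoid output: one score in (0, 1) per frame.
        return 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))

    return score


def speaker_scores(frames, voiced_mask, speaker_nets):
    """Average each speaker network's output over the voiced frames only.

    Averaging is an assumption; the abstract does not say how frame
    scores are combined.
    """
    voiced = frames[np.asarray(voiced_mask, dtype=bool)]
    if len(voiced) == 0:
        raise ValueError("no voiced frames to score")
    return {name: float(np.mean(net(voiced)))
            for name, net in speaker_nets.items()}


def identify(scores):
    """Identification: pick the speaker whose network scores highest."""
    return max(scores, key=scores.get)


def verify(scores, claimed, threshold=0.5):
    """Verification: accept the claimed identity if its score clears a threshold."""
    return scores[claimed] >= threshold
```

With one entry in `speaker_nets` per enrolled speaker, `identify` covers closed-set identification and `verify` covers the accept/reject decision for a claimed identity. The threshold would need tuning on held-out data.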


Bibliographic reference. Fakotakis, Nikos / Sirigos, John / Kokkinakis, George (1999): "High performance text-independent speaker recognition system based on voiced/unvoiced segmentation and multiple neural nets", in EUROSPEECH'99, 979-982.