Third International Conference on Spoken Language Processing (ICSLP 94)
A text-independent automatic speaker identification system was constructed and evaluated on the TIMIT database. All voiced parts of the speech signals were located automatically by measuring the short-term energy of the signals. For each segment of the voiced signals, LPC-based cepstral coefficients were calculated to form a feature vector. Multilayer perceptron (MLP) and learning vector quantization (LVQ) networks were used as classifiers. The codebooks of the LVQ classifiers were initialized by the LBG algorithm and then trained by the LVQ3 algorithm. The MLP classifiers were standard feedforward networks with one hidden layer and were trained in two steps by the conjugate gradient method. Speech data from 112 male speakers in the test subdivision of the TIMIT database were used to evaluate the system. For each speaker, eight sentences were randomly selected as training data and the remaining two were used as test data. The best correct identification rates were 88.4% with LVQ classifiers and 99.1% with MLP classifiers for a population of 112 speakers.
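As a rough illustration of the front end described in the abstract, the sketch below selects high-energy frames and converts a set of LPC coefficients to cepstral coefficients. It is a minimal sketch under stated assumptions, not the authors' implementation: the frame length, hop size, and relative energy threshold are illustrative values not given in the abstract, the function names are hypothetical, and the predictor coefficients are assumed to follow the convention H(z) = 1 / (1 - sum_k a_k z^-k).

```python
def short_term_energy(samples, frame_len, hop):
    """Return the mean squared amplitude of each full frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def select_high_energy_frames(samples, frame_len=256, hop=128, ratio=0.1):
    """Return start indices of frames whose energy exceeds ratio * max energy.

    The relative threshold is an assumption made for this sketch; the
    abstract does not state how the energy threshold was chosen.
    """
    energies = short_term_energy(samples, frame_len, hop)
    if not energies:
        return []
    threshold = ratio * max(energies)
    return [i * hop for i, e in enumerate(energies) if e > threshold]


def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC predictor coefficients to cepstral coefficients c_1..c_n.

    a[k - 1] holds a_k for the all-pole model
    H(z) = 1 / (1 - sum_k a_k z^-k). The gain term c_0 is omitted.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused
    for n in range(1, n_ceps + 1):
        # Direct term a_n exists only while n is within the LPC order.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k / n) * c_k * a_{n-k}.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]
```

For a single-pole model with a_1 = 0.5, the recursion reproduces the closed form c_n = 0.5^n / n, which is a convenient sanity check before applying it to real LPC frames.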
Bibliographic reference. He, Jialong / Liu, Li / Palm, Günther (1994): "A text-independent speaker identification system based on neural networks", In ICSLP-1994, 1851-1854.