Third International Conference on Spoken Language Processing (ICSLP 94)
This paper proposes a new prediction error normalization method for speaker verification using predictive neural networks. A predictive neural network non-linearly predicts a frame of the input from the several preceding frames and computes a prediction error. This error depends strongly on the particular input, so the goodness of fit is difficult to determine by comparing the raw error with a fixed threshold. We propose a normalization method that uses the prediction error of a network trained on multiple speakers as a measure of the predictability of the input. The algorithm was evaluated in text-independent speaker verification. Without normalization, the equal error rate for verification of 12 male speakers was 41.2%; with normalization, it improved drastically to 1.5%. The proposed algorithm is also applicable to other speech processing tasks that involve comparison with a threshold, such as word spotting and rejection of unknown words.
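The normalization idea can be illustrated with a minimal sketch. The abstract does not state the exact form of the normalization, so the ratio used here is an assumption, as are all function names, the `context` and `threshold` parameters, and the toy predictors, which are simple stand-ins for trained predictive neural networks.

```python
from statistics import mean

def mean_prediction_error(frames, predict, context=2):
    """Average squared error of predicting each frame from the `context` preceding frames."""
    errors = []
    for t in range(context, len(frames)):
        predicted = predict(frames[t - context:t])
        actual = frames[t]
        errors.append(sum((p - a) ** 2 for p, a in zip(predicted, actual)))
    return mean(errors)

def normalized_error(frames, speaker_predict, multi_speaker_predict, context=2, eps=1e-12):
    """Speaker-model error divided by multi-speaker-model error (assumed ratio form)."""
    speaker_err = mean_prediction_error(frames, speaker_predict, context)
    reference_err = mean_prediction_error(frames, multi_speaker_predict, context)
    # eps guards against division by zero when the reference predictor is perfect.
    return speaker_err / max(reference_err, eps)

def accept(frames, speaker_predict, multi_speaker_predict, threshold=1.0, context=2):
    """Accept the claimed identity when the normalized error falls below a fixed threshold."""
    return normalized_error(frames, speaker_predict, multi_speaker_predict, context) < threshold

# Toy stand-ins for trained networks (hypothetical, for illustration only).
def repeat_last(context_frames):
    """Predict the next frame as a copy of the most recent frame."""
    return context_frames[-1]

def predict_zeros(context_frames):
    """Predict an all-zero frame regardless of the input."""
    return [0.0] * len(context_frames[-1])

frames = [[1.0, 2.0]] * 5  # a constant two-dimensional feature sequence
score = normalized_error(frames, repeat_last, predict_zeros)
decision = accept(frames, repeat_last, predict_zeros)
```

Because the reference error reflects how predictable the input is in general, dividing by it makes a single threshold more meaningful across different inputs than a threshold on the raw error.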
Bibliographic reference. Hattori, Hiroaki (1994): "A normalization method of prediction error for neural networks", In ICSLP-1994, 1543-1546.