Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

A Speech Similarity Distance Weighting for Robust Recognition

Michael J. Carey (1), Tuan P. Quang (2)

(1) University of Birmingham, UK; (2) University of Bristol, UK

Human listeners not only outperform machine recognisers at the same signal-to-noise ratio but also cope better with non-stationary noise. In this paper we describe a series of experiments in which the human ability to select clean segments of speech from a noisy environment is emulated by a machine recogniser. We show that a vector quantiser that incorporates speaker-specific information can be used to estimate the similarity between the input signal and the speech vectors in a codebook, and so produce a probabilistic weighting for the distances used in pattern matching. With stationary noise, the probabilistic system performs slightly worse than a system using noise-matched models; with non-stationary noise it performs better, even though the matched system has knowledge of the background noise conditions while the probabilistic system has no information about the noise.
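
The idea of weighting pattern-matching distances by a codebook-based speech similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how similarity is mapped to a weight, so the Gaussian mapping, the `sigma` parameter, the squared Euclidean distance, and all function names below are assumptions made for illustration only.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def speech_similarity_weight(frame, codebook, sigma=1.0):
    """Map the distance to the nearest codebook vector to a weight in (0, 1].

    Assumption: a Gaussian-shaped mapping; frames close to a clean-speech
    codebook entry get a weight near 1, frames far from all entries near 0.
    """
    d_min = min(sq_dist(frame, c) for c in codebook)
    return math.exp(-d_min / (2.0 * sigma ** 2))

def weighted_distance(frames, template, codebook, sigma=1.0):
    """Sum frame-to-template distances, each scaled by its similarity weight.

    Assumption: frames and template are already time-aligned, one to one.
    """
    total = 0.0
    for f, t in zip(frames, template):
        total += speech_similarity_weight(f, codebook, sigma) * sq_dist(f, t)
    return total
```

In this sketch a frame that looks unlike any codebook entry (for example, one dominated by noise) contributes little to the accumulated distance, so the match is driven mainly by the speech-like frames.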

Bibliographic reference. Carey, Michael J. / Quang, Tuan P. (2005): "A speech similarity distance weighting for robust recognition", in INTERSPEECH-2005, 1257-1260.