ISCA Archive Interspeech 2005

A speech similarity distance weighting for robust recognition

Michael J. Carey, Tuan P. Quang

Human listeners not only perform better than machine recognisers at the same signal-to-noise ratio but also cope better with non-stationary noise. In this paper we describe a series of experiments in which the human ability to select clean segments of speech from a noisy environment is emulated by a machine recogniser. We show that a vector quantiser that incorporates speaker-specific information can be used to estimate the similarity between the input signal and the speech vectors in a codebook, and so produce a probabilistic weighting for the distances used in pattern matching. With stationary noise, the probabilistic system performs slightly worse than a system using noise-matched models. With non-stationary noise it performs better, even though the matched system has knowledge of the background noise conditions while the probabilistic system has no information about the noise.
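The weighting idea can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: the squared Euclidean distance, the Gaussian-style mapping from quantiser distance to a weight in (0, 1], the `sigma` parameter, the weighted-average form of the match distance, and all function names. It is not the authors' implementation.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def speech_similarity_weight(frame, codebook, sigma=1.0):
    """Map the distance to the nearest codebook vector to a weight in (0, 1].

    Assumption: a Gaussian-style mapping exp(-d / (2 * sigma^2)); the paper's
    actual mapping is not stated in the abstract.
    """
    d_min = min(sq_dist(frame, c) for c in codebook)
    return math.exp(-d_min / (2.0 * sigma ** 2))

def weighted_match_distance(frames, template, codebook, sigma=1.0):
    """Weighted average of frame-to-template distances.

    Frames that resemble codebook speech contribute more; frames that look
    unlike speech (for example, noise bursts) contribute less.
    """
    weights = [speech_similarity_weight(f, codebook, sigma) for f in frames]
    total = sum(weights)
    if total == 0.0:
        return float("inf")  # no frame resembled speech at all
    weighted = sum(w * sq_dist(f, t) for w, f, t in zip(weights, frames, template))
    return weighted / total

# Hypothetical toy data: a two-entry codebook, one clean-looking frame and
# one frame far from any codebook entry.
codebook = [[0.0, 0.0], [1.0, 1.0]]
frames = [[0.1, 0.0], [5.0, 5.0]]
template = [[0.0, 0.0], [1.0, 1.0]]

weights = [speech_similarity_weight(f, codebook) for f in frames]
weighted = weighted_match_distance(frames, template, codebook)
unweighted = sum(sq_dist(f, t) for f, t in zip(frames, template)) / len(frames)
```

In this toy case the second frame lies far from every codebook entry, so it receives a very small weight and has little effect on the match distance, whereas it dominates the unweighted average.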


doi: 10.21437/Interspeech.2005-481

Cite as: Carey, M.J., Quang, T.P. (2005) A speech similarity distance weighting for robust recognition. Proc. Interspeech 2005, 1257-1260, doi: 10.21437/Interspeech.2005-481

@inproceedings{carey05_interspeech,
  author={Michael J. Carey and Tuan P. Quang},
  title={{A speech similarity distance weighting for robust recognition}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={1257--1260},
  doi={10.21437/Interspeech.2005-481}
}