Interspeech'2005 - Eurospeech
In this paper, we examine a coding scheme for quantising feature vectors in a distributed speech recognition environment that is more robust to noise. It consists of a vector quantiser that operates on the logarithmic filterbank energies (LFBEs). Through the use of a perceptually-weighted Euclidean distance measure, which emphasises the LFBEs that represent the spectral peaks, the vector quantiser codebook provides a priori knowledge of the spectral characteristics of clean speech and is used to quantise features from noise-corrupted speech. Our comparative results from the ETSI Aurora-2 recognition task show that the perceptually-weighted vector quantisation of LFBEs achieves higher recognition accuracies for noisy speech than the unweighted vector quantisation, memoryless and multi-frame GMM-based block quantisation and scalar quantisation of Mel frequency-warped cepstral coefficients.
Bibliographic reference. So, Stephen / Paliwal, Kuldip K. (2005): "Improved noise-robustness in distributed speech recognition via perceptually-weighted vector quantisation of filterbank energies", In INTERSPEECH-2005, 941-944.