5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Product-Code Vector Quantization of Cepstral Parameters for Speech Recognition Over the WWW

Vassilis Digalakis (1), Leonardo Neumeyer (2), Manolis Perakakis (1)

(1) Technical University of Crete, Greece
(2) SRI International, USA

Following the paradigm we introduced previously for encoding recognizer parameters in a client-server model for speech recognition over wireless networks and the WWW, we aim to maximize recognition performance rather than perceptual reproduction quality. We present a new encoding scheme for the mel frequency-warped cepstral coefficients (MFCCs) that uses product-code vector quantization, and we find that a bit rate of just 2000 bits per second is sufficient to match the recognition performance obtained with high-quality unquantized speech. We also investigate the effect of additive noise on recognition performance when quantized features are used, and we find that a small increase in the bit rate provides the necessary robustness.
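
To make the idea of product-code vector quantization concrete, the following is a minimal sketch, not the scheme from the paper. It assumes the general textbook form of the technique: the feature vector is split into subvectors, each subvector is quantized independently against its own codebook by nearest-neighbor search, and only the codeword indices are transmitted. The function names, the toy partition, the toy codebooks, and the frame rate are all hypothetical and chosen only for illustration.

```python
import math


def nearest_index(subvector, codebook):
    """Index of the codeword closest to subvector (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((x - c) ** 2 for x, c in zip(subvector, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def encode(vector, partition, codebooks):
    """Split vector into consecutive subvectors and quantize each one separately.

    partition lists the width of each subvector; codebooks holds one codebook
    per subvector. Returns one codeword index per subvector.
    """
    indices, start = [], 0
    for width, codebook in zip(partition, codebooks):
        indices.append(nearest_index(vector[start:start + width], codebook))
        start += width
    return indices


def decode(indices, codebooks):
    """Rebuild an approximate vector by concatenating the selected codewords."""
    reconstructed = []
    for index, codebook in zip(indices, codebooks):
        reconstructed.extend(codebook[index])
    return reconstructed


def bit_rate(codebooks, frames_per_second):
    """Bits per second when one index per codebook is sent for every frame."""
    bits_per_frame = sum(math.ceil(math.log2(len(cb))) for cb in codebooks)
    return bits_per_frame * frames_per_second


# Hypothetical toy setup: a 3-dimensional vector split into widths 2 and 1.
PARTITION = [2, 1]
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [0.0, 2.0]],  # 4 codewords, 2 bits
    [[-1.0], [1.0]],                                   # 2 codewords, 1 bit
]

indices = encode([0.9, 1.2, 0.4], PARTITION, CODEBOOKS)
approximation = decode(indices, CODEBOOKS)
rate = bit_rate(CODEBOOKS, frames_per_second=100)  # frame rate is an assumption
```

In this toy setup each frame costs 3 bits. By the same arithmetic, if frames were sent at 100 per second (an assumption, since the abstract does not state the frame rate), 2000 bits per second would correspond to 20 bits per frame distributed across the sub-codebooks.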

Bibliographic reference. Digalakis, Vassilis / Neumeyer, Leonardo / Perakakis, Manolis (1998): "Product-code vector quantization of cepstral parameters for speech recognition over the WWW", in ICSLP-1998, paper 0940.