We follow the paradigm we introduced previously for encoding recognizer parameters in a client-server model for speech recognition over wireless networks and the WWW: the encoding aims to maximize recognition performance rather than perceptual reproduction. We present a new encoding scheme for the mel frequency-warped cepstral coefficients (MFCCs) that uses product-code vector quantization, and we find that a bit rate of just 2000 bits per second is enough to match the recognition performance obtained with high-quality unquantized speech. We also investigate the effect of additive noise on recognition performance when quantized features are used, and we find that a small increase in the bit rate provides the necessary robustness.
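As an illustration of the general idea, product-code vector quantization splits each feature vector into subvectors and quantizes each subvector with its own small codebook, so that only the codeword indices need to be transmitted. The sketch below is a minimal Python example under stated assumptions, not the configuration used in the paper: the 13-dimensional vector, the five-way split, the 4 bits per subvector, the 100 frames per second, and the randomly generated codebooks are all hypothetical values chosen so that the arithmetic lands on 2000 bits per second. In practice the codebooks would be trained on speech data.

```python
import random


def nearest(codebook, sub):
    """Return the index of the codeword closest to sub (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - x) ** 2 for c, x in zip(codebook[i], sub)),
    )


def pq_encode(vec, splits, codebooks):
    """Quantize each subvector vec[start:end] independently with its own codebook."""
    return [nearest(cb, vec[s:e]) for (s, e), cb in zip(splits, codebooks)]


def pq_decode(indices, codebooks):
    """Rebuild the full vector by concatenating the selected codewords."""
    out = []
    for i, cb in zip(indices, codebooks):
        out.extend(cb[i])
    return out


def bit_rate(bits_per_subvector, frames_per_second):
    """Bits per second needed to transmit one index per subvector per frame."""
    return sum(bits_per_subvector) * frames_per_second


# Hypothetical setup (assumptions, not taken from the paper).
SPLITS = [(0, 2), (2, 4), (4, 6), (6, 9), (9, 13)]  # five subvectors of a 13-dim vector
BITS = [4, 4, 4, 4, 4]                              # 4 bits per subvector
FRAMES_PER_SECOND = 100                             # assumed frame rate

# Random codebooks stand in for trained ones, purely for illustration.
rng = random.Random(0)
CODEBOOKS = [
    [[rng.gauss(0.0, 1.0) for _ in range(e - s)] for _ in range(2 ** b)]
    for (s, e), b in zip(SPLITS, BITS)
]

frame = [rng.gauss(0.0, 1.0) for _ in range(13)]    # stand-in for one MFCC frame
indices = pq_encode(frame, SPLITS, CODEBOOKS)       # five indices, each in [0, 15]
approx = pq_decode(indices, CODEBOOKS)              # quantized approximation
rate = bit_rate(BITS, FRAMES_PER_SECOND)            # 20 bits/frame * 100 frames/s = 2000
```

Under these assumptions each frame costs 20 bits, and allocating more bits to some subvectors raises the rate in proportion, which is the kind of trade-off the abstract refers to when it mentions a small increase in bit rate for robustness to noise.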
Cite as: Digalakis, V., Neumeyer, L., Perakakis, M. (1998) Product-code vector quantization of cepstral parameters for speech recognition over the WWW. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0940, doi: 10.21437/ICSLP.1998-644
@inproceedings{digalakis98_icslp,
  author={Vassilis Digalakis and Leonardo Neumeyer and Manolis Perakakis},
  title={{Product-code vector quantization of cepstral parameters for speech recognition over the WWW}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0940},
  doi={10.21437/ICSLP.1998-644}
}