7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Channel Noise Robustness for Low-Bitrate Remote Speech Recognition

Alexis Bernard, Abeer Alwan

University of California at Los Angeles, USA

In remote (or distributed) speech recognition, the recognition features are quantized at the client and transmitted to the server via wireless or packet-based communication for recognition. In this paper, we investigate the robustness of remote speech recognition applications against channel noise. The techniques presented include: 1) optimal soft-decision channel decoding allowing for error detection, 2) weighted Viterbi recognition (WVR) with weighting coefficients based on the channel decoding reliability, 3) frame erasure concealment, and 4) WVR with weighting coefficients based on the quality of the erasure concealment operation. The techniques presented are implemented at the receiver (server), which limits the complexity for the client and significantly extends the range of channel conditions for which remote recognition can be sustained. As a case study, we illustrate that remote recognition based on perceptual linear prediction (PLP) coefficients can provide good recognition accuracy at less than 500 bps over a wide range of channel conditions.
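
To make the weighted Viterbi idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes one common formulation in which each frame's log observation likelihood is scaled by a reliability weight between 0 (frame ignored) and 1 (frame fully trusted). The function name `weighted_viterbi`, the toy two-state model, and all probabilities are hypothetical and chosen only for illustration.

```python
import math

def weighted_viterbi(log_init, log_trans, log_obs, weights):
    """Viterbi decoding with per-frame reliability weights (illustrative sketch).

    log_init[j]     : log prior of starting in state j
    log_trans[i][j] : log probability of moving from state i to state j
    log_obs[t][j]   : log likelihood of frame t under state j
    weights[t]      : assumed reliability of frame t, in [0, 1]
    Returns (best_state_path, best_log_score).
    """
    n_states = len(log_init)
    # Frame 0: the weight scales the observation term only.
    score = [log_init[j] + weights[0] * log_obs[0][j] for j in range(n_states)]
    backpointers = []
    for t in range(1, len(log_obs)):
        new_score, ptr = [], []
        for j in range(n_states):
            # Best predecessor for state j, based on path score and transition.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            new_score.append(score[best_i] + log_trans[best_i][j]
                             + weights[t] * log_obs[t][j])
            ptr.append(best_i)
        score = new_score
        backpointers.append(ptr)
    # Trace back from the best final state.
    last = max(range(n_states), key=lambda j: score[j])
    path = [last]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]

# Toy two-state model; frame 1 is imagined to be corrupted by the channel.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
log_obs = [[math.log(0.95), math.log(0.05)],
           [math.log(0.001), math.log(0.999)],
           [math.log(0.95), math.log(0.05)]]

trusted_path, _ = weighted_viterbi(log_init, log_trans, log_obs, [1.0, 1.0, 1.0])
discounted_path, _ = weighted_viterbi(log_init, log_trans, log_obs, [1.0, 0.0, 1.0])
```

In this toy case, trusting every frame lets the corrupted frame pull the decoded path into state 1 for one frame, while setting that frame's weight to 0 leaves the decision to the transition probabilities and the neighbouring frames. How the weights would be derived from channel decoding reliability or erasure concealment quality is described in the full paper and is not modelled here.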



Bibliographic reference.  Bernard, Alexis / Alwan, Abeer (2002): "Channel noise robustness for low-bitrate remote speech recognition", In ICSLP-2002, 2213-2216.