7th International Conference on Spoken Language Processing
September 16-20, 2002
In remote (or distributed) speech recognition, the recognition features are quantized at the client and transmitted to the server via wireless or packet-based communication for recognition. In this paper, we investigate the robustness of remote speech recognition applications against channel noise. The techniques presented include: 1) optimal soft-decision channel decoding allowing for error detection, 2) weighted Viterbi recognition (WVR) with weighting coefficients based on the channel decoding reliability, 3) frame erasure concealment, and 4) WVR with weighting coefficients based on the quality of the erasure concealment operation. The techniques presented are implemented at the receiver (server), which limits the complexity for the client, and significantly extend the range of channel conditions for which remote recognition can be sustained. As a case study, we illustrate that remote recognition based on perceptual linear prediction (PLP) coefficients is able to provide good recognition accuracy at less than 500 bps over a wide range of channel conditions.
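The abstract does not spell out how the weighting enters the Viterbi search. A minimal sketch of one plausible reading follows, assuming each frame's observation log-likelihood is scaled by a reliability weight between 0 and 1, so that a frame judged unreliable by the channel decoder contributes less to the path score. The function name `weighted_viterbi`, its argument layout, and the toy two-state model are hypothetical illustrations, not taken from the paper.

```python
import math


def weighted_viterbi(log_init, log_trans, log_emit, weights):
    """Viterbi search with per-frame reliability weights (illustrative sketch).

    log_init[j]     : log probability of starting in state j
    log_trans[i][j] : log transition probability from state i to state j
    log_emit[t][j]  : log likelihood of frame t under state j
    weights[t]      : assumed reliability in [0, 1]; 0 ignores frame t,
                      1 gives standard Viterbi for that frame
    """
    n_states = len(log_init)
    score = [log_init[j] + weights[0] * log_emit[0][j] for j in range(n_states)]
    backptr = []
    for t in range(1, len(log_emit)):
        new_score, ptr = [], []
        for j in range(n_states):
            # Best predecessor depends only on past scores and transitions.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            ptr.append(best_i)
            # The emission term is scaled by the frame's reliability weight.
            new_score.append(
                score[best_i] + log_trans[best_i][j] + weights[t] * log_emit[t][j]
            )
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda j: score[j])
    best_score = score[state]
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best_score


# Toy example: frame 1 is "corrupted" and strongly points to the wrong state.
log = math.log
log_init = [log(0.5), log(0.5)]
log_trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]
log_emit = [
    [log(0.99), log(0.01)],
    [log(0.001), log(0.999)],  # corrupted frame
    [log(0.99), log(0.01)],
]
path_unweighted, _ = weighted_viterbi(log_init, log_trans, log_emit, [1.0, 1.0, 1.0])
path_weighted, _ = weighted_viterbi(log_init, log_trans, log_emit, [1.0, 0.0, 1.0])
```

In this toy case, setting the weight of the corrupted frame to zero keeps the decoded path in state 0 throughout, whereas the unweighted search is pulled into state 1 for that frame. Intermediate weights would give a graded trade-off between trusting the received features and relying on the model's transition structure.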
Bibliographic reference. Bernard, Alexis / Alwan, Abeer (2002): "Channel noise robustness for low-bitrate remote speech recognition", In ICSLP-2002, 2213-2216.