COST278 and ISCA Tutorial and Research Workshop (ITRW) on Robustness Issues in Conversational Interaction
University of East Anglia, Norwich, UK
The desire for improved user interfaces for distributed speech and multimodal services on mobile devices has created a need for reliable recognition performance over mobile channels. Performance must be robust both to background noise and to errors introduced by the mobile transmission channel. The telecommunications standards bodies (ETSI Aurora and 3GPP) have done much work to develop standards that achieve this. The Aurora interest in noise-robust front-ends is well known, but this paper focuses on channel robustness. Because the general area of channel robustness is very large, the paper takes the perspective of mobile telecommunications standards and the Distributed Speech Recognition (DSR) approach to robustness.
As background, the paper first provides an overview of the work on DSR in different standards bodies: the DSR standards created in ETSI Aurora, the work on Speech Enabled Services in 3GPP, and the transport protocols in the IETF. The different mobile channel types are then reviewed, using the GSM network as a particular example. Drawing on results from the literature and from the standards bodies, the paper compares recognition performance using a voice codec with performance using DSR. The comparison is first made in error-free conditions to isolate the effects of speech compression. Robustness to channel errors is then examined, with both circuit-switched and packet-switched errors. Finally, some more advanced error mitigation techniques are cited. These are compatible with the DSR features and can provide even greater robustness over poor channels.
Bibliographic reference. Pearce, David (2004): "Robustness to transmission channel - the DSR approach", In Robust2004, paper KDP.