7th International Conference on Spoken Language Processing
September 16-20, 2002
In this paper the speech recognition performance obtained when using a Distributed Speech Recognition (DSR) architecture is compared to that obtained when the speech is first transcoded using the Adaptive Multi-Rate (AMR) speech codec at 4.75 and 12.2 kbps. In a like-versus-like comparison, made using the Advanced DSR Front-end and the Aurora reference back-end, the DSR architecture gives substantial gains in speech recognition performance. The evaluations measure the change in Word Error Rate (WER) on the Aurora 2 and Aurora 3 databases with "perfect" endpoints. With AMR 4.75, performance is 50% worse than DSR on Aurora 2 and 47% worse on Aurora 3. Even at the higher data rate of AMR 12.2, AMR is 17% worse than DSR on Aurora 2 and 20% worse on Aurora 3.
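The percentages above are stated as changes in WER relative to DSR. As a minimal sketch, assuming the figures are the usual relative WER change (the abstract does not spell out the formula), the calculation could look like the following. The function name and the WER values are hypothetical and chosen only to illustrate the arithmetic; they are not results from the paper.

```python
def relative_wer_change(wer_baseline: float, wer_system: float) -> float:
    """Return the WER change of a system relative to a baseline, in percent.

    Assumption: "X% worse" means 100 * (WER_system - WER_baseline) / WER_baseline.
    """
    if wer_baseline <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (wer_system - wer_baseline) / wer_baseline


# Hypothetical absolute WERs (in percent), not taken from the paper.
dsr_wer = 10.0
amr_wer = 15.0
print(relative_wer_change(dsr_wer, amr_wer))  # 50.0
```

Under this reading, a positive value means the transcoded system makes proportionally more errors than the DSR baseline, and the result does not depend on the absolute error level of either system.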
Bibliographic reference. Kelleher, Holly / Pearce, David / Ealey, Doug / Mauuary, Laurent (2002): "Speech recognition performance comparison between DSR and AMR transcoded speech", In ICSLP-2002, 1873-1876.