Sixth International Conference on Spoken Language Processing
We report results of large vocabulary continuous speech recognition (LVCSR) experiments conducted using speech data read over cellular and landline phones. Specifically, using stereo recordings, we compare the speaker-independent and speaker-adapted recognition word error rates (WERs) measured over cellular and landline networks with those measured using a close-talking noise-canceling headset microphone, which serves as a baseline. The test set consists of speech data recorded by 25 speakers, each of whom provides both test and adaptation data. We use acoustic models trained from relatively high-quality training data and an interpolated trigram language model. We also provide some insight into the relative degradation in WERs over telephone networks by examining the recognition error rates for bandlimited and coded microphone speech.
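For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The function names `word_error_rate` and `relative_degradation`, and the example sentences, are illustrative assumptions and are not taken from the paper; the paper's own scoring procedure may differ in details such as text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_degradation(baseline_wer: float, channel_wer: float) -> float:
    """Hypothetical helper: WER increase relative to a baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (channel_wer - baseline_wer) / baseline_wer


if __name__ == "__main__":
    # Toy transcripts for illustration only; not data from the paper.
    wer = word_error_rate("call the office now", "call an office")
    print(f"WER = {wer:.2f}")
```

Under this definition, a channel's degradation relative to the headset microphone baseline could be expressed with `relative_degradation(baseline_wer, channel_wer)`, again as an assumed convention rather than the authors' stated formula.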
Bibliographic reference. Rao, Ashwin / Roth, Bob / Nagesha, Venkatesh / McAllaster, Don / Liberman, Natalie / Gillick, Larry (2000): "Large vocabulary continuous speech recognition of read speech over cellular and landline networks", In ICSLP-2000, vol.4, 402-405.