INTERSPEECH 2008
9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Improving Large Scale Alphanumeric String Recognition Using Redundant Information

Ea-Ee Jan (1), Osamuyimen Stewart (1), Raymond Co (2), David Lubensky (1)

(1) IBM T.J. Watson Research Center, USA; (2) IBM Canada Global Business Services, Canada

This paper describes a framework for improving recognition performance and user experience for large-scale alphanumeric listings, which are commonly used in enterprise conversational speech applications. The performance of these speech recognition grammars is severely degraded by poor recognition of letters of the alphabet. We propose a new approach that augments recognition with redundant semantic information. This information provides additional acoustic features which, although redundant in the semantic space, improve performance by 30% in a Canadian postal code application and in serial number recognition. The additional queries for redundant semantic information are asked only when necessary, that is, when the system makes false acceptance errors. This ensures that users are not burdened with needless questioning. Furthermore, we propose a way to compress the listing grammar by at least 85% in footprint with minimal performance impact, which is possible because digit recognition performs well. This framework can be extended to general large-scale alphanumeric listing grammars.


Bibliographic reference. Jan, Ea-Ee / Stewart, Osamuyimen / Co, Raymond / Lubensky, David (2008): "Improving large scale alphanumeric string recognition using redundant information", In INTERSPEECH-2008, 491-494.