EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Context-Sensitive Evaluation and Correction of Phone Recognition Output

Michael Levit (1), Hiyan Alshawi (1), Allen Gorin (1), Elmar Nöth (2)

(1) AT&T Labs-Research, USA
(2) Universität Erlangen-Nürnberg, Germany

In speech and language processing, information about the errors made by a learning system is commonly used to assess and improve its performance. Because of the high computational complexity involved, the context of the errors is usually either ignored or exploited only in a simplistic form. For phone recognition, however, the small lexicon makes this complexity tractable, so exhaustive modeling of local context is possible in phone-based systems. Furthermore, recent studies have shown phone recognition to be useful for several spoken language processing tasks. In this paper, we present a mechanism that learns patterns of context-sensitive errors from ASR output aligned with the "true" phone transcriptions. We also show how this information, encoded as a context-sensitive weighted transducer, can provide a modest improvement in phone recognition accuracy even when no transcriptions are available for the domain of interest.
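
As a rough illustration of learning context-sensitive error patterns from aligned output, the following Python sketch counts how often a recognized phone, seen between its left and right recognized neighbors, corresponds to each "true" phone, and turns the counts into negative log probabilities of the kind a weighted transducer could carry. It is not the method of the paper: the one-phone context window, the `EPS` and `PAD` symbols, the function names, and the greedy per-position correction (used here in place of transducer composition) are all assumptions made for the example. The input is assumed to be aligned already, and phones deleted by the recognizer are ignored.

```python
import math
from collections import Counter, defaultdict

EPS = "<eps>"  # hypothetical symbol for "no phone" in an alignment
PAD = "#"      # hypothetical utterance-boundary symbol


def learn_error_weights(aligned_utterances):
    """Estimate -log P(true | left, recognized, right) from aligned pairs.

    Each utterance is a list of (recognized, true) phone pairs. Pairs whose
    recognized side is EPS (phones the recognizer deleted) are skipped,
    which is a simplification of this sketch.
    """
    counts = defaultdict(Counter)
    for pairs in aligned_utterances:
        kept = [(r, t) for r, t in pairs if r != EPS]
        recognized = [PAD] + [r for r, _ in kept] + [PAD]
        for i, (_, true_phone) in enumerate(kept, start=1):
            context = (recognized[i - 1], recognized[i], recognized[i + 1])
            counts[context][true_phone] += 1

    weights = {}
    for context, true_counts in counts.items():
        total = sum(true_counts.values())
        weights[context] = {
            phone: -math.log(count / total)
            for phone, count in true_counts.items()
        }
    return weights


def correct(recognized, weights):
    """Greedily replace each phone by its lowest-weight "true" phone.

    Contexts never seen in training leave the phone unchanged; a best
    candidate of EPS means the recognized phone is dropped as an insertion.
    """
    padded = [PAD] + list(recognized) + [PAD]
    output = []
    for i in range(1, len(padded) - 1):
        context = (padded[i - 1], padded[i], padded[i + 1])
        candidates = weights.get(context)
        best = min(candidates, key=candidates.get) if candidates else padded[i]
        if best != EPS:
            output.append(best)
    return output
```

Because the weights are learned once from transcribed data, they can then be applied to recognizer output from a domain without transcriptions, which is the setting the abstract describes. A real system would apply them through weighted transducer composition and a best-path search rather than the position-by-position lookup used here.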

Bibliographic reference.  Levit, Michael / Alshawi, Hiyan / Gorin, Allen / Nöth, Elmar (2003): "Context-sensitive evaluation and correction of phone recognition output", In EUROSPEECH-2003, 925-928.