11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Improving ASR Error Detection with Non-Decoder Based Features

Thomas Pellegrini, Isabel Trancoso

INESC-ID Lisboa, Portugal

This study reports error detection experiments for large vocabulary automatic speech recognition (ASR) systems using statistical classifiers. We explored new features gathered from knowledge sources other than the decoder itself: a binary feature that compares the outputs of two different ASR systems word by word; a feature based on the number of hits for the hypothesized bigrams, obtained by querying a very popular Web search engine; and a feature related to automatically inferred topics at the sentence and word levels. Experiments were conducted on a European Portuguese broadcast news corpus. Combining the baseline decoder-based features with two of these additional features led to significant improvements over a baseline using only decoder-based features: the classification error rate (CER) fell from 13.87% to 12.16% with a maximum entropy model, and from 14.01% to 12.39% with linear-chain conditional random fields.
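
As a rough illustration of two ideas named in the abstract, the sketch below computes a word-by-word binary agreement feature between two ASR hypotheses and a classification error rate over correct/error labels. The abstract does not say how the two outputs are aligned, so the use of Python's `difflib.SequenceMatcher` is an assumption, and the function names `agreement_feature` and `classification_error_rate` and the toy sentences are hypothetical.

```python
from difflib import SequenceMatcher


def agreement_feature(primary, secondary):
    """Return one 0/1 flag per word of `primary`.

    A flag is 1 when the word falls inside a block that also appears,
    in order, in `secondary`. The alignment method is an assumption;
    the abstract does not specify one.
    """
    flags = [0] * len(primary)
    matcher = SequenceMatcher(a=primary, b=secondary, autojunk=False)
    for block in matcher.get_matching_blocks():
        # The final block is a sentinel of size 0, so it sets nothing.
        for i in range(block.a, block.a + block.size):
            flags[i] = 1
    return flags


def classification_error_rate(predicted, reference):
    """Fraction of words whose predicted correct/error label is wrong."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("label sequences must be non-empty and equal in length")
    wrong = sum(p != r for p, r in zip(predicted, reference))
    return wrong / len(reference)


# Toy hypotheses from two hypothetical recognizers.
hyp_a = "the cat sat on the mat".split()
hyp_b = "the cat sat in the mat".split()
flags = agreement_feature(hyp_a, hyp_b)

# Toy label sequences: 1 = word judged correct, 0 = word judged an error.
cer = classification_error_rate([1, 1, 0, 1], [1, 0, 0, 1])
```

Here `flags` marks only the fourth word of `hyp_a` as a disagreement, and such a flag could serve as one input feature to a classifier alongside the decoder-based features. The CER helper counts label mismatches in the same sense as the percentages reported above, though the paper's exact scoring setup is not described in the abstract.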

Bibliographic reference. Pellegrini, Thomas / Trancoso, Isabel (2010): "Improving ASR error detection with non-decoder based features", in INTERSPEECH-2010, 1950-1953.