16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Prediction of Speech Recognition Accuracy for Utterance Classification

Maxim L. Korenevsky, Andrey B. Smirnov, Valentin S. Mendelev

ITMO University, Russia

This paper addresses the problem of predicting speech recognition quality and filtering out poorly recognized utterances when no reference transcripts are available. In the proposed system, word error rate (WER) predictions for individual utterances are made using conditional random fields (CRF), and the utterances are then classified against a given threshold. We propose using a boosting technique, which significantly increases recall at high precision values. We also apply recurrent neural networks (RNN) directly to the utterance classification task and obtain comparable results with a much simpler system. All experiments were carried out on Russian spontaneous conversational speech.
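As a rough illustration of the quantities mentioned in the abstract, the following minimal Python sketch computes WER for one utterance, accepts utterances whose predicted WER does not exceed a threshold, and scores that decision with precision and recall. It is not the authors' system: the CRF, boosting, and RNN models are not reproduced, and the function names (`word_error_rate`, `accept_by_threshold`, `precision_recall`) are hypothetical. It also assumes that an utterance counts as well recognized when its true WER is at or below the same threshold.

```python
from typing import List, Sequence, Tuple


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


def accept_by_threshold(predicted_wer: Sequence[float], threshold: float) -> List[bool]:
    """Accept an utterance when its predicted WER is at or below the threshold."""
    return [w <= threshold for w in predicted_wer]


def precision_recall(accepted: Sequence[bool],
                     truly_good: Sequence[bool]) -> Tuple[float, float]:
    """Precision and recall of the 'accepted' decision against true labels."""
    true_positives = sum(a and g for a, g in zip(accepted, truly_good))
    n_accepted = sum(accepted)
    n_good = sum(truly_good)
    precision = true_positives / n_accepted if n_accepted else 0.0
    recall = true_positives / n_good if n_good else 0.0
    return precision, recall
```

In such a setup, the true WER (which needs reference transcripts) would be used only to label held-out utterances for evaluation, while the accept or reject decision at run time would rely on the predicted WER alone.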

Bibliographic reference. Korenevsky, Maxim L. / Smirnov, Andrey B. / Mendelev, Valentin S. (2015): "Prediction of speech recognition accuracy for utterance classification", in INTERSPEECH-2015, 1275-1279.