9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Discriminative Rescoring Based on Minimization of Word Errors for Transcribing Broadcast News

Akio Kobayashi, Takahiro Oku, Shinichi Homma, Shoei Sato, Toru Imai, Tohru Takagi

NHK, Japan

This paper describes a novel rescoring method that reflects the error tendencies of word hypotheses in speech recognition for transcribing broadcast news, including spontaneous speech for which the models are poorly trained. The proposed rescoring assigns penalties to sentence hypotheses according to the recognition error tendencies observed in the training lattices themselves, using a set of weighting factors for feature functions that are activated by a variety of linguistic contexts. Word hypotheses that are unlikely to be correct are penalized, while those that are likely to be correct are rewarded by the weighting factors. We introduce two training techniques for obtaining the factors. The first is based on conditional random fields (CRFs), and the second is based on the minimization of word errors, which explicitly reduces the expected number of word errors. In experiments on transcribing Japanese broadcast news, the method achieved a word error rate (WER) of 10.38%, a 6.06% relative reduction compared with conventional lattice rescoring.
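The rescoring idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it works on a small N-best list instead of lattices, uses word bigrams as hypothetical feature functions, and sets the weighting factors by hand instead of training them with CRFs or word-error minimization. All names (`features`, `rescore`, `expected_word_errors`) and all scores are assumptions made for illustration.

```python
import math

def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two word lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]

def features(words):
    """Hypothetical feature functions: counts of word bigrams."""
    feats = {}
    for a, b in zip(["<s>"] + words, words + ["</s>"]):
        feats[(a, b)] = feats.get((a, b), 0) + 1
    return feats

def rescore(hyps, weights):
    """Add the weighted feature sum to each baseline log score.

    Negative weights act as penalties, positive weights as rewards.
    """
    scored = []
    for words, base in hyps:
        adjustment = sum(weights.get(f, 0.0) * c
                         for f, c in features(words).items())
        scored.append((base + adjustment, words))
    return scored

def expected_word_errors(hyps, weights, ref):
    """Expected word errors under the posterior implied by the scores."""
    scored = rescore(hyps, weights)
    top = max(s for s, _ in scored)
    exps = [math.exp(s - top) for s, _ in scored]
    z = sum(exps)
    return sum((e / z) * edit_distance(ref, w)
               for e, (_, w) in zip(exps, scored))

# Toy data (assumed): the baseline prefers the erroneous hypothesis.
ref = "the news starts now".split()
hyps = [("the news start now".split(), -10.0),
        ("the news starts now".split(), -10.5)]
# Hand-set weights standing in for trained weighting factors.
weights = {("news", "start"): -1.0, ("news", "starts"): 0.5}

before = expected_word_errors(hyps, {}, ref)
after = expected_word_errors(hyps, weights, ref)
best = max(rescore(hyps, weights), key=lambda x: x[0])[1]
```

In this toy case the penalty on the bigram that tends to be wrong and the reward on the one that tends to be right move the correct hypothesis to the top, and the expected word errors decrease. A training procedure in the spirit of the second technique would adjust `weights` to lower that expectation over a training set.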

Bibliographic reference.  Kobayashi, Akio / Oku, Takahiro / Homma, Shinichi / Sato, Shoei / Imai, Toru / Takagi, Tohru (2008): "Discriminative rescoring based on minimization of word errors for transcribing broadcast news", In INTERSPEECH-2008, 1574-1577.