This paper describes a novel rescoring method that reflects the tendencies of errors in word hypotheses in speech recognition for transcribing broadcast news, including ill-trained spontaneous speech. The proposed rescoring assigns penalties to sentence hypotheses according to the recognition error tendencies observed in the training lattices themselves, using a set of weighting factors for feature functions that are activated by a variety of linguistic contexts. The weighting factors penalize word hypotheses that have a low likelihood of being correct and reward those that have a high likelihood. We introduce two training techniques to obtain the factors. The first is based on conditional random fields (CRFs); the second is based on the minimization of word errors and explicitly reduces the expected number of word errors. In experiments on transcribing Japanese broadcast news, the method achieved a word error rate (WER) of 10.38%, a 6.06% relative reduction compared with conventional lattice rescoring.
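The rescoring idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an n-best list instead of a lattice, and the unigram and bigram feature functions, the example sentences, the baseline scores, and the weights are all hypothetical. Each sentence hypothesis receives its baseline score plus the weighted sum of the feature functions it activates, so negative weights act as penalties and positive weights as rewards.

```python
from collections import Counter


def features(words):
    """Count hypothetical unigram and bigram context features for one hypothesis."""
    feats = Counter()
    padded = ["<s>"] + list(words) + ["</s>"]
    for prev, cur in zip(padded, padded[1:]):
        feats[("uni", cur)] += 1
        feats[("bi", prev, cur)] += 1
    return feats


def rescore(hypotheses, weights):
    """Sort hypotheses by baseline score plus the weighted feature sum (best first)."""
    rescored = []
    for words, base_score in hypotheses:
        # Features without a trained weight contribute nothing.
        adjustment = sum(weights.get(f, 0.0) * c for f, c in features(words).items())
        rescored.append((base_score + adjustment, words))
    rescored.sort(key=lambda item: item[0], reverse=True)
    return rescored


# Assumed toy data: (word sequence, baseline log score from the recognizer).
hypotheses = [
    (["the", "whether", "is", "fine"], -10.0),
    (["the", "weather", "is", "fine"], -10.5),
]
# Assumed weights: penalize an error-prone context, reward a reliable one.
weights = {("bi", "the", "whether"): -1.0, ("bi", "the", "weather"): 0.3}

best_score, best_words = rescore(hypotheses, weights)[0]
print(best_words, best_score)
```

In this toy case the penalty on the error-prone bigram moves the second hypothesis ahead of the first, even though its baseline score is lower. In the paper the weights are learned, either with CRFs or by minimizing expected word errors; here they are set by hand purely for illustration.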
Bibliographic reference. Kobayashi, Akio / Oku, Takahiro / Homma, Shinichi / Sato, Shoei / Imai, Toru / Takagi, Tohru (2008): "Discriminative rescoring based on minimization of word errors for transcribing broadcast news", In INTERSPEECH-2008, 1574-1577.