This paper addresses the reliability of the confidence measures provided by automatic speech recognition systems for use in spoken language processing applications. We propose a method based on conditional random fields that combines contextual features to improve word-level confidence measures. The method draws on various knowledge sources (acoustic, lexical, linguistic, phonetic and morphosyntactic) and explicitly exploits context information. Experiments were conducted on a large French broadcast news corpus from the ESTER benchmark. Results demonstrate the added value of our method, with significant improvements in both normalized cross entropy and equal error rate.
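For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how normalized cross entropy (NCE) and equal error rate (EER) are commonly computed for word-level confidence scores. It is not the authors' evaluation code. It assumes each recognized word carries a confidence in [0, 1] and a binary label (1 = correctly recognized, 0 = error); the function names, the `eps` clamp and the threshold sweep are illustrative choices, not taken from the paper.

```python
import math


def normalized_cross_entropy(confidences, labels, eps=1e-12):
    """NCE of confidence scores against binary correctness labels.

    Assumed (hypothetical) interface: labels[i] is 1 if word i is correct, else 0.
    NCE is 0 when every score equals the prior word accuracy and approaches 1
    for perfect scores; it can be negative for misleading scores.
    """
    n_total = len(labels)
    n_correct = sum(labels)
    if n_correct == 0 or n_correct == n_total:
        raise ValueError("need at least one correct and one incorrect word")
    p_c = n_correct / n_total  # prior probability that a word is correct
    # Entropy (in bits) of the baseline that always outputs p_c.
    h_max = -(n_correct * math.log2(p_c)
              + (n_total - n_correct) * math.log2(1 - p_c))
    log_lik = 0.0
    for c, y in zip(confidences, labels):
        c = min(max(c, eps), 1 - eps)  # clamp to avoid log(0)
        log_lik += math.log2(c) if y else math.log2(1 - c)
    return (h_max + log_lik) / h_max


def equal_error_rate(confidences, labels):
    """Approximate EER by sweeping every observed score as a threshold.

    A word is accepted when its confidence is >= the threshold. The value
    returned is the mean of the false acceptance and false rejection rates
    at the threshold where the two are closest.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one correct and one incorrect word")
    best_gap, eer = None, None
    for t in sorted(set(confidences)) + [float("inf")]:
        false_accepts = sum(1 for c, y in zip(confidences, labels) if c >= t and not y)
        false_rejects = sum(1 for c, y in zip(confidences, labels) if c < t and y)
        far = false_accepts / n_neg
        frr = false_rejects / n_pos
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

Higher NCE and lower EER both indicate more reliable confidence measures, which is the direction of improvement reported in the abstract.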
Bibliographic reference. Fayolle, Julien / Moreau, Fabienne / Raymond, Christian / Gravier, Guillaume / Gros, Patrick (2010): "CRF-based combination of contextual features to improve a posteriori word-level confidence measures", In INTERSPEECH-2010, 1942-1945.