8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

From WER and RIL to MER and WIL: Improved Evaluation Measures for Connected Speech Recognition

Andrew Cameron Morris (1), Viktoria Maier (2), Phil Green (3)

(1) University of Saarland, Germany
(2) Dalle Molle Institute for Perceptual Artificial Intelligence (IDIAP), Switzerland
(3) University of Sheffield, England

The word error rate (WER), commonly used in automatic speech recognition (ASR) assessment, measures the cost of restoring the output word sequence to the original input sequence. However, for most connected speech recognition (CSR) applications apart from dictation machines, a more meaningful performance measure would be the proportion of information communicated. In this article we introduce two new absolute CSR performance measures: MER (match error rate) and WIL (word information lost). MER is the proportion of input/output (I/O) word matches that are errors. WIL is a simple approximation to the proportion of word information lost, and it overcomes the problems associated with the RIL (relative information lost) measure that was proposed half a century ago. Issues relating to ideal performance measurement are discussed, and the commonly used Viterbi input/output alignment procedure, with zero weight for hits and equal weight for substitutions, deletions and insertions, is shown to be optimal.
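As an illustration of the measures named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes the definitions as they are commonly stated for these measures in terms of hit, substitution, deletion and insertion counts (H, S, D, I): WER = (S+D+I)/(H+S+D), MER = (S+D+I)/(H+S+D+I), and WIL = 1 - H^2/((H+S+D)(H+S+I)). The abstract itself does not give these formulas. The function names `align_counts` and `measures` are hypothetical, and the alignment is a plain dynamic-programming edit-distance alignment with cost 0 for hits and cost 1 for each substitution, deletion and insertion, matching the weighting the abstract describes.

```python
from typing import List, Tuple


def align_counts(ref: List[str], hyp: List[str]) -> Tuple[int, int, int, int]:
    """Return (H, S, D, I) for one minimum-cost alignment of ref and hyp.

    Assumed weights: 0 for a hit, 1 for a substitution, deletion or insertion.
    """
    n, m = len(ref), len(hyp)
    # cost[i][j] = minimum cost of aligning ref[:i] with hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i  # delete all i reference words
    for j in range(1, m + 1):
        cost[0][j] = j  # insert all j hypothesis words
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (0 if ref[i - 1] == hyp[j - 1] else 1)
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the end to count each kind of alignment step.
    hits = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and ref[i - 1] == hyp[j - 1]
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (0 if same else 1):
            if same:
                hits += 1
            else:
                subs += 1
            i -= 1
            j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return hits, subs, dels, ins


def measures(h: int, s: int, d: int, i: int) -> Tuple[float, float, float]:
    """Return (WER, MER, WIL) from alignment counts, under the assumed formulas."""
    n_ref = h + s + d  # number of input (reference) words
    n_hyp = h + s + i  # number of output (hypothesis) words
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    errors = s + d + i
    wer = errors / n_ref
    mer = errors / (h + s + d + i)
    # With no output words there are no hits, so all word information is lost.
    wil = 1.0 if n_hyp == 0 else 1.0 - (h * h) / (n_ref * n_hyp)
    return wer, mer, wil


if __name__ == "__main__":
    ref = "the cat sat on the mat".split()
    hyp = "the cat sit on mat".split()
    counts = align_counts(ref, hyp)
    print(counts, measures(*counts))
```

For the example sentence pair, the alignment has four hits, one substitution and one deletion, so WER and MER are both 2/6 and WIL is 1 - 16/30. Under these formulas, a one-word reference recognised as three wrong words gives WER = 3 while MER and WIL both stay at 1, which illustrates why measures bounded in [0, 1] can be easier to interpret.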

Bibliographic reference. Morris, Andrew Cameron / Maier, Viktoria / Green, Phil (2004): "From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognition", in INTERSPEECH-2004, 2765-2768.