INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Time- and Acoustic-Mediated Alignment Algorithms for Speech Recognition Evaluation

Simon Dobrišek, France Mihelič

University of Ljubljana, Slovenia

This paper investigates time- and acoustic-mediated alignment algorithms that can be used for better speech recognition evaluation. The edit-cost function, which weights speech unit matches, substitutions, deletions, and insertions, is defined as a function of timed symbols or even as a function of speech signal segments. The algorithms are compared using several classical statistical measures of different types that are derived from speech recognition confusion matrices and are normally used to measure the agreement between different classifications of the same set of objects. These measures give a reasonable indication that the investigated algorithms yield more relevant speech recognition error statistics than the alignment algorithms commonly used for this purpose.
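As an illustration of the idea of an edit-cost function defined on timed symbols, the following is a minimal sketch of a dynamic-programming alignment in which the cost of pairing a reference unit with a hypothesis unit depends on how much the two overlap in time. The cost function `pair_cost`, the fixed insertion/deletion cost `indel`, and the `(label, start, end)` token format are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Optional, Tuple

# Assumed token format: (label, start_time, end_time), times in seconds.
Token = Tuple[str, float, float]


def overlap(a: Token, b: Token) -> float:
    """Length of the temporal overlap between two timed tokens."""
    return max(0.0, min(a[2], b[2]) - max(a[1], b[1]))


def pair_cost(ref: Token, hyp: Token) -> float:
    """Hypothetical pairing cost: a label mismatch costs 1, and poor
    temporal overlap adds up to 1 more (0 when the intervals coincide)."""
    longest = max(ref[2] - ref[1], hyp[2] - hyp[1])
    time_penalty = 1.0 - overlap(ref, hyp) / longest if longest > 0 else 1.0
    label_penalty = 0.0 if ref[0] == hyp[0] else 1.0
    return label_penalty + time_penalty


def align(ref: List[Token], hyp: List[Token],
          indel: float = 1.0) -> List[Tuple[Optional[str], Optional[str]]]:
    """Align two timed sequences; returns (ref_label, hyp_label) pairs,
    with None marking an insertion or a deletion."""
    n, m = len(ref), len(hyp)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0], back[i][0] = i * indel, "del"
    for j in range(1, m + 1):
        cost[0][j], back[0][j] = j * indel, "ins"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (cost[i - 1][j - 1] + pair_cost(ref[i - 1], hyp[j - 1]), "pair"),
                (cost[i - 1][j] + indel, "del"),
                (cost[i][j - 1] + indel, "ins"),
            ]
            cost[i][j], back[i][j] = min(options)
    # Trace back from the bottom-right corner to recover the alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        op = back[i][j]
        if op == "pair":
            pairs.append((ref[i - 1][0], hyp[j - 1][0]))
            i, j = i - 1, j - 1
        elif op == "del":
            pairs.append((ref[i - 1][0], None))
            i -= 1
        else:
            pairs.append((None, hyp[j - 1][0]))
            j -= 1
    pairs.reverse()
    return pairs


reference = [("a", 0.0, 0.5), ("b", 0.5, 1.0), ("c", 1.0, 1.5)]
hypothesis = [("a", 0.0, 0.5), ("x", 1.0, 1.5)]
print(align(reference, hypothesis))
```

In this toy example a plain symbol-only edit distance cannot tell whether "x" replaced "b" or "c", since both alignments cost the same. With the time-dependent cost, "x" is paired with "c", the unit it overlaps in time, and "b" is counted as a deletion.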


Bibliographic reference. Dobrišek, Simon / Mihelič, France (2011): "Time- and acoustic-mediated alignment algorithms for speech recognition evaluation", in INTERSPEECH-2011, 1517-1520.