Better Evaluation of ASR in Speech Translation Context Using Word Embeddings

Ngoc-Tien Le, Christophe Servan, Benjamin Lecouteux, Laurent Besacier


This paper investigates the evaluation of automatic speech recognition (ASR) in a spoken language translation (SLT) context. More precisely, we propose a simple extension of the WER metric that penalizes substitution errors differently according to their context, using word embeddings. For instance, the proposed metric should catch near matches (mainly morphological variants) and penalize this kind of error less, since it has a more limited impact on translation performance. Our experiments show that the proposed metric correlates better with SLT performance than WER does. Oracle experiments are also conducted and show the ability of our metric to find better hypotheses (to be translated) in the ASR N-best list. Finally, a preliminary experiment in which ASR tuning is based on our new metric shows encouraging results. For reproducibility, the code for calling our modified WER and the corpora used are made available to the research community.
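
The idea of a WER variant with embedding-dependent substitution costs can be illustrated with a minimal sketch. This is not the authors' released code. The abstract does not give the exact cost function, so the sketch assumes a substitution cost of 1 minus the cosine similarity of the two word vectors, clipped to [0, 1], with a full cost of 1 for words that have no embedding. The function names and the toy two-dimensional vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def substitution_cost(ref_word, hyp_word, embeddings):
    """Assumed cost: 0 for a match, 1 - cosine similarity otherwise, clipped to [0, 1]."""
    if ref_word == hyp_word:
        return 0.0
    if ref_word not in embeddings or hyp_word not in embeddings:
        return 1.0  # no embedding available: fall back to the plain WER cost
    sim = cosine_similarity(embeddings[ref_word], embeddings[hyp_word])
    return min(1.0, max(0.0, 1.0 - sim))


def embedding_weighted_wer(reference, hypothesis, embeddings):
    """Weighted edit distance over words, normalized by the reference length.

    Insertions and deletions cost 1, as in standard WER; only the
    substitution cost depends on the embeddings.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the cost of aligning the first i-1 reference words
    # with the first j hypothesis words.
    prev = [float(j) for j in range(len(hyp) + 1)]
    for i in range(1, len(ref) + 1):
        curr = [float(i)] + [0.0] * len(hyp)
        for j in range(1, len(hyp) + 1):
            sub = prev[j - 1] + substitution_cost(ref[i - 1], hyp[j - 1], embeddings)
            dele = prev[j] + 1.0
            ins = curr[j - 1] + 1.0
            curr[j] = min(sub, dele, ins)
        prev = curr
    return prev[-1] / len(ref)


# Toy vectors, invented for illustration only.
toy_embeddings = {
    "translate": [0.6, 0.8],
    "translated": [0.8, 0.6],
    "banana": [-1.0, 0.0],
}

near_match = embedding_weighted_wer(
    "we translate speech", "we translated speech", toy_embeddings
)
far_match = embedding_weighted_wer(
    "we translate speech", "we banana speech", toy_embeddings
)
```

With an empty embedding table the function reduces to standard WER. With the toy vectors above, the morphological variant ("translate" vs. "translated") is penalized far less than the unrelated substitution, which is the behavior the abstract describes.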


DOI: 10.21437/Interspeech.2016-464

Cite as

Le, N., Servan, C., Lecouteux, B., Besacier, L. (2016) Better Evaluation of ASR in Speech Translation Context Using Word Embeddings. Proc. Interspeech 2016, 2538-2542.

BibTeX
@inproceedings{Le+2016,
author={Ngoc-Tien Le and Christophe Servan and Benjamin Lecouteux and Laurent Besacier},
title={Better Evaluation of ASR in Speech Translation Context Using Word Embeddings},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-464},
url={http://dx.doi.org/10.21437/Interspeech.2016-464},
pages={2538--2542}
}