ISCA Archive Interspeech 2009

Word confidence using duration models

Stefano Scanzio, Pietro Laface, Daniele Colibro, Roberto Gemello

In this paper, we propose a word confidence measure based on phone durations that depend on large contexts. The measure is based on the expected duration of each recognized phone in a word. In the proposed approach, the duration of each phone is in principle context-dependent, and the measure is a function of the distance between the observed and expected phone duration distributions within a word. Our experiments show that, since the "duration confidence" does not make use of any acoustic information, its Equal Error Rate (EER) in terms of False Accept and False Rejection rates is not as good as that obtained with the more informed acoustic confidence measure. However, combining the two measures by simple linear interpolation improves the system EER by 6% to 10% relative on an isolated word recognition task in several languages.
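
As an illustration of the combination step described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes both confidence scores lie in [0, 1], that a word is accepted when its score reaches a threshold, and that the EER is approximated by scanning the observed scores as candidate thresholds. The function names (`combine`, `error_rates`, `equal_error_rate`) and the interpolation weight are hypothetical; the abstract does not state the weight used.

```python
def combine(acoustic, duration, weight):
    """Linearly interpolate two confidence scores (weight is an assumed parameter)."""
    return weight * acoustic + (1.0 - weight) * duration


def error_rates(scores, labels, threshold):
    """Return (false_accept_rate, false_reject_rate) at the given threshold.

    labels[i] is True when word i was recognized correctly.
    A word is accepted when its score is >= threshold.
    """
    wrong = [s for s, ok in zip(scores, labels) if not ok]
    right = [s for s, ok in zip(scores, labels) if ok]
    far = sum(s >= threshold for s in wrong) / len(wrong)  # errors accepted
    frr = sum(s < threshold for s in right) / len(right)   # correct words rejected
    return far, frr


def equal_error_rate(scores, labels):
    """Approximate the EER as the point where |FAR - FRR| is smallest."""
    best = None
    for t in sorted(set(scores)):
        far, frr = error_rates(scores, labels, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0)
    return best[1]
```

In such a setup, the weight would typically be tuned on held-out data by comparing `equal_error_rate` of the combined scores against that of the acoustic scores alone.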


doi: 10.21437/Interspeech.2009-349

Cite as: Scanzio, S., Laface, P., Colibro, D., Gemello, R. (2009) Word confidence using duration models. Proc. Interspeech 2009, 1207-1210, doi: 10.21437/Interspeech.2009-349

@inproceedings{scanzio09_interspeech,
  author={Stefano Scanzio and Pietro Laface and Daniele Colibro and Roberto Gemello},
  title={{Word confidence using duration models}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1207--1210},
  doi={10.21437/Interspeech.2009-349}
}