10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Comparison of Estimation Techniques in Joint Uncertainty Decoding for Noise Robust Speech Recognition

Haitian Xu, K. K. Chin

Toshiba Research Europe Ltd., UK

Model-based joint uncertainty decoding (JUD) has recently achieved promising results by integrating front-end uncertainty into back-end decoding, with the JUD transforms estimated in a mathematically consistent framework. The transforms can be estimated in different ways, each giving a different JUD method. This paper gives an overview of the estimation techniques in the literature, including data-driven parallel model combination, Taylor series based approximation, and the recently proposed second-order approximation. A new technique based on the unscented transformation is also proposed for the JUD framework. The techniques are compared in terms of both recognition accuracy and computational cost on a database recorded in a real car environment. Experimental results indicate that the unscented transformation is one of the best options for estimating JUD transforms, as it maintains a good balance between accuracy and efficiency.
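As a rough illustration of the unscented transformation mentioned in the abstract, the following minimal sketch propagates Gaussian clean-speech and noise statistics through a nonlinear mismatch function and returns joint statistics of the kind a JUD transform could be derived from. It is not the authors' implementation. The scalar log-spectral mismatch function y = log(exp(x) + exp(n)), the independence of speech and noise, the parameter `kappa`, and the names `mismatch` and `unscented_joint_stats` are all assumptions made for this example.

```python
import math


def mismatch(x, n):
    """Assumed log-spectral mismatch: y = log(exp(x) + exp(n)), computed stably."""
    m = max(x, n)
    return m + math.log(math.exp(x - m) + math.exp(n - m))


def unscented_joint_stats(mu_x, var_x, mu_n, var_n, kappa=1.0):
    """Estimate mean and variance of noisy speech y, and cov(x, y).

    Hypothetical helper: x (clean speech) and n (noise) are treated as
    independent scalar Gaussians, so the augmented state [x, n] has a
    diagonal covariance and the sigma points lie along the two axes.
    """
    d = 2  # dimension of the augmented state [x, n]
    scale = math.sqrt(d + kappa)
    dx = scale * math.sqrt(var_x)
    dn = scale * math.sqrt(var_n)

    # 2d + 1 = 5 sigma points: the mean plus symmetric offsets on each axis.
    points = [
        (mu_x, mu_n),
        (mu_x + dx, mu_n),
        (mu_x - dx, mu_n),
        (mu_x, mu_n + dn),
        (mu_x, mu_n - dn),
    ]
    w0 = kappa / (d + kappa)
    wi = 1.0 / (2.0 * (d + kappa))
    weights = [w0] + [wi] * (2 * d)

    # Push every sigma point through the nonlinearity.
    ys = [mismatch(x, n) for x, n in points]

    # Weighted sample statistics of the transformed points.
    mu_y = sum(w * y for w, y in zip(weights, ys))
    var_y = sum(w * (y - mu_y) ** 2 for w, y in zip(weights, ys))
    cov_xy = sum(
        w * (x - mu_x) * (y - mu_y) for w, (x, _), y in zip(weights, points, ys)
    )
    return mu_y, var_y, cov_xy


if __name__ == "__main__":
    mu_y, var_y, cov_xy = unscented_joint_stats(
        mu_x=5.0, var_x=1.0, mu_n=4.0, var_n=0.5
    )
    print(mu_y, var_y, cov_xy)
```

Only five evaluations of the mismatch function are needed here, which hints at why such a scheme can be cheaper than drawing many samples while still capturing the nonlinearity beyond a first-order expansion.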


Bibliographic reference. Xu, Haitian / Chin, K. K. (2009): "Comparison of estimation techniques in joint uncertainty decoding for noise robust speech recognition", in INTERSPEECH-2009, 2403-2406.