Third European Conference on Speech Communication and Technology

Berlin, Germany
September 22-25, 1993


A Bayesian Approach to Phone Duration Adaptation for Lombard Speech Recognition

Olivier Siohan, Yifan Gong, Jean-Paul Haton

CRIN-CNRS/INRIA Lorraine, BP 239, Vandoeuvre-les-Nancy, France

Speech recognition under noisy conditions is of great practical interest. When a speech recognition system is trained under clean conditions and tested under noisy conditions, recognition rates are very low because of the mismatch between clean and noisy speech. Many methods have been developed in the past few years to improve the robustness of speech recognizers in adverse conditions. All of these techniques aim to reduce mismatches between clean and noisy speech in the spectral domain. In the presence of noise, however, speech is also modified by the Lombard effect. This changes the speech signal, and in particular the duration of each phone. In this paper, we try to reduce the mismatch between phone durations in Lombard and clean speech. We use the framework of Bayesian adaptation to re-estimate the parameters of the duration model for each phone. In tests on a vocabulary of 49 alphanumeric items in isolated mode, we found that adapting the phone duration models improves the recognition rate, especially for female speakers.

Keywords: Lombard speech recognition, Bayesian adaptation, Phone duration model.
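
To make the idea of Bayesian re-estimation of a duration parameter concrete, the following is a minimal illustrative sketch, not the authors' method. The abstract does not state the form of the duration model, so the sketch assumes that each phone's duration is Gaussian with a known variance and that the clean-speech mean serves as a Gaussian prior on the Lombard-speech mean. The function name `map_adapt_mean` and all of its parameters are hypothetical.

```python
def map_adapt_mean(prior_mean, prior_var, obs_var, durations):
    """MAP estimate of a Gaussian mean under a Gaussian prior.

    Assumptions (not taken from the paper): durations are Gaussian with
    known variance obs_var, and the prior on the mean is Gaussian with
    mean prior_mean (e.g. from clean speech) and variance prior_var.
    """
    n = len(durations)
    if n == 0:
        # No adaptation data: fall back to the prior (clean-speech) mean.
        return prior_mean
    sample_mean = sum(durations) / n
    # Weight on the adaptation data grows with n and with prior uncertainty.
    weight = n * prior_var / (n * prior_var + obs_var)
    return weight * sample_mean + (1.0 - weight) * prior_mean


# Hypothetical usage: clean-speech mean of 10 frames, two Lombard tokens of 14 frames.
adapted = map_adapt_mean(prior_mean=10.0, prior_var=4.0, obs_var=4.0,
                         durations=[14.0, 14.0])
```

Under these assumptions, the adapted mean moves from the clean-speech value toward the mean observed in the adaptation data as more Lombard tokens become available, and stays at the prior when there are none.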


Bibliographic reference. Siohan, Olivier / Gong, Yifan / Haton, Jean-Paul (1993): "A Bayesian approach to phone duration adaptation for Lombard speech recognition", in EUROSPEECH'93, 1639-1642.