5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

A Novel Training Approach for Improving Speech Recognition Under Adverse Stressful Conditions

Sahar E. Bou-Ghazale, John H. L. Hansen

Robust Speech Processing Laboratory, Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina, USA

This paper presents a new training approach for improving the recognition of speech produced under emotional and environmental stress. The approach trains a speech recognizer on synthetically generated speech for each stress condition, using the stress perturbation models previously formulated in [4, 1]. These models statistically characterize the parameter variations observed in angry, loud, and Lombard effect speech, and were employed in an analysis-synthesis scheme to generate stressed synthetic speech from isolated neutral speech. Two training methods that use the synthetically generated stressed speech are presented: speaker-independent and speaker-adaptive training. Both outperform neutral-trained recognizers when tested on angry, loud, and Lombard effect speech.
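The data-generation idea summarized above can be illustrated with a minimal sketch. It is not the perturbation model of [4, 1], which is not reproduced here. It assumes, purely for illustration, that each stress style is described by a Gaussian-distributed multiplicative scale factor per speech parameter. The parameter names (`pitch`, `duration`), the numeric statistics, and the helper names `perturb` and `build_training_set` are all hypothetical.

```python
import random

# Hypothetical per-style statistics: (mean, std) of a multiplicative scale
# factor for each parameter. Placeholder values, NOT taken from the paper.
PERTURBATION = {
    "angry":   {"pitch": (1.5, 0.10), "duration": (0.9, 0.05)},
    "loud":    {"pitch": (1.3, 0.10), "duration": (1.1, 0.05)},
    "lombard": {"pitch": (1.2, 0.10), "duration": (1.1, 0.05)},
}

def perturb(params, style, rng):
    """Scale each neutral parameter by a factor drawn for the given style."""
    stats = PERTURBATION[style]
    return {name: value * rng.gauss(*stats[name])
            for name, value in params.items()}

def build_training_set(neutral_tokens, styles, rng):
    """Pair every neutral token with one synthetic variant per stress style."""
    data = []
    for word, params in neutral_tokens:
        for style in styles:
            data.append((word, style, perturb(params, style, rng)))
    return data

rng = random.Random(0)  # fixed seed so the sketch is repeatable
neutral_tokens = [
    ("help", {"pitch": 120.0, "duration": 0.40}),  # made-up neutral tokens
    ("stop", {"pitch": 118.0, "duration": 0.35}),
]
training_set = build_training_set(neutral_tokens, list(PERTURBATION), rng)
```

The sketch stops at the parameter level. In the approach described above, the perturbed parameters would drive an analysis-synthesis scheme to produce stressed synthetic speech, and a recognizer would then be trained on that speech for each stress condition.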

