September 22-25, 1997
This paper presents a new training approach for improving recognition of speech produced under emotional and environmental stress. The approach trains a speech recognizer on synthetically generated speech for each stress condition, using the stress perturbation models previously formulated in [4, 1]. Those models statistically capture the parameter variations of angry, loud, and Lombard effect speech, and were employed in an analysis-synthesis scheme for generating stressed synthetic speech from isolated neutral speech. Two training methods that use the synthetically generated stressed speech are presented: speaker-independent and speaker-adaptive. Both outperform neutral-trained recognizers when tested with angry, loud, and Lombard effect speech.
Bibliographic reference. Bou-Ghazale, Sahar E. / Hansen, John H. L. (1997): "A novel training approach for improving speech recognition under adverse stressful conditions", In EUROSPEECH-1997, 2387-2390.