International Workshop on Hands-Free Speech Communication (HSC2001)

April 9-11, 2001
Kyoto, Japan

Acoustic Model for Robust Speech Recognition of Stressed Japanese Speech

Kozo Okuda, Tomoko Matsui and Satoshi Nakamura

ATR Spoken Language Translation Research Laboratories, Kyoto, Japan

When making an error recovery utterance, users of a speech recognition system speak more clearly and slowly. In Japanese, the occurrence of syllable-stressed speech also increases. This paper investigates a method for robustly recognizing syllable-stressed speech uttered for error recovery. In syllable-stressed speech, each syllable is uttered slowly and emphasized. This modification strongly alters the characteristics of each syllable and thereby reduces speech recognition performance. To cope with this problem, we propose an acoustic modeling method that recognizes syllable-stressed speech by combining existing acoustic models. With our method, it is not necessary to collect additional training data. Our results indicate that the proposed method improves recognition performance. Furthermore, the method does not require any expansion of the recognition lexicon or explicit selection of the models.

Bibliographic reference. Okuda, Kozo / Matsui, Tomoko / Nakamura, Satoshi (2001): "Acoustic model for robust speech recognition of stressed Japanese speech", in HSC2001, 123-126.