ISCA Archive Eurospeech 2001

Towards the creation of acoustic models for stressed Japanese speech

Kozo Okuda, Tomoko Matsui, Satoshi Nakamura

In error recovery utterances, users of a speech recognition system change their speaking style to help the system recognize the speech. However, this change creates a mismatch between the input speech and the acoustic models, and reduces the performance of the system. This degradation is a serious problem for speech recognition in dialog systems and speech translation systems. In Japanese error recovery utterances, syllable-stressed speech occurs more frequently. In syllable-stressed speech, each syllable is uttered slowly and emphasized. This modification strongly alters the characteristics of each syllable and reduces speech recognition performance. This paper investigates how to create acoustic models that are robust in recognizing error recovery utterances, especially syllable-stressed speech. We propose an acoustic modeling method for syllable-stressed speech that combines existing acoustic models. Our results indicate that the proposed method improves system performance. Furthermore, the method requires neither expansion of the recognition dictionary nor explicit model selection.


doi: 10.21437/Eurospeech.2001-205

Cite as: Okuda, K., Matsui, T., Nakamura, S. (2001) Towards the creation of acoustic models for stressed Japanese speech. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 1653-1656, doi: 10.21437/Eurospeech.2001-205

@inproceedings{okuda01_eurospeech,
  author={Kozo Okuda and Tomoko Matsui and Satoshi Nakamura},
  title={{Towards the creation of acoustic models for stressed Japanese speech}},
  year=2001,
  booktitle={Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages={1653--1656},
  doi={10.21437/Eurospeech.2001-205}
}