8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Use of Prosodic Features for Speech Recognition

Keikichi Hirose, Nobuaki Minematsu

University of Tokyo, Japan

Prosody is known to play an important role in the human speech perception process, and there is therefore an increasing need to use prosodic features to advance speech recognition technology. However, prosody is related to various levels of information, from linguistic and para-linguistic to non-linguistic, so its acoustic manifestation is rather complicated and shows large variations. This has prevented prosody from being incorporated into the speech recognition process. This paper discusses how prosodic features can be utilized, using our own research as examples. First, an idea of including word likelihood viewed from the accent type in the recognition process is shown. Second, a scheme of using prosody to control the pruning size in the decoding process is given. Prosodic features should be modeled rather differently from segmental features. Lastly, a new language model constructed by including prosodic events is explained.


Bibliographic reference.  Hirose, Keikichi / Minematsu, Nobuaki (2004): "Use of prosodic features for speech recognition", In INTERSPEECH-2004, 1445-1448.