ISCA Archive ICSLP 2000

Performance comparison among HMM, DTW, and human abilities in terms of identifying stress patterns of word utterances

Nobuaki Minematsu, Yukiko Fujisawa, Seiichi Nakagawa

We have been focusing on applying speech technologies to pronunciation learning. In our previous study [1], a stressed syllable detector was implemented using HMMs of stressed syllables and HMMs of unstressed syllables. Several systems were then built that use this detector internally [2]. However, their development did not necessarily require HMMs as the acoustic modeling method. In this paper, an HMM-based method, a DTW-based method, and a human strategy relying only on visual inspection are compared in terms of their performance in judging whether two utterances of a word have the same stress pattern, e.g. récord and recórd. Here, one utterance was produced by a Japanese learner and the other by a native speaker. Experiments showed that HMMs gave higher performance than DTW and even than the human strategy. This result strongly supports the use of HMMs as the acoustic modeling method in developing the stressed syllable detector.

[1] N. Minematsu et al., "Automatic detection of accent in English words spoken by Japanese students," Proc. EUROSPEECH’97, pp.701-704 (1997).
[2] N. Minematsu et al., "Prosodic evaluation of English words spoken by Japanese based upon estimating their pronunciation habits," Proc. ICSP, pp.439-444 (1999).


doi: 10.21437/ICSLP.2000-153

Cite as: Minematsu, N., Fujisawa, Y., Nakagawa, S. (2000) Performance comparison among HMM, DTW, and human abilities in terms of identifying stress patterns of word utterances. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 617-620, doi: 10.21437/ICSLP.2000-153

@inproceedings{minematsu00_icslp,
  author={Nobuaki Minematsu and Yukiko Fujisawa and Seiichi Nakagawa},
  title={{Performance comparison among HMM, DTW, and human abilities in terms of identifying stress patterns of word utterances}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 1, 617-620},
  doi={10.21437/ICSLP.2000-153}
}