ISCA Archive ICSLP 1994

A computational model of prosody perception

Neil P. McAngus Todd, Guy J. Brown

This paper describes a computational model of auditory rhythm perception, and demonstrates its application to the extraction of prosodic information from spoken language. The model consists of three stages. In the first stage, the speech waveform is processed by a simulation of the auditory periphery. Secondly, the output of the auditory periphery is processed by a multiscale filtering mechanism, analogous to a short-term auditory memory. Finally, peaks in the response of the multiscale mechanism are accumulated in a long-term auditory store, and plotted to give a representation referred to as a rhythmogram. It is demonstrated that there is a close relationship between the rhythmogram of an utterance and its corresponding stress hierarchy derived by phonological analysis.
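The three stages described above can be illustrated with a minimal sketch. This is not the authors' implementation: the periphery is replaced by a crude half-wave rectification and square-root compression, the multiscale mechanism is assumed to be a bank of Gaussian low-pass filters, and the long-term store is reduced to a table of peak positions per filter scale. All function names, the `rel_threshold` parameter, and the synthetic test signal are hypothetical choices made for illustration.

```python
import numpy as np

def periphery(signal):
    """Crude stand-in (assumption) for the auditory periphery:
    half-wave rectification followed by square-root compression."""
    return np.sqrt(np.maximum(signal, 0.0))

def gaussian_smooth(x, sigma):
    """Low-pass filter x with a unit-area Gaussian of width sigma (samples)."""
    # Truncate at 4 sigma, but never let the kernel outgrow the signal.
    half = min(int(4 * sigma), (len(x) - 1) // 2)
    t = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    kernel /= kernel.sum()
    return np.convolve(x, kernel, mode="same")

def local_peaks(y, rel_threshold=0.1):
    """Indices of local maxima of y that exceed rel_threshold * max(y)."""
    interior = y[1:-1]
    is_peak = (
        (interior > y[:-2])
        & (interior >= y[2:])
        & (interior > rel_threshold * y.max())
    )
    return np.flatnonzero(is_peak) + 1

def rhythmogram(signal, sigmas):
    """Map each filter scale to the sample indices of its response peaks."""
    envelope = periphery(signal)
    return {s: local_peaks(gaussian_smooth(envelope, s)) for s in sigmas}

# Synthetic input (assumption): two short tone bursts 0.4 s apart.
fs = 1000                                  # sampling rate in Hz
time = np.arange(4 * fs) / fs              # 4 seconds of samples
bursts = (np.exp(-0.5 * ((time - 1.8) / 0.05) ** 2)
          + np.exp(-0.5 * ((time - 2.2) / 0.05) ** 2))
signal = bursts * np.sin(2 * np.pi * 100 * time)

# One fine and one coarse scale, in samples (20 ms and 300 ms here).
peaks = rhythmogram(signal, sigmas=[20, 300])
for sigma, idx in peaks.items():
    print(f"sigma={sigma:4d} samples -> peaks at {idx / fs} s")
```

Plotting peak time against filter scale for many scales would give a picture in the spirit of the rhythmogram: in this toy signal the fine scale resolves the two bursts separately, while the coarse scale groups them into a single event, which loosely mirrors the idea of a hierarchy of stresses.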

Cite as: Todd, N.P.M., Brown, G.J. (1994) A computational model of prosody perception. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 127-130
