Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Application of Auditory Image Model for Speech Event Detection

Minoru Tsuzaki (1), Satomi Tanaka (1), Hiroaki Kato (2), Yoshinori Sagisaka (3)

(1) Kyoto City University of Arts, Japan; (2) ATR-HIS, Japan; (3) Waseda University, Tokyo, Japan

To provide an appropriate model for the perception of temporal structures in speech, we applied a comprehensive computational model of the human auditory periphery to detect changes in speech signals that potentially indicate the arrival of new events. In each tonotopic sub-band, an increase in the activation level was counted as evidence for the plausibility of a new event, while a decrease was ignored. The total contour obtained by integrating the sub-band information exhibited sharper peaks and dips than the loudness contour. A quantitative evaluation on estimating the speaking rate of natural speech also demonstrated that the event-plausibility model performs better than the loudness model.
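The sub-band rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the auditory model output is already available as a list of frames, each holding one activation level per sub-band, and it assumes that "integrating" means a plain sum across sub-bands. The function names `event_plausibility` and `pick_events`, the threshold, and the local-maximum peak picking are hypothetical choices made only for illustration.

```python
def event_plausibility(activation):
    """Return one plausibility value per frame.

    activation: list of frames; each frame is a list of activation
    levels, one per tonotopic sub-band (assumed input format).
    The first frame has no predecessor, so its value is 0.0.
    """
    if not activation:
        return []
    contour = [0.0]
    for prev, cur in zip(activation, activation[1:]):
        # Keep only increases within each sub-band; decreases are ignored.
        rises = [max(c - p, 0.0) for p, c in zip(prev, cur)]
        # Assumption: integrate across sub-bands with a simple sum.
        contour.append(sum(rises))
    return contour


def pick_events(contour, threshold):
    """Return indices of local maxima of the contour above threshold.

    Peak picking is an illustrative assumption, not taken from the paper.
    """
    events = []
    for i in range(1, len(contour) - 1):
        is_peak = contour[i] > contour[i - 1] and contour[i] >= contour[i + 1]
        if is_peak and contour[i] > threshold:
            events.append(i)
    return events


# Toy input: 5 frames, 2 sub-bands.
frames = [[0, 0], [1, 0], [1, 2], [0, 2], [0, 0]]
contour = event_plausibility(frames)
events = pick_events(contour, threshold=0.5)
```

In this toy input the offsets in frames 3 and 4 contribute nothing to the contour, which is the asymmetry the abstract describes: only onsets within a sub-band raise the plausibility of a new event.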


Bibliographic reference. Tsuzaki, Minoru / Tanaka, Satomi / Kato, Hiroaki / Sagisaka, Yoshinori (2005): "Application of auditory image model for speech event detection", in INTERSPEECH-2005, 677-680.