5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Representing Prosodic Words Using Statistical Models of Moraic Transition of Fundamental Frequency Contours of Japanese

Koji Iwano, Keikichi Hirose

Department of Information and Communication Engineering, School of Engineering, University of Tokyo, Japan

We previously proposed a statistical model of moraic transitions of fundamental frequency (F0) contours and showed its effectiveness for prosodic boundary detection and accent type recognition. That model represented the F0 contours of prosodic words in order to detect prosodic word boundaries and recognize accent types simultaneously. This paper proposes a method in which prosodic word F0 contours are modeled separately according to their accent types and the presence or absence of a succeeding pause. An utterance is regarded as a sequence of prosodic words under a simple grammar. Each moraic F0 contour is represented by a pair of codes: the original shape code and a newly introduced delta code, which represents the degree of F0 change between the mora in question and its preceding mora. Compared with our earlier results, the boundary detection rate improved from 87.7% to 91.5%. The accent type recognition rate reached 76.0% (type 1 accent discrimination).
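As an illustration of the paired-code representation described above, the following is a minimal sketch, not the authors' implementation. The abstract does not specify how either code is computed, so everything concrete here is an assumption made for illustration: the semitone scale, the use of endpoint difference for the shape code, the use of mean F0 difference for the delta code, the labels "rise", "fall", and "flat", and the threshold values in DELTA_EDGES and FLAT_TOL.

```python
from bisect import bisect
from math import log2
from statistics import mean

# Hypothetical bin edges (semitones) for quantizing the F0 change between morae.
DELTA_EDGES = (-2.0, -0.5, 0.5, 2.0)
# Hypothetical tolerance (semitones) below which a mora counts as flat.
FLAT_TOL = 0.5


def to_semitones(f0_hz, ref_hz=100.0):
    """Convert F0 samples in Hz to semitones relative to ref_hz."""
    return [12.0 * log2(f / ref_hz) for f in f0_hz]


def shape_code(st):
    """Label the within-mora movement from its first to its last sample."""
    change = st[-1] - st[0]
    if change > FLAT_TOL:
        return "rise"
    if change < -FLAT_TOL:
        return "fall"
    return "flat"


def delta_code(prev_st, st):
    """Quantize the change in mean F0 from the preceding mora (None if there is none)."""
    if prev_st is None:
        return None
    return bisect(DELTA_EDGES, mean(st) - mean(prev_st))


def encode(morae_hz):
    """Map a list of per-mora F0 tracks (Hz) to (shape, delta) code pairs."""
    codes, prev = [], None
    for f0 in morae_hz:
        st = to_semitones(f0)
        codes.append((shape_code(st), delta_code(prev, st)))
        prev = st
    return codes


# Three toy morae: level, rising after an upward jump, then falling.
print(encode([[100, 100], [120, 125], [110, 100]]))
```

In this sketch the delta code is an integer bin index from 0 (large fall) to 4 (large rise), and the first mora of a sequence has no delta code because it has no preceding mora. A statistical model over such code pairs could then be trained per accent type, as the abstract outlines.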

