8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Estimating Syntactic Structure from Prosodic Features in Japanese Speech

Tomoko Ohsuga, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa

Chiba University, Japan

In this study, we introduce a method of estimating the syntactic tree structure of Japanese speech on the basis of the F0 contour and time duration. The method estimates the syntactic structure, including the following phrase, by using local prosodic features of the first and final parts of the leading phrase. It relies on discriminant analysis, a statistical method based on a large amount of training data. We applied the method to the ATR 503 speech database and performed discrimination experiments. The results indicated an estimation accuracy of 84% for the branching judgment of each sequence of three leaves. In addition, the discrimination accuracy saturated when using only the features up to the head part of the second phrase. We consider this result to be fairly good for the difficult task of estimating a syntactic structure that includes a future part using only local prosodic features from the past, and we also consider prosodic information to be very effective in real-time communication with speech.
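
As a rough illustration of the kind of discriminant analysis the abstract mentions, the following minimal sketch fits a two-class Fisher linear discriminant to two prosodic features. It is not the authors' implementation. The feature names (an F0 change and a duration measure), the class labels "left" and "right" for the branching judgment, the equal-prior threshold, and all data values are assumptions made for illustration; the data are synthetic and do not come from the ATR 503 database.

```python
import random

def mean_vec(rows):
    """Mean of a list of 2-dimensional feature vectors."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]

def scatter(rows, mu):
    """2x2 scatter matrix of the rows around their mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_fisher(class_a, class_b):
    """Fisher discriminant: w = Sw^-1 (mu_a - mu_b), with the threshold
    placed at the projected midpoint of the class means (equal priors
    assumed)."""
    mu_a, mu_b = mean_vec(class_a), mean_vec(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(mu_a[0] + mu_b[0]) / 2, (mu_a[1] + mu_b[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold

def predict(w, threshold, x):
    """Return 'left' if x projects above the threshold, else 'right'."""
    score = w[0] * x[0] + w[1] * x[1]
    return "left" if score > threshold else "right"

def make_samples(rng, center, n):
    """Synthetic (f0_change, duration) pairs; hypothetical features."""
    return [[rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)]
            for _ in range(n)]

def accuracy(w, threshold, left, right):
    """Fraction of samples assigned to their own class."""
    hits = sum(predict(w, threshold, x) == "left" for x in left)
    hits += sum(predict(w, threshold, x) == "right" for x in right)
    return hits / (len(left) + len(right))

rng = random.Random(0)
left_samples = make_samples(rng, (3.0, 3.0), 200)   # synthetic class 1
right_samples = make_samples(rng, (0.0, 0.0), 200)  # synthetic class 2
w, threshold = fit_fisher(left_samples, right_samples)
print("training accuracy:", accuracy(w, threshold, left_samples, right_samples))
```

The accuracy printed here reflects only how well separated the synthetic clusters are and has no bearing on the 84% figure reported above.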

Bibliographic reference.  Ohsuga, Tomoko / Nishida, Masafumi / Horiuchi, Yasuo / Ichikawa, Akira (2004): "Estimating syntactic structure from prosodic features in Japanese speech", In INTERSPEECH-2004, 3041-3044.