The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis
This paper describes a method for automatically labeling prosodic information in Japanese spoken sentences, focusing on accent types and accent phrase boundaries. Both are predicted with conditional random fields (CRFs) using linguistic information and F0 contour information. For accent type prediction, we propose a method that uses a provisional accent type derived from linguistic information and accentuation rules. The actual accent type is then predicted from F0 information and linguistic information, which includes the provisional accent type as one of its features, under the condition that the speech content and accent phrase boundaries are given. Evaluation experiments show that introducing the accentuation rules improves the accuracy of accent type prediction by 6.1%, and the prediction rate reaches 59.6% on spontaneous Japanese speech data. For accent phrase boundary prediction, we propose a method that uses linguistic and prosodic probability models under the condition that the speech content and word labels are given. The prediction accuracy for accent phrase boundaries is 76.5%.
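As an illustration only, the idea of feeding the provisional accent type to the CRF alongside F0 and linguistic features might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the dictionary keys (`pos`, `mora_count`, `provisional_accent`, `f0`), the choice of mean and slope as F0 summaries, and the function names are all hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence, Union

Feature = Union[str, float]


def f0_slope(f0: Sequence[float]) -> float:
    """Least-squares slope of an F0 contour over frame index.

    Returns 0.0 when fewer than two frames are available.
    """
    n = len(f0)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(f0) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(f0))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def phrase_features(phrase: dict) -> Dict[str, Feature]:
    """Build one feature dict for a single accent phrase.

    All keys are hypothetical. `provisional_accent` stands for the accent
    type obtained from accentuation rules, passed on as a categorical feature.
    """
    f0 = phrase["f0"]
    return {
        "pos": phrase["pos"],                                    # linguistic
        "mora_count": float(phrase["mora_count"]),               # linguistic
        "provisional_accent": str(phrase["provisional_accent"]), # rule-based
        "f0_mean": sum(f0) / len(f0) if f0 else 0.0,             # prosodic
        "f0_slope": f0_slope(f0),                                # prosodic
    }


def sentence_features(phrases: List[dict]) -> List[Dict[str, Feature]]:
    """One feature dict per accent phrase, in sentence order."""
    return [phrase_features(p) for p in phrases]


example = [
    {"pos": "noun", "mora_count": 4, "provisional_accent": 0,
     "f0": [100.0, 110.0, 120.0]},
    {"pos": "verb", "mora_count": 3, "provisional_accent": 1,
     "f0": [130.0, 120.0]},
]
features = sentence_features(example)
```

A sequence of feature dicts like `features`, paired with a sequence of reference accent type labels, is the usual input shape for linear-chain CRF toolkits; the actual feature set and CRF configuration used in the paper are not specified in this abstract.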
Index Terms: prosodic labeling, accent type, accent phrase boundary, F0 pattern, conditional random fields, accentuation rule
Bibliographic reference. Yamamoto, Asami / Suzuki, Kazuhiro / Cho, Kook / Yamashita, Yoichi (2010): "Automatic prosodic labeling of accent information for Japanese spoken sentences", In SSW7-2010, 300-305.