5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Automatic Segmental and Prosodic Labeling of Mandarin Speech Database

Fu-Chiang Chou (1), Chiu-Yu Tseng (2), Lin-Shan Lee (1)

(1) Dept. of Electrical Engineering, National Taiwan University, Taiwan
(2) Institute of Linguistics, Preparatory Office, Academia Sinica, Taiwan

In this paper we describe the techniques and methodology developed for automatic labeling of segmental and prosodic information in a Mandarin speech database. There are two major procedures. First, the text is converted into a phonetic network of possible pronunciations, and this network is aligned with the speech data by recognition processes. Second, a number of acoustic prosodic features are derived, and the break indices are labeled from these features using decision trees. For the segmental labeling, 96.5% of the automatically determined segment boundaries are accurate to within 20 ms. For the prosodic labeling, 84.9% of the automatically labeled break indices agree with the manually labeled ones.
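The two figures reported above are agreement rates between automatic and reference labels. The following is a minimal sketch of how such rates could be computed, not the authors' evaluation code. It assumes that automatic and reference boundaries are already paired one-to-one, that times are given in milliseconds, and that "within 20 ms" includes a deviation of exactly 20 ms. The function names `boundary_accuracy` and `label_agreement` and the toy data are hypothetical.

```python
from typing import Sequence


def boundary_accuracy(auto_ms: Sequence[float],
                      ref_ms: Sequence[float],
                      tol_ms: float = 20.0) -> float:
    """Fraction of automatic boundaries lying within tol_ms of the reference.

    Assumption: auto_ms[i] and ref_ms[i] refer to the same segment boundary.
    """
    if len(auto_ms) != len(ref_ms):
        raise ValueError("boundary lists must be paired one-to-one")
    if not ref_ms:
        raise ValueError("no boundaries to evaluate")
    hits = sum(1 for a, r in zip(auto_ms, ref_ms) if abs(a - r) <= tol_ms)
    return hits / len(ref_ms)


def label_agreement(auto_labels: Sequence[int],
                    manual_labels: Sequence[int]) -> float:
    """Fraction of positions where the automatic break index equals the manual one."""
    if len(auto_labels) != len(manual_labels):
        raise ValueError("label lists must be paired one-to-one")
    if not manual_labels:
        raise ValueError("no labels to evaluate")
    same = sum(1 for a, m in zip(auto_labels, manual_labels) if a == m)
    return same / len(manual_labels)


if __name__ == "__main__":
    # Toy data for illustration only; these are not values from the paper.
    auto_bounds = [100, 250, 400, 610]
    ref_bounds = [105, 245, 430, 600]
    print(boundary_accuracy(auto_bounds, ref_bounds))

    auto_breaks = [1, 2, 3, 1, 4]
    manual_breaks = [1, 2, 2, 1, 4]
    print(label_agreement(auto_breaks, manual_breaks))
```

In the toy data, three of the four boundaries deviate by 20 ms or less, and four of the five break indices match. A real evaluation would also need a rule for pairing boundaries when the automatic and manual segment sequences differ, which this sketch does not address.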

