ISCA Archive ICSLP 1998

Automatic segmental and prosodic labeling of Mandarin speech database

Fu-Chiang Chou, Chiu-Yu Tseng, Lin-Shan Lee

In this paper we describe the techniques and methodology developed for automatic labeling of segmental and prosodic information in a Mandarin speech database. There are two major procedures. First, the text is converted into a phonetic network of possible pronunciations, and this network is aligned with the speech data by recognition processes. Second, a number of acoustic prosodic features are derived, and the break indices are labeled from these features using decision trees. For the segmental labeling, 96.5% of the automatically determined segment boundaries are accurate to within 20 ms. For the prosodic labeling, 84.9% of the automatically labeled break indices agree with the manually labeled ones.
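The two figures quoted above are of a kind that could be computed as in the following minimal sketch. It is not the authors' evaluation code: the function names, the one-to-one pairing of automatic and manual boundaries, the inclusive 20 ms threshold, and the sample values are all assumptions made for illustration.

```python
def boundary_accuracy(auto_ms, manual_ms, tolerance_ms=20.0):
    """Fraction of automatic boundaries within tolerance_ms of the manual ones.

    Assumes both lists hold boundary times in milliseconds and are already
    paired one-to-one (hypothetical setup, not taken from the paper).
    """
    if len(auto_ms) != len(manual_ms):
        raise ValueError("boundary lists must have the same length")
    if not auto_ms:
        raise ValueError("boundary lists must not be empty")
    # Count boundaries whose absolute deviation is at most the tolerance.
    hits = sum(1 for a, m in zip(auto_ms, manual_ms) if abs(a - m) <= tolerance_ms)
    return hits / len(auto_ms)


def label_agreement(auto_labels, manual_labels):
    """Fraction of positions where automatic and manual break indices match."""
    if len(auto_labels) != len(manual_labels):
        raise ValueError("label lists must have the same length")
    if not auto_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(1 for a, m in zip(auto_labels, manual_labels) if a == m)
    return matches / len(auto_labels)


# Made-up illustrative data, not results from the paper.
auto_boundaries = [100, 255, 410, 600]
manual_boundaries = [110, 250, 440, 595]
auto_breaks = [0, 1, 2, 3, 1]
manual_breaks = [0, 1, 2, 2, 1]

print(boundary_accuracy(auto_boundaries, manual_boundaries))  # 0.75
print(label_agreement(auto_breaks, manual_breaks))            # 0.8
```

In the sample data, three of the four boundaries deviate by 20 ms or less and four of the five break indices match, which gives the printed values.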


doi: 10.21437/ICSLP.1998-584

Cite as: Chou, F.-C., Tseng, C.-Y., Lee, L.-S. (1998) Automatic segmental and prosodic labeling of Mandarin speech database. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0266, doi: 10.21437/ICSLP.1998-584

@inproceedings{chou98_icslp,
  author={Fu-Chiang Chou and Chiu-Yu Tseng and Lin-Shan Lee},
  title={{Automatic segmental and prosodic labeling of Mandarin speech database}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0266},
  doi={10.21437/ICSLP.1998-584}
}