ISCA Archive SSW 1990

Automatic labeling of large prosodic databases : tools, methodology and links with a text-to-speech system

Gérard Bailly, Thierry Barbe, Hai-Dong Wang

This article presents a unified methodology for segmenting and labeling acoustic databases. The methodology is based entirely on a phonetic model, the temporal decomposition (TD) model, in which phonemes are seen as emergence functions (EFs) that overlap in time. The segmentation of an acoustic continuum and the determination of its prosodic contour are intimately linked with the detection of the EFs. Because the EFs are determined automatically, the prosodic structure of utterances is coherent across the entire corpus, so statistical methods can be applied to study the links between the formal analysis of the text and the prosodic structure of the message. Since the same methodology may be applied to the segmentation of phonetic units, synthesis by concatenation of units may be performed: the prosodic events detected in the prosodic database and in the phonetic units are entirely compatible. The tools presented below are speaker-independent and cover the entire analysis-to-synthesis process.


Cite as: Bailly, G., Barbe, T., Wang, H.-D. (1990) Automatic labeling of large prosodic databases : tools, methodology and links with a text-to-speech system. Proc. First ESCA Workshop on Speech Synthesis (SSW 1), 201-204

@inproceedings{bailly90b_ssw,
  author={Gérard Bailly and Thierry Barbe and Hai-Dong Wang},
  title={{Automatic labeling of large prosodic databases : tools, methodology and links with a text-to-speech system}},
  year=1990,
  booktitle={Proc. First ESCA Workshop on Speech Synthesis (SSW 1)},
  pages={201--204}
}