ISCA Archive Interspeech 2009

Nearly perfect detection of continuous f_0 contour and frame classification for TTS synthesis

Thomas Ewender, Sarah Hoffmann, Beat Pfister

We present a new method for estimating a continuous fundamental frequency (F0) contour. The algorithm implements a global optimization and yields virtually error-free F0 contours for high-quality speech signals. These F0 contours are subsequently used to extract a continuous fundamental wave. Some local properties of this wave, together with a number of other speech features, allow the frames of a speech signal to be classified into five classes: voiced, unvoiced, mixed, irregularly glottalized, and silence. The presented F0 detection and frame classification can be applied to F0 modeling and prosodic modification of speech segments in high-quality concatenative speech synthesis.


doi: 10.21437/Interspeech.2009-23

Cite as: Ewender, T., Hoffmann, S., Pfister, B. (2009) Nearly perfect detection of continuous f_0 contour and frame classification for TTS synthesis. Proc. Interspeech 2009, 100-103, doi: 10.21437/Interspeech.2009-23

@inproceedings{ewender09_interspeech,
  author={Thomas Ewender and Sarah Hoffmann and Beat Pfister},
  title={{Nearly perfect detection of continuous f_0 contour and frame classification for TTS synthesis}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={100--103},
  doi={10.21437/Interspeech.2009-23}
}