ISCA Archive SSW 2007

Clustering algorithm for F0 curves based on hidden Markov models

Damien Lolive, Nelly Barbot, Olivier Boeffard

This article describes a new unsupervised methodology for learning F0 classes using HMMs on a syllable basis. An F0 class is represented by an HMM with three emitting states. The unsupervised clustering algorithm relies on an iterative process of Gaussian splitting and EM retraining. First, a single class is learnt on a training corpus (8000 syllables); it is then divided by perturbing the Gaussian means at successive levels. At each step, the mean RMS error is evaluated on a validation corpus (3000 syllables). The algorithm stops automatically when the error becomes stable or increases. The syllabic structure of a sentence is the reference level chosen for F0 modelling, although the methodology can be applied to other structures. Clustering quality is evaluated by cross-validation, using the mean of the RMS errors between the F0 contours of a test corpus and the estimated HMM trajectories. The results indicate good class quality, with a mean RMS error of around 4 Hz.
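The split-and-retrain loop with its validation-based stopping rule can be illustrated with a much simplified sketch. This is not the authors' implementation. It assumes each syllable's F0 contour has been resampled to a fixed number of points, represents each class by a plain mean contour instead of a three-state HMM, and replaces EM retraining with hard-assignment re-estimation. The function names and the parameters `eps`, `tol`, and `max_classes` are hypothetical.

```python
import math

def rms_error(contour, template):
    """RMS error (Hz) between two equal-length F0 contours."""
    return math.sqrt(sum((c - t) ** 2 for c, t in zip(contour, template)) / len(contour))

def nearest(contour, templates):
    """Index of the class template with the lowest RMS error to the contour."""
    return min(range(len(templates)), key=lambda k: rms_error(contour, templates[k]))

def mean_rms(contours, templates):
    """Mean, over contours, of the RMS error to the best-matching template."""
    return sum(rms_error(c, templates[nearest(c, templates)]) for c in contours) / len(contours)

def retrain(contours, templates, n_iter=10):
    """Hard-assignment re-estimation (assumed stand-in for EM retraining)."""
    templates = [list(t) for t in templates]
    for _ in range(n_iter):
        groups = [[] for _ in templates]
        for c in contours:
            groups[nearest(c, templates)].append(c)
        for k, members in enumerate(groups):
            if members:  # a class that received no contour keeps its old template
                templates[k] = [sum(col) / len(members) for col in zip(*members)]
    return templates

def split_and_cluster(train, valid, eps=1.0, tol=1e-3, max_classes=64):
    """Start from one class, double the class count by perturbing the means,
    retrain, and stop once the validation error no longer improves by more than tol."""
    templates = [[sum(col) / len(train) for col in zip(*train)]]
    best_err = mean_rms(valid, templates)
    while 2 * len(templates) <= max_classes:
        # Assumed perturbation: shift every mean by +eps and by -eps.
        candidate = ([[v + eps for v in t] for t in templates]
                     + [[v - eps for v in t] for t in templates])
        candidate = retrain(train, candidate)
        err = mean_rms(valid, candidate)
        if best_err - err <= tol:  # error is stable or has increased: stop
            break
        templates, best_err = candidate, err
    return templates, best_err
```

On a toy corpus of flat contours at 100 Hz and 200 Hz, `split_and_cluster(train, valid)` keeps the first split into two classes and rejects the second, since the validation error has stopped decreasing.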


Cite as: Lolive, D., Barbot, N., Boeffard, O. (2007) Clustering algorithm for F0 curves based on hidden Markov models. Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6), 85-89

@inproceedings{lolive07_ssw,
  author={Damien Lolive and Nelly Barbot and Olivier Boeffard},
  title={{Clustering algorithm for F0 curves based on hidden Markov models}},
  year=2007,
  booktitle={Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6)},
  pages={85--89}
}