ISCA Archive Interspeech 2009

A close look into the probabilistic concatenation model for corpus-based speech synthesis

Shinsuke Sakai, Ranniery Maia, Hisashi Kawai, Satoshi Nakamura

We have proposed a novel probabilistic approach to concatenation modeling for corpus-based speech synthesis, in which the goodness of concatenation for a unit is modeled using a conditional Gaussian probability density whose mean is defined as a linear transform of the feature vector from the previous unit. This approach has shown its effectiveness in a subjective listening test. In this paper, we further investigate the characteristics of the proposed method through an objective evaluation and by observing the sequence of concatenation scores across an utterance. We also present the mathematical relationships between the proposed method and other approaches, and show that it has flexible modeling power, with other concatenation scoring methods arising as special cases.
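As a rough illustration of the scoring idea described in the abstract, the sketch below evaluates the log of a conditional Gaussian density whose mean is a linear transform of the previous unit's feature vector. It is not the authors' implementation: the function name `concat_log_score`, the argument names, and the use of a diagonal covariance (`var`) are assumptions made here for brevity, and any offset term in the mean is omitted.

```python
import math

def concat_log_score(x_prev, x_cur, A, var):
    """Log of N(x_cur; A @ x_prev, diag(var)).

    x_prev: feature vector from the previous unit (list of floats)
    x_cur:  feature vector from the current unit (list of floats)
    A:      linear transform as a list of rows (hypothetical parameter)
    var:    per-dimension variances (diagonal covariance is an assumption)
    """
    # Predicted mean for the current unit: linear transform of x_prev.
    mean = [sum(a * x for a, x in zip(row, x_prev)) for row in A]
    score = 0.0
    for x, m, v in zip(x_cur, mean, var):
        # Per-dimension Gaussian log density, summed over dimensions.
        score += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return score

# Usage: a higher score means the current unit follows the previous one more plausibly.
identity = [[1.0, 0.0], [0.0, 1.0]]
good_join = concat_log_score([1.0, 2.0], [1.0, 2.0], identity, [1.0, 1.0])
poor_join = concat_log_score([1.0, 2.0], [3.0, 0.0], identity, [1.0, 1.0])
```

With `A` set to the identity and unit variances, the score equals a constant minus half the squared Euclidean distance between the two feature vectors. This is one way a distance-based join cost can fall out of such a model; the abstract does not say which special cases the paper itself covers.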


doi: 10.21437/Interspeech.2009-254

Cite as: Sakai, S., Maia, R., Kawai, H., Nakamura, S. (2009) A close look into the probabilistic concatenation model for corpus-based speech synthesis. Proc. Interspeech 2009, 752-755, doi: 10.21437/Interspeech.2009-254

@inproceedings{sakai09_interspeech,
  author={Shinsuke Sakai and Ranniery Maia and Hisashi Kawai and Satoshi Nakamura},
  title={{A close look into the probabilistic concatenation model for corpus-based speech synthesis}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={752--755},
  doi={10.21437/Interspeech.2009-254}
}