ISCA Archive SSW 2007

Statistical analysis of filled pauses' rhythm for disfluent speech synthesis

Jordi Adell, Antonio Bonafonte, David Escudero

Given that state-of-the-art speech synthesis systems have already reached a high level of naturalness, it is time to move from the current read-speech framework to talking speech. To this end, it is necessary to investigate how disfluencies can be included in speech synthesis and can even increase its naturalness. This paper builds on previously presented work and focuses on finding a local model of filled-pause rhythm. A statistical study of rhythm effects around filled pauses is presented and, based on the correlation between rhythm variables, a regression model is proposed to predict filled-pause duration and prepausal lengthening.


Cite as: Adell, J., Bonafonte, A., Escudero, D. (2007) Statistical analysis of filled pauses' rhythm for disfluent speech synthesis. Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6), 223-227

@inproceedings{adell07_ssw,
  author={Jordi Adell and Antonio Bonafonte and David Escudero},
  title={{Statistical analysis of filled pauses' rhythm for disfluent speech synthesis}},
  year=2007,
  booktitle={Proc. 6th ISCA Workshop on Speech Synthesis (SSW 6)},
  pages={223--227}
}