Disfluent speech synthesis is necessary in some applications, such as automatic film dubbing or spoken translation. This paper presents a model for the generation of synthetic disfluent speech based on inserting each element of a disfluency into a context where it can be considered fluent. The prosody obtained by applying standard techniques to these new sentences is then used for the synthesis of the disfluent sentence. In addition, local modifications are applied to the segmental units adjacent to disfluency elements. Experiments show that duration follows this behavior, which supports the feasibility of the model.
Bibliographic reference. Adell, Jordi / Bonafonte, Antonio / Escudero-Mancebo, David (2008): "On the generation of synthetic disfluent speech: local prosodic modifications caused by the insertion of editing terms", In INTERSPEECH-2008, 2278-2281.