9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

On the Generation of Synthetic Disfluent Speech: Local Prosodic Modifications Caused by the Insertion of Editing Terms

Jordi Adell (1), Antonio Bonafonte (1), David Escudero-Mancebo (2)

(1) Universitat Politècnica de Catalunya, Spain
(2) Universidad de Valladolid, Spain

Disfluent speech synthesis is necessary in some applications, such as automatic film dubbing or spoken translation. This paper presents a model for generating synthetic disfluent speech by inserting each element of a disfluency into a context where it can be considered fluent. The prosody obtained by applying standard techniques to these new sentences is then used to synthesize the disfluent sentence. In addition, local modifications are applied to the segmental units adjacent to the disfluency elements. Experiments show that duration follows this behavior, which supports the feasibility of the model.

Bibliographic reference. Adell, Jordi / Bonafonte, Antonio / Escudero-Mancebo, David (2008): "On the generation of synthetic disfluent speech: local prosodic modifications caused by the insertion of editing terms", in INTERSPEECH-2008, 2278-2281.