Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Pitch-Synchronous Time-Scaling for Prosodic and Voice Quality Transformations

João P. Cabral, Luís C. Oliveira

INESC-ID/IST, Lisbon, Portugal

Current time-domain pitch modification techniques have well-known limitations for large variations of the original fundamental frequency. This paper proposes a technique for changing the pitch and duration of a speech signal based on time-scaling the linear prediction (LP) residual. The resulting speech signal achieves better quality than the traditional LP-PSOLA method for large fundamental frequency modifications. By using non-uniform time-scaling, this technique can also change the shape of the LP residual within each pitch period. In this way we can simulate changes in the most relevant glottal source parameters, such as the open quotient, the spectral tilt and the asymmetry coefficient. Careful adjustment of these source parameters allows the original speech signal to be transformed so that it is perceived as if it were uttered with a different voice quality or emotion.
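
The two kinds of time-scaling mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of linear interpolation, and the single boundary index `split` (standing in for something like the end of the open phase) are all assumptions made for illustration. It operates on one pitch period of an LP residual that is assumed to be already extracted, and leaves out LP analysis, pitch marking, energy compensation and resynthesis.

```python
import numpy as np

def time_scale_period(period, new_len):
    """Uniform time-scaling: resample one residual period to new_len samples.

    Hypothetical helper; linear interpolation is an assumed choice.
    """
    period = np.asarray(period, dtype=float)
    # Positions in the source period that each output sample reads from.
    src = np.linspace(0.0, len(period) - 1, num=new_len)
    return np.interp(src, np.arange(len(period)), period)

def warp_period(period, split, new_split):
    """Non-uniform time-scaling: move the sample at index `split` to
    index `new_split` while keeping the period length unchanged.

    The segment before the boundary is stretched (or compressed) and the
    segment after it is compressed (or stretched) to compensate.
    `split` is an assumed stand-in for a glottal phase boundary.
    """
    period = np.asarray(period, dtype=float)
    n = len(period)
    # Piecewise-linear map from output index to source position.
    src = np.interp(np.arange(n), [0, new_split, n - 1], [0, split, n - 1])
    return np.interp(src, np.arange(n), period)
```

Under these assumptions, shortening a period with `time_scale_period` raises the local fundamental frequency, while `warp_period` changes the waveform shape inside the period without changing its length, which is the kind of operation that could alter a parameter such as the open quotient.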

Bibliographic reference.  Cabral, João P. / Oliveira, Luís C. (2005): "Pitch-synchronous time-scaling for prosodic and voice quality transformations", In INTERSPEECH-2005, 1137-1140.