7th International Conference on Spoken Language Processing
September 16-20, 2002
Traditionally, corpus-based text-to-speech systems generate the speech signal in a two-stage process. First, the target prosody is determined; then, a set of speech units that minimizes a cost function is selected. Once the target prosody has been fixed, no alternative prosodic information is generally considered, even when appropriate speech units cannot be found. In this paper we propose an alternative technique that takes several possible intonation contours into account and selects the one that minimizes the cost function. In this method, both the candidate pitch contours and the candidate speech units are obtained by means of a unit selection process.
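The joint selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cost functions, the `Unit` representation, and all function names are hypothetical, and the candidate contours are taken as given rather than being produced by a unit selection process of their own. For each candidate contour, a Viterbi search finds the cheapest unit sequence, and the contour with the lowest total cost is kept.

```python
from typing import List, Sequence, Tuple

# Hypothetical unit representation: (unit id, mean F0 in Hz).
Unit = Tuple[str, float]


def target_cost(unit: Unit, target_f0: float) -> float:
    """Assumed target cost: absolute F0 mismatch with the contour."""
    return abs(unit[1] - target_f0)


def join_cost(prev: Unit, cur: Unit) -> float:
    """Assumed concatenation cost: weighted F0 jump between units."""
    return 0.5 * abs(prev[1] - cur[1])


def best_units(
    contour: Sequence[float], candidates: Sequence[Sequence[Unit]]
) -> Tuple[float, List[Unit]]:
    """Viterbi search for the cheapest unit sequence under one contour."""
    # Each entry is (accumulated cost, path) ending at one candidate unit.
    best = [(target_cost(u, contour[0]), [u]) for u in candidates[0]]
    for pos in range(1, len(contour)):
        new_best = []
        for u in candidates[pos]:
            # Cheapest way to reach u from any unit at the previous position.
            cost, path = min(
                ((c + join_cost(p[-1], u), p) for c, p in best),
                key=lambda item: item[0],
            )
            new_best.append((cost + target_cost(u, contour[pos]), path + [u]))
        best = new_best
    return min(best, key=lambda item: item[0])


def select_contour_and_units(
    contours: Sequence[Sequence[float]], candidates: Sequence[Sequence[Unit]]
) -> Tuple[int, float, List[Unit]]:
    """Return (contour index, total cost, units) with the lowest cost."""
    results = [(best_units(c, candidates), i) for i, c in enumerate(contours)]
    (cost, units), idx = min(results, key=lambda r: r[0][0])
    return idx, cost, units


# Toy data: two unit positions, two candidate units each, two contours.
candidates = [
    [("a1", 100.0), ("a2", 120.0)],
    [("b1", 110.0), ("b2", 150.0)],
]
contours = [[200.0, 200.0], [100.0, 110.0]]
idx, cost, units = select_contour_and_units(contours, candidates)
```

With this toy inventory, the first contour has no well-matching units, so the second contour is preferred. A conventional two-stage system that had committed to the first contour would be forced to use poorly matching units.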
Bibliographic reference. Campillo-Díaz, Francisco / Banga, Eduardo R. (2002): "Combined prosody and candidate unit selections for corpus-based text-to-speech systems", In ICSLP-2002, 141-144.