This paper describes F0 segment selection, a novel syllable-based F0 conversion method, which provides a concatenative framework to search for F0 segments in a modest corpus of emotional speech (approximately 15 minutes of data). The method is compared with our earlier work on F0 generation using context-sensitive syllable HMMs. Both methods are complemented with a duration conversion module as well as GMM-based spectral conversion to form a unified emotion conversion framework in English. The system was evaluated using three target styles: surprise, anger and sadness. The results of an extensive perceptual test show that segment selection significantly outperforms the HMM-based method in terms of both emotion recognition rates and intonation quality ratings for surprise and anger. For conveying sadness, both methods were effective.
Bibliographic reference. Inanoglu, Zeynep / Young, Steve (2008): "Emotion conversion using F0 segment selection", In INTERSPEECH-2008, 2122-2125.