Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Nearly Defect-Free F0 Trajectory Extraction for Expressive Speech Modifications Based on STRAIGHT

Hideki Kawahara, Alain de Cheveigné, Hideki Banno, Toru Takahashi, Toshio Irino

Wakayama University, Japan

A new method for source information extraction is proposed. It aims to provide optimal source information for STRAIGHT, a very high-quality speech manipulation system. The method combines time-interval and frequency cues and provides fundamental frequency (F0) and periodicity information within each frequency band, which allows mixed-mode excitation. A preliminary evaluation using a database of simultaneously recorded electroglottograph (EGG) and speech signals yielded very low gross error rates (0.029% for females and 0.14% for males). In addition, the method is designed to minimize the perceptual disturbance caused by any remaining gross errors in source information extraction.
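To illustrate how a gross error rate of this kind is typically computed, the following is a minimal sketch rather than the authors' evaluation procedure. The abstract does not state the error criterion, so the 20% relative-deviation threshold, the convention that a reference value of 0 marks an unvoiced frame, and the function name gross_error_rate are all assumptions made for illustration. The sample values are invented and are not data from the paper.

```python
def gross_error_rate(f0_est, f0_ref, tolerance=0.2):
    """Return the percentage of voiced reference frames with a gross F0 error.

    Assumptions (not taken from the paper): a frame is voiced when the
    reference F0 is positive, and an estimate counts as a gross error when
    it deviates from the reference by more than `tolerance` (relative).
    """
    if len(f0_est) != len(f0_ref):
        raise ValueError("trajectories must have the same number of frames")

    # Keep only frames that the reference (e.g. EGG-derived) marks as voiced.
    voiced = [(est, ref) for est, ref in zip(f0_est, f0_ref) if ref > 0]
    if not voiced:
        return 0.0

    errors = sum(1 for est, ref in voiced if abs(est - ref) > tolerance * ref)
    return 100.0 * errors / len(voiced)


# Invented example: frame-wise F0 in Hz, with 0.0 marking an unvoiced frame.
f0_ref = [0.0, 200.0, 202.0, 198.0, 201.0]
f0_est = [0.0, 201.0, 101.0, 197.0, 203.0]  # third frame is an octave error

rate = gross_error_rate(f0_est, f0_ref)
print(f"gross error rate: {rate:.1f}%")
```

Here one of the four voiced frames (the octave error) exceeds the assumed threshold, so the sketch reports 25.0%. A different threshold or voicing convention would change the figure, which is why reported rates are only comparable under the same criterion.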

Bibliographic reference. Kawahara, Hideki / Cheveigné, Alain de / Banno, Hideki / Takahashi, Toru / Irino, Toshio (2005): "Nearly defect-free F0 trajectory extraction for expressive speech modifications based on STRAIGHT", in INTERSPEECH-2005, 537-540.