5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

An Instantaneous-Frequency-Based Pitch Extraction Method for High-Quality Speech Transformation: Revised TEMPO in the STRAIGHT-Suite

Hideki Kawahara (1), Alain de Cheveigne (2), Roy D. Patterson (3)

(1) Wakayama University/ATR/CREST, Japan
(2) Paris 7 University/CNRS, France
(3) CNBH University of Cambridge, UK

A new source information extraction algorithm is proposed to provide a reliable source signal for STRAIGHT-suite (Speech Transformation and Representation based on Adaptive Interpolation of weiGHTed spectrogram), a very high-quality speech analysis, modification, and transformation system. The proposed method uses the instantaneous frequencies of harmonic components according to their reliability. Performance is evaluated using a simultaneous electroglottograph (EGG) recording as the reference signal. The error variance for F0 extraction with the proposed algorithm is shown to be about one third of that of the previous F0 extraction method used in STRAIGHT-suite, even though the previous algorithm is itself competitive with conventional F0 extraction methods.
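
As a rough illustration of the quantity the abstract refers to, the following sketch estimates the instantaneous frequency of a signal from the phase derivative of its analytic signal. It is not the revised TEMPO algorithm: the function names, the sampling rate, and the 120 Hz test tone are assumptions made for this example, and real speech would first have to be filtered to isolate individual harmonic components.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real sequence, built by zeroing negative frequencies."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep DC once
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist bin once
        gain[1:n // 2] = 2.0           # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

def instantaneous_frequency(x, fs):
    """Instantaneous frequency in Hz from the unwrapped phase derivative."""
    phase = np.unwrap(np.angle(analytic_signal(x)))
    return np.diff(phase) * fs / (2.0 * np.pi)

# Hypothetical test signal: a single sinusoid standing in for one harmonic.
fs = 8000.0                            # assumed sampling rate in Hz
t = np.arange(int(0.5 * fs)) / fs
true_f0 = 120.0                        # assumed fundamental frequency in Hz
x = np.sin(2.0 * np.pi * true_f0 * t)

inst_f = instantaneous_frequency(x, fs)
# Discard the edges and take the median as a simple F0 estimate.
f0_estimate = float(np.median(inst_f[200:-200]))
print(f"estimated F0: {f0_estimate:.2f} Hz")
```

For a clean sinusoid the estimate should sit very close to the true frequency. How the method in the paper weighs components by reliability is not specified in this abstract and is not modelled here.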

Bibliographic reference. Kawahara, Hideki / Cheveigne, Alain de / Patterson, Roy D. (1998): "An instantaneous-frequency-based pitch extraction method for high-quality speech transformation: revised TEMPO in the STRAIGHT-suite", in ICSLP-1998, paper 0659.