Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

A Sinusoidal Model Based on Frequency-To-Instantaneous Frequency Mapping

Parham Zolfaghari (1), Hideki Kawahara (2)

(1) CIAIR/CREST, Itakura Laboratory, Nagoya University, Chikusa-ku, Nagoya, Japan
(2) Wakayama University/CREST/ATR Human Information Processing, Wakayama, Japan

In this paper we describe a sinusoidal analysis and synthesis framework that uses a novel method of extracting the sinusoidal components and the fundamental frequency. The method is based on a mapping from linearly spaced filter centre frequencies to the instantaneous frequencies of the filter outputs. Fixed points of this mapping in the frequency domain yield the constituent sinusoidal components of the input signal. A robust fundamental frequency extraction technique based on a wavelet representation of this model is also used. Together with a scheme for continuing sinusoidal component trajectories, these form the essential parts of the sinusoidal analysis framework. In synthesis, the inverse FFT method [1] is used to reconstruct the spectrum. The model has been shown to produce speech of high quality and is also applicable to other sound sources.
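
The idea of fixed points in the mapping from filter centre frequency to instantaneous frequency can be illustrated with a minimal sketch. It is not the authors' implementation: the Hann-windowed complex filters, the one-sample phase-difference estimate of instantaneous frequency, the amplitude floor `rel_floor`, the function name `fixed_points`, and the two-tone test signal are all assumptions chosen for illustration. A fixed point is taken to be a centre frequency at which the instantaneous frequency minus the centre frequency crosses zero from positive to negative.

```python
import numpy as np

def fixed_points(x, fs, centres, win_len=1024, rel_floor=0.1):
    """Frequencies (Hz) where a filter's centre frequency equals the
    instantaneous frequency of its output (illustrative sketch only)."""
    x = np.asarray(x, dtype=float)
    centres = np.asarray(centres, dtype=float)
    n = np.arange(win_len)
    window = np.hanning(win_len)
    # One Hann-windowed complex exponential per centre frequency (one row each).
    kernels = window * np.exp(-2j * np.pi * np.outer(centres, n) / fs)
    # Filter outputs for two frames that are one sample apart.
    y0 = kernels @ x[:win_len]
    y1 = kernels @ x[1:win_len + 1]
    # Phase advance per sample, converted to an instantaneous frequency in Hz.
    inst = np.angle(y1 * np.conj(y0)) * fs / (2 * np.pi)
    diff = inst - centres
    # Assumed amplitude floor: ignore filters that only see window sidelobes.
    mag = np.abs(y0)
    strong = mag >= rel_floor * mag.max()
    found = []
    for k in range(len(centres) - 1):
        # Downward zero crossing of (instantaneous - centre) between two filters.
        if diff[k] > 0 >= diff[k + 1] and strong[k] and strong[k + 1]:
            frac = diff[k] / (diff[k] - diff[k + 1])
            found.append(centres[k] + frac * (centres[k + 1] - centres[k]))
    return found

# Hypothetical test signal: two stationary sinusoids at 440 Hz and 1000 Hz.
fs = 8000
t = np.arange(2048) / fs
x = np.cos(2 * np.pi * 440 * t) + 0.5 * np.cos(2 * np.pi * 1000 * t)
centres = np.arange(55.0, 2000.0, 10.0)  # linearly spaced centre frequencies
peaks = fixed_points(x, fs, centres)
print(np.round(peaks, 1))
```

Near each sinusoid the filter outputs share almost the same instantaneous frequency, so the mapping is locally flat and crosses the identity line at the component frequency. The amplitude floor is a crude stand-in for a proper reliability measure and would need tuning for real speech.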


  1. Depalle, P., and Rodet, X. Synthèse additive par FFT inverse. Rapport Interne IRCAM (1990).

