INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Low-Frequency Bandwidth Extension of Telephone Speech Using Sinusoidal Synthesis and Gaussian Mixture Model

Hannu Pulakka, Ulpu Remes, Santeri Yrttiaho, Kalle J. Palomäki, Mikko Kurimo, Paavo Alku

Aalto University, Finland

The limited audio bandwidth of narrowband telephone speech degrades speech quality. This paper proposes a method that extends the bandwidth of telephone speech to the frequency range 0-300 Hz. The lowest harmonics of voiced speech are generated using sinusoidal synthesis. The energy in the extension band is estimated from spectral features using a Gaussian mixture model. The amplitudes and phases of the synthesized signal are adjusted based on the amplitudes and phases of the narrowband input speech. The proposed method was evaluated in listening tests together with a bandwidth extension method for the range 4-8 kHz. The low-frequency bandwidth extension was found to reduce dissimilarity with wideband speech, but no improvement in perceived quality was achieved.
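
As a rough illustration of the sinusoidal synthesis step only, the sketch below sums sinusoids at the harmonics of a given fundamental frequency that fall below 300 Hz. It is not the authors' implementation: the function name, the 16 kHz sampling rate, and the unit amplitudes and zero phases are assumptions made for this example. In the proposed method, the extension-band energy comes from a Gaussian mixture model and the amplitudes and phases are adjusted to the narrowband input, neither of which is modeled here.

```python
import math


def synthesize_low_harmonics(f0_hz, duration_s, sample_rate_hz=16000,
                             cutoff_hz=300.0, amplitudes=None, phases=None):
    """Sum sinusoids at the harmonics of f0_hz that lie below cutoff_hz.

    Hypothetical helper for illustration; amplitudes and phases default to
    placeholder values (1.0 and 0.0) rather than estimated ones.
    """
    # Harmonic numbers k whose frequency k * f0 falls inside the extension band.
    max_k = int(cutoff_hz // f0_hz)
    harmonics = [k for k in range(1, max_k + 1) if k * f0_hz < cutoff_hz]

    if amplitudes is None:
        amplitudes = [1.0] * len(harmonics)   # placeholder, not GMM-estimated
    if phases is None:
        phases = [0.0] * len(harmonics)       # placeholder, not input-matched

    n_samples = int(round(duration_s * sample_rate_hz))
    signal = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        # One sample of the synthesized low band: sum over the harmonics.
        sample = sum(a * math.sin(2.0 * math.pi * k * f0_hz * t + p)
                     for k, a, p in zip(harmonics, amplitudes, phases))
        signal.append(sample)
    return harmonics, signal


# Example: a 100 Hz voice has harmonics at 100 Hz and 200 Hz below 300 Hz.
harmonics, low_band = synthesize_low_harmonics(100.0, 0.01)
```

In a complete system, the synthesized low band would be generated only for voiced frames and added to the narrowband signal after scaling to the estimated energy.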

Bibliographic reference. Pulakka, Hannu / Remes, Ulpu / Yrttiaho, Santeri / Palomäki, Kalle J. / Kurimo, Mikko / Alku, Paavo (2011): "Low-frequency bandwidth extension of telephone speech using sinusoidal synthesis and Gaussian mixture model", in INTERSPEECH-2011, 1181-1184.