9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Spectral Envelope Recovery Beyond the Nyquist Limit for High-Quality Manipulation of Speech Sounds

Hideki Kawahara (1), Masanori Morise (2), Hideki Banno (3), Toru Takahashi (4), Ryuichi Nisimura (1), Toshio Irino (1)

(1) Wakayama University, Japan; (2) Kwansei Gakuin University, Japan; (3) Meijo University, Japan; (4) Kyoto University, Japan

A simple new method to recover details in a spectral envelope is proposed, based on a recently introduced speech analysis, modification and resynthesis framework called TANDEM-STRAIGHT. Spectral envelope recovery of voiced sounds is a discrete-to-analog conversion in the frequency domain. However, there is a fundamental problem: the spatial-frequency content of vocal tract functions generally exceeds the Nyquist limit of the equivalent sampling rate determined by the fundamental frequency (F0). TANDEM-STRAIGHT yields a method to recover a spectral envelope based on consistent sampling theory and provides base information for exceeding this limit. At the final stage, the AR spectral envelope estimated from the TANDEM-STRAIGHT spectrum is divided by the F0-adaptively smoothed version of itself to supply the missing high-spatial-frequency details of the envelope. The underlying principle of the proposed method can also be applied to other speech synthesis frameworks.
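The final-stage division described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the smoother, so a rectangular moving average roughly one F0 wide is assumed here, and the function names (`f0_adaptive_smooth`, `detail_ratio`) and the toy two-pole envelope are hypothetical. The resulting ratio is close to 1 where the envelope varies slowly relative to F0 and departs from 1 where it contains detail finer than the harmonic spacing.

```python
import numpy as np

def f0_adaptive_smooth(power_spec, f0_hz, bin_hz):
    """Moving average along frequency with a window about one F0 wide.

    Assumption: a rectangular window stands in for the F0-adaptive
    smoother, which the abstract does not describe in detail.
    """
    width = max(1, int(round(f0_hz / bin_hz)))
    if width % 2 == 0:
        width += 1  # odd width keeps the window centred on each bin
    pad = width // 2
    padded = np.pad(power_spec, pad, mode="edge")  # hold edge values
    kernel = np.ones(width) / width
    return np.convolve(padded, kernel, mode="valid")  # same length as input

def detail_ratio(ar_envelope, f0_hz, bin_hz):
    """AR envelope divided by the F0-adaptively smoothed version of itself."""
    return ar_envelope / f0_adaptive_smooth(ar_envelope, f0_hz, bin_hz)

# Toy all-pole (AR) power envelope: one resonance at 1 kHz, ~60 Hz bandwidth.
fs = 16000.0
n_fft = 1024
bin_hz = fs / n_fft
r = np.exp(-np.pi * 60.0 / fs)        # pole radius for the chosen bandwidth
theta = 2.0 * np.pi * 1000.0 / fs     # pole angle for the resonance frequency
a = np.array([1.0, -2.0 * r * np.cos(theta), r * r])
ar_envelope = 1.0 / np.abs(np.fft.rfft(a, n=n_fft)) ** 2

# With F0 = 200 Hz the resonance is narrower than the harmonic spacing,
# so the ratio exceeds 1 around the resonance peak.
ratio = detail_ratio(ar_envelope, f0_hz=200.0, bin_hz=bin_hz)
```

How this ratio is then combined with the TANDEM-STRAIGHT spectrum is not stated in the abstract; applying it as a multiplicative correction would be one plausible reading, but that is an assumption.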


Bibliographic reference. Kawahara, Hideki / Morise, Masanori / Banno, Hideki / Takahashi, Toru / Nisimura, Ryuichi / Irino, Toshio (2008): "Spectral envelope recovery beyond the Nyquist limit for high-quality manipulation of speech sounds", in INTERSPEECH-2008, 650-653.