Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Max-Gabor Analysis and Synthesis of Spectrograms

Tony Ezzat, Jake Bouvrie, Tomaso Poggio

Massachusetts Institute of Technology, USA

We present a method that decomposes a two-dimensional magnitude spectrogram S(f, t) into its local constituent spectro-temporal amplitudes A(f, t), frequencies F(f, t), orientations Θ(f, t), and phases φ(f, t). The method performs a two-dimensional local Gabor-like analysis of the spectrogram, retaining only the parameters of the 2D Gabor filter with maximal amplitude response within each local region. We demonstrate the technique over a wide variety of speakers, and show how the spectrogram in each case may be adequately reconstructed from the parameters of the Max-Gabor analysis. Finally, we discuss the nature of the extracted Max-Gabor parameters.
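The core idea of keeping only the strongest local Gabor response can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the isotropic Gaussian envelope, the small discrete filter bank, the mean subtraction, and the patch size are all assumptions chosen for illustration. The patch is taken to have frequency along rows and time along columns.

```python
import numpy as np

def gabor_kernel(size, freq, theta, sigma):
    """Complex 2D Gabor: Gaussian envelope times a complex plane wave.

    Hypothetical helper; `freq` is in cycles per sample, `theta` in radians.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * freq * (x * np.cos(theta) + y * np.sin(theta)))
    return envelope * carrier

def max_gabor_patch(patch, freqs, thetas, sigma):
    """Return (amplitude, frequency, orientation, phase) of the filter in the
    assumed bank whose response to the square, odd-sized patch is largest."""
    size = patch.shape[0]
    patch = patch - patch.mean()  # assumption: remove the local DC offset first
    best = None
    for f in freqs:
        for th in thetas:
            kernel = gabor_kernel(size, f, th, sigma)
            response = np.sum(patch * np.conj(kernel))  # complex inner product
            amp = np.abs(response)
            if best is None or amp > best[0]:
                best = (amp, f, th, np.angle(response))
    return best

# Usage: a synthetic patch containing a single oriented ripple.
y, x = np.mgrid[-10:11, -10:11]
thetas = np.linspace(0.0, np.pi, 8, endpoint=False)
patch = np.cos(2.0 * np.pi * 0.1 * (x * np.cos(thetas[2]) + y * np.sin(thetas[2])) + 0.5)
amp, f, th, ph = max_gabor_patch(patch, [0.05, 0.1, 0.2], thetas, sigma=4.0)
print(f, th, ph)
```

Applying such a routine to overlapping patches across the whole spectrogram would give one amplitude, frequency, orientation, and phase value per location; how the regions are tiled and how the reconstruction is performed are not specified in this abstract and are left out of the sketch.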

