ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition (SAPA2008)

Brisbane, Australia
September 21, 2008

Explicit Consistency Constraints for STFT Spectrograms and their Application to Phase Reconstruction

Jonathan Le Roux, Nobutaka Ono, Shigeki Sagayama

Graduate School of Information Science and Technology, The University of Tokyo, Japan

Many acoustic signal processing methods, for example for source separation or noise canceling, operate in the magnitude spectrogram domain. This makes it important to be able to reconstruct a perceptually good-sounding signal from a modified magnitude spectrogram and, more generally, to understand what makes a spectrogram consistent. In this article, we derive the constraints that a set of complex numbers must satisfy to be a consistent STFT spectrogram, i.e., to be the STFT spectrogram of a real signal, and describe how they lead to an objective function measuring the consistency of a set of complex numbers as a spectrogram. We then present a flexible phase reconstruction algorithm based on a local approximation of the consistency constraints, explain its relation to phase-coherence conditions devised as necessary for good perceptual sound quality, and derive a real-time time-scale modification algorithm based on sliding-block analysis. Finally, we show how inconsistency can be used to develop a spectrogram-based audio encryption scheme.
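
As an illustration of what "consistent" means here, the following minimal sketch measures how far an array of complex numbers is from being the STFT of a real signal: it resynthesizes a signal by overlap-add, re-analyzes it, and sums the squared differences. This is one common way to quantify inconsistency and is not necessarily the exact objective function derived in the paper. The window length `N`, the hop `HOP`, the sqrt-Hann window, and the function names `stft`, `istft`, and `inconsistency` are all assumptions made for this example.

```python
import numpy as np

N = 64          # window length (illustrative choice, not from the paper)
HOP = N // 2    # 50% overlap
# Periodic sqrt-Hann window: its squared copies shifted by HOP sum to 1,
# so using it for both analysis and synthesis gives perfect reconstruction.
WIN = np.sqrt(np.hanning(N + 1)[:-1])

def stft(x):
    """STFT of a real signal whose length is a multiple of HOP."""
    xp = np.pad(x, (HOP, HOP))                    # zero-pad both ends
    n_frames = (len(xp) - N) // HOP + 1
    frames = np.stack([xp[i * HOP:i * HOP + N] * WIN for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)            # shape (n_frames, N//2 + 1)

def istft(W):
    """Windowed overlap-add resynthesis of a real signal from W."""
    frames = np.fft.irfft(W, n=N, axis=1) * WIN
    out = np.zeros((W.shape[0] - 1) * HOP + N)
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + N] += frame
    return out[HOP:-HOP]                          # drop the padded ends

def inconsistency(W):
    """Squared distance between W and the STFT of its resynthesis."""
    return float(np.sum(np.abs(stft(istft(W)) - W) ** 2))

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
W_ok = stft(x)                                    # STFT of a real signal
# Same magnitudes, random phases: in general not the STFT of any signal.
W_bad = np.abs(W_ok) * np.exp(1j * rng.uniform(-np.pi, np.pi, W_ok.shape))

print("consistent spectrogram:  ", inconsistency(W_ok))
print("random-phase spectrogram:", inconsistency(W_bad))
```

The first value should be zero up to floating-point error, while the second should be clearly positive, which is the gap a phase reconstruction method tries to close while keeping the magnitudes fixed.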

Bibliographic reference. Le Roux, Jonathan / Ono, Nobutaka / Sagayama, Shigeki (2008): "Explicit consistency constraints for STFT spectrograms and their application to phase reconstruction", In SAPA-2008, 23-28.