ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition (SAPA2006)

Pittsburgh, PA, USA
September 16, 2006

Frequency Component Restoration for Music Sounds Using Local Probabilistic Models with Maximum Entropy Learning

Tomonori Izumitani, Kunio Kashino

NTT Communication Science Laboratories, Atsugi-shi, Kanagawa, Japan

We propose a method that estimates frequency component structures from musical audio signals and restores components that are missing because of noise. Such restoration has become important in various music information processing systems, including music information retrieval. Our method comprises two steps: (1) pattern classification for the initial estimation of component states, and (2) state optimization using a generative model, a Markov random field (MRF). Both steps use a probabilistic model defined for each local region of a spectrogram. Unlike conventional MRF models, the model parameters are learned using a maximum entropy method. Experiments using artificial noisy sounds show that combining the two steps improves restoration accuracy and robustness compared with using either pattern classification or the generative model alone. The method achieves an F-measure greater than 0.6 even in periods where the signal is replaced by noise. In addition, the method is shown to be effective for audio signals of real instruments.
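
The two-step structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the per-bin scores in `prob` are assumed to come from some classifier, the pairwise weights `w_time` and `w_freq` are hand-set rather than learned by maximum entropy, and iterated conditional modes (ICM) is swapped in as a simple stand-in for the state optimization. All function and variable names are hypothetical.

```python
import math

# Illustrative sketch only: the potentials are hand-set assumptions, whereas the
# paper learns its local model parameters with a maximum entropy method.

def initial_states(prob, threshold=0.5):
    """Step 1 stand-in: threshold per-bin 'component present' scores to 0/1."""
    return [[1 if p >= threshold else 0 for p in frame] for frame in prob]

def local_energy(states, prob, t, f, value, w_time, w_freq):
    """Cost of giving bin (t, f) the state `value`, given its four neighbours."""
    p = min(max(prob[t][f], 1e-6), 1.0 - 1e-6)
    energy = -math.log(p if value == 1 else 1.0 - p)  # data term
    neighbours = ((-1, 0, w_time), (1, 0, w_time), (0, -1, w_freq), (0, 1, w_freq))
    for dt, df, weight in neighbours:
        nt, nf = t + dt, f + df
        if 0 <= nt < len(states) and 0 <= nf < len(states[0]):
            if states[nt][nf] != value:
                energy += weight  # penalty for disagreeing with a neighbour
    return energy

def restore(prob, w_time=1.5, w_freq=0.2, max_sweeps=10):
    """Step 2 stand-in: iterated conditional modes over the binary state grid."""
    states = initial_states(prob)
    for _ in range(max_sweeps):
        changed = False
        for t in range(len(states)):
            for f in range(len(states[0])):
                best = min((0, 1), key=lambda v: local_energy(
                    states, prob, t, f, v, w_time, w_freq))
                if best != states[t][f]:
                    states[t][f] = best
                    changed = True
        if not changed:
            break
    return states

def f_measure(pred, truth):
    """F-measure of predicted active bins against reference active bins."""
    tp = fp = fn = 0
    for pred_frame, true_frame in zip(pred, truth):
        for p, t in zip(pred_frame, true_frame):
            tp += 1 if (p == 1 and t == 1) else 0
            fp += 1 if (p == 1 and t == 0) else 0
            fn += 1 if (p == 0 and t == 1) else 0
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy spectrogram: rows are time frames, columns are frequency bins.
# A sustained partial sits in bin 1; noise weakens its score in frame 2.
prob = [
    [0.1, 0.9, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.3, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.9, 0.1],
]
truth = [[0, 1, 0] for _ in prob]

initial = initial_states(prob)
restored = restore(prob)
print(f_measure(initial, truth), f_measure(restored, truth))
```

In this toy input, thresholding alone misses the partial in the noisy frame, while the neighbourhood term restores it because the assumed coupling along time outweighs the weak local evidence. The stronger time weight reflects the assumption that partials persist across frames; with real data, such weights would have to be learned rather than chosen by hand.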

Bibliographic reference. Izumitani, Tomonori / Kashino, Kunio (2006): "Frequency component restoration for music sounds using local probabilistic models with maximum entropy learning", in SAPA-2006, 12-17.