InterSinging 2010
First Interdisciplinary Workshop on Singing Voice

The University of Tokyo, Japan
October 1-2, 2010

Singing Voice Enhancement for Monaural Music Signals Based on Multiple Time-Frequency Analysis

Hideyuki Tachibana, Nobutaka Ono, Shigeki Sagayama

Graduate School of Information Science and Technology, the University of Tokyo, Japan

We propose a novel technique for enhancing the singing voice in monaural music audio signals by capturing the fluctuation of the singing voice on the spectrogram. Using multiple spectrogram representations, the method separates an input signal into three components: stationary, fluctuated, and transient. The singing voice is contained mainly in the fluctuated component. The proposed algorithm applies, in two stages, a sinusoidal/non-sinusoidal separation algorithm that we recently developed, called harmonic/percussive sound separation (HPSS). In the first stage, we filter out the stationary component using HPSS analysis with a long frame; in the second stage, we filter out the transient component using HPSS analysis with a short frame. Experiments show that the proposed method effectively enhances the singing voice in music, and an application to melody extraction further supports its effectiveness.
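
The two-stage structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how HPSS is computed, so the sketch substitutes a simple median-filtering harmonic/percussive split with a soft mask as a stand-in. The function names (`hpss_median`, `two_stage_separation`), the frame lengths (4096 and 512 samples), and the kernel size are all assumptions chosen for illustration, not values from the paper.

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import istft, stft


def _fit_length(y, n):
    """Trim or zero-pad y to exactly n samples."""
    if len(y) >= n:
        return y[:n]
    return np.pad(y, (0, n - len(y)))


def hpss_median(x, fs, frame_len, kernel=9):
    """Split x into (sustained, abrupt) parts at one time-frequency resolution.

    Stand-in for HPSS: median filtering on the magnitude spectrogram,
    NOT the algorithm used in the paper. `kernel` is a hypothetical setting.
    """
    overlap = frame_len // 2
    _, _, X = stft(x, fs=fs, nperseg=frame_len, noverlap=overlap)
    mag = np.abs(X)  # shape: (frequency bins, time frames)
    # Smoothing along time keeps horizontal ridges (sustained tones).
    smooth_t = median_filter(mag, size=(1, kernel))
    # Smoothing along frequency keeps vertical ridges (abrupt events).
    smooth_f = median_filter(mag, size=(kernel, 1))
    # Soft mask; the two masks sum to one, so the two outputs sum to x.
    mask = smooth_t**2 / (smooth_t**2 + smooth_f**2 + 1e-12)
    _, sustained = istft(X * mask, fs=fs, nperseg=frame_len, noverlap=overlap)
    _, abrupt = istft(X * (1.0 - mask), fs=fs, nperseg=frame_len, noverlap=overlap)
    return _fit_length(sustained, len(x)), _fit_length(abrupt, len(x))


def two_stage_separation(x, fs, long_frame=4096, short_frame=512):
    """Return (stationary, fluctuated, transient); frame lengths are assumed."""
    # Stage 1, long frame: remove the stationary component.
    stationary, residual = hpss_median(x, fs, long_frame)
    # Stage 2, short frame: remove the transient component from the residual.
    fluctuated, transient = hpss_median(residual, fs, short_frame)
    return stationary, fluctuated, transient
```

Under these assumptions, calling `two_stage_separation(x, fs)` on a mono signal returns three signals of the same length as `x` that sum back to the input, with the fluctuated output playing the role of the enhanced singing voice. How well the split isolates a voice depends on the frame lengths and on the separation method, which this sketch does not attempt to match to the paper.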

Bibliographic reference. Tachibana, Hideyuki / Ono, Nobutaka / Sagayama, Shigeki (2010): "Singing voice enhancement for monaural music signals based on multiple time-frequency analysis", in InterSinging-2010, 35-38.