INTERSPEECH 2009
10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Monaural Segregation of Voiced Speech Using Discriminative Random Fields

Rohit Prabhavalkar, Zhaozhang Jin, Eric Fosler-Lussier

Ohio State University, USA

Techniques for separating speech from background noise and other sources of interference have important applications for robust speech recognition and speech enhancement. Many traditional computational auditory scene analysis (CASA) based approaches decompose the input mixture into a time-frequency (T-F) representation and attempt to identify the T-F units where the target energy dominates that of the interference. This is accomplished using a two-stage process of segmentation and grouping. In this pilot study, we explore the use of Discriminative Random Fields (DRFs) for the task of monaural speech segregation. We find that the use of DRFs allows us to effectively combine multiple auditory features in the system, while simultaneously integrating the two CASA stages into one. Our preliminary results suggest that CASA-based approaches may benefit from the DRF framework.
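
To make the labeling goal concrete, the following minimal sketch marks each T-F unit as target-dominant (1) or interference-dominant (0) by comparing premixed energies, a labeling often called an ideal binary mask in the CASA literature. This is only an illustration of the target labels, not the DRF system described above, which must estimate such labels from the mixture alone. The function name, the row/column layout, the `threshold_db` parameter, and the toy energy values are all assumptions made for the example.

```python
from typing import List

Matrix = List[List[float]]


def ideal_binary_mask(
    target_energy: Matrix,
    interference_energy: Matrix,
    threshold_db: float = 0.0,
) -> List[List[int]]:
    """Label each T-F unit 1 where the target energy dominates, else 0.

    Both inputs are assumed to be same-shaped grids of non-negative
    energies (one entry per T-F unit). threshold_db is a hypothetical
    local SNR criterion; 0 dB means "target strictly stronger".
    """
    # Convert the dB criterion to a linear energy ratio.
    ratio = 10.0 ** (threshold_db / 10.0)
    return [
        [1 if t > ratio * n else 0 for t, n in zip(t_row, n_row)]
        for t_row, n_row in zip(target_energy, interference_energy)
    ]


# Toy energies, invented for illustration (2 x 3 grid of T-F units).
target = [[4.0, 0.5, 2.0], [0.1, 3.0, 1.0]]
interference = [[1.0, 2.0, 2.0], [0.4, 0.5, 3.0]]

mask = ideal_binary_mask(target, interference)
print(mask)  # [[1, 0, 0], [0, 1, 0]]
```

Computing this mask requires access to the premixed target and interference signals, so it serves only as a reference labeling; a segregation system has to predict it from features of the mixture.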


Bibliographic reference. Prabhavalkar, Rohit / Jin, Zhaozhang / Fosler-Lussier, Eric (2009): "Monaural segregation of voiced speech using discriminative random fields", In INTERSPEECH-2009, 856-859.