15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

Joint Filtering and Factorization for Recovering Latent Structure from Noisy Speech Data

Colin Vaz, Vikram Ramanarayanan, Shrikanth S. Narayanan

University of Southern California, USA

We propose a joint filtering and factorization algorithm to recover latent structure from noisy speech. We incorporate the minimum variance distortionless response (MVDR) formulation within the non-negative matrix factorization (NMF) framework to derive a single, unified cost function for both filtering and factorization. Minimizing this cost function jointly optimizes three quantities: a filter that removes noise, a basis matrix that captures latent structure in the data, and an activation matrix that captures how the elements of the basis matrix can be linearly combined to reconstruct the input data. Results show that the proposed algorithm recovers the speech basis matrix from noisy speech significantly better than NMF alone or Wiener filtering followed by NMF. Furthermore, Perceptual Evaluation of Speech Quality (PESQ) scores show that our algorithm is a viable choice for speech denoising.
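To make the roles of the basis matrix and the activation matrix concrete, the following is a minimal sketch of plain NMF only, not the joint MVDR-NMF algorithm proposed in the paper. It assumes a Frobenius-norm cost and the standard Lee-Seung multiplicative updates; the function name `nmf`, its parameters, and the synthetic data are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (features x frames) as V ~ W @ H.

    W is the basis matrix and H is the activation matrix. This is plain
    NMF with multiplicative updates for the Frobenius-norm cost; it does
    not include the filtering term described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_features, n_frames = V.shape
    # Random non-negative initialization (an assumption of this sketch).
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in for a magnitude spectrogram (hypothetical data).
data_rng = np.random.default_rng(1)
V = data_rng.random((20, 3)) @ data_rng.random((3, 50))

W, H = nmf(V, rank=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Each column of `W` plays the role of one basis element, and each column of `H` gives the non-negative weights that combine those elements to approximate the corresponding input frame. The proposed method additionally optimizes a noise-removing filter within the same cost function, which this sketch leaves out.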

Bibliographic reference.  Vaz, Colin / Ramanarayanan, Vikram / Narayanan, Shrikanth S. (2014): "Joint filtering and factorization for recovering latent structure from noisy speech data", In INTERSPEECH-2014, 2365-2369.