Coupled Initialization of Multi-Channel Non-Negative Matrix Factorization Based on Spatial and Spectral Information

Yuuki Tachioka, Tomohiro Narita, Iori Miura, Takanobu Uramoto, Natsuki Monta, Shingo Uenohara, Ken’ichi Furuya, Shinji Watanabe, Jonathan Le Roux


Multi-channel non-negative matrix factorization (MNMF) is a multi-channel extension of NMF and often outperforms NMF because it can deal with spatial and spectral information simultaneously. On the other hand, MNMF has a larger number of parameters, and its performance depends heavily on the initial values. MNMF factorizes an observation matrix into four matrices: spatial correlation, basis, cluster-indicator latent variables, and activation matrices. This paper proposes effective initialization methods for these matrices. First, the spatial correlation matrix, which shows the strongest dependence on initial values, is initialized using the cross-spectrum method applied to speech enhanced by binary masking. Second, when the target is speech, constructing bases from the phonemes present in an utterance can improve performance: this paper proposes a speech basis selection method that uses automatic speech recognition (ASR). Third, we also propose an initialization method for the cluster-indicator latent variables that couples the spatial and spectral information, which enables the simultaneous optimization of the above two matrices. Experiments on a noisy ASR task show that the proposed initialization significantly improves the performance of MNMF by reducing its dependence on initial values.
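
As an illustration of the first step, the following minimal Python sketch estimates a spatial correlation matrix for a single frequency bin by averaging the outer products x x^H over the frames that a binary mask marks as target-dominant. It is not the authors' implementation: the function name, the trace normalization, and the scaled-identity fallback are assumptions made here for illustration, and the abstract does not specify how the binary mask is obtained.

```python
def masked_spatial_correlation(frames, mask):
    """Estimate an M x M spatial correlation matrix for one frequency bin.

    frames: list of T observation vectors, each a list of M complex
            STFT coefficients (one per microphone channel).
    mask:   list of T binary values; 1 keeps the frame, 0 discards it.
    Returns a Hermitian matrix (list of lists) normalized to unit trace.
    The unit-trace scaling is an assumption of this sketch.
    """
    n_ch = len(frames[0])
    acc = [[0j] * n_ch for _ in range(n_ch)]

    # Accumulate the cross-spectrum x x^H over the frames kept by the mask.
    for x, keep in zip(frames, mask):
        if not keep:
            continue
        for a in range(n_ch):
            for b in range(n_ch):
                acc[a][b] += x[a] * x[b].conjugate()

    trace = sum(acc[a][a].real for a in range(n_ch))
    if trace <= 0.0:
        # No usable frames: fall back to a scaled identity (assumed default).
        return [[(1.0 / n_ch + 0j) if a == b else 0j for b in range(n_ch)]
                for a in range(n_ch)]

    return [[acc[a][b] / trace for b in range(n_ch)] for a in range(n_ch)]


# Toy example: two channels, three frames, the middle frame masked out.
frames = [[1 + 0j, 1j], [2 + 0j, 0j], [0j, 3 + 0j]]
mask = [1, 0, 1]
R = masked_spatial_correlation(frames, mask)
```

In an MNMF setting, one such matrix would be computed per frequency bin and per source and then used as the starting point for the iterative updates, rather than a random or identity initialization.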


DOI: 10.21437/Interspeech.2017-61

Cite as: Tachioka, Y., Narita, T., Miura, I., Uramoto, T., Monta, N., Uenohara, S., Furuya, K., Watanabe, S., Le Roux, J. (2017) Coupled Initialization of Multi-Channel Non-Negative Matrix Factorization Based on Spatial and Spectral Information. Proc. Interspeech 2017, 2461-2465, DOI: 10.21437/Interspeech.2017-61.


@inproceedings{Tachioka2017,
  author={Yuuki Tachioka and Tomohiro Narita and Iori Miura and Takanobu Uramoto and Natsuki Monta and Shingo Uenohara and Ken’ichi Furuya and Shinji Watanabe and Jonathan {Le Roux}},
  title={Coupled Initialization of Multi-Channel Non-Negative Matrix Factorization Based on Spatial and Spectral Information},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2461--2465},
  doi={10.21437/Interspeech.2017-61},
  url={http://dx.doi.org/10.21437/Interspeech.2017-61}
}