ISCA Archive Interspeech 2017

Complex-Valued Restricted Boltzmann Machine for Direct Learning of Frequency Spectra

Toru Nakashika, Shinji Takaki, Junichi Yamagishi

In this paper, we propose a new energy-based probabilistic model in which a restricted Boltzmann machine (RBM) is extended to deal with complex-valued visible units. The RBM, which automatically learns the relationships between visible units and hidden units (with no connections within the visible layer or within the hidden layer), has been widely used as a feature extractor, a generator, and a classifier, and for pre-training deep neural networks. However, all conventional RBMs assume the visible units to be either binary-valued or real-valued, and therefore complex-valued data cannot be fed to the RBM.

In various applications, however, complex-valued data is frequently used; examples include complex spectra of speech, fMRI images, wireless signals, and acoustic intensity. For the direct learning of such complex-valued data, we define a new model called the “complex-valued RBM (CRBM)”, in which the conditional probability of the complex-valued visible units given the hidden units forms a complex Gaussian distribution. Another important characteristic of the CRBM is that, unlike the conventional real-valued RBM, it has connections between the real and imaginary parts of each visible unit. Our experiments demonstrated that the proposed CRBM can directly encode complex spectra of speech signals without decoupling the imaginary part or the phase from the complex-valued data.
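One way to picture a coupling between the real and imaginary parts of a visible unit is a non-circular complex Gaussian, which is described by a mean, a variance gamma = E[|z - mu|^2], and a pseudo-variance delta = E[(z - mu)^2]. The sketch below samples from such a distribution with NumPy. It is only an illustration of that general distribution: the function name `sample_complex_gaussian` and the parameter names `gamma` and `delta` are assumptions made here, and the abstract does not state the exact parameterization the CRBM uses or how the hidden units enter the mean.

```python
import numpy as np


def sample_complex_gaussian(mean, gamma, delta, rng):
    """Draw samples from a (possibly non-circular) complex Gaussian.

    Hypothetical helper for illustration only, not the paper's code.
    mean  : complex array of per-unit means
    gamma : real variance E[|z - mean|^2], scalar or array
    delta : complex pseudo-variance E[(z - mean)^2], scalar or array
    rng   : numpy.random.Generator
    """
    mean = np.asarray(mean, dtype=complex)
    gamma = np.broadcast_to(np.asarray(gamma, dtype=float), mean.shape)
    delta = np.broadcast_to(np.asarray(delta, dtype=complex), mean.shape)
    if np.any(np.abs(delta) > gamma):
        raise ValueError("a valid distribution requires |delta| <= gamma")

    # Covariance of (real part, imaginary part) implied by gamma and delta.
    var_re = (gamma + delta.real) / 2.0
    var_im = (gamma - delta.real) / 2.0
    cov = delta.imag / 2.0  # non-zero when real and imaginary parts are coupled

    # Per-unit 2x2 Cholesky factor: re = a*e1, im = b*e1 + c*e2.
    a = np.sqrt(var_re)
    safe_a = np.where(a > 0, a, 1.0)
    b = np.where(a > 0, cov / safe_a, 0.0)
    c = np.sqrt(np.maximum(var_im - b**2, 0.0))

    e1 = rng.standard_normal(mean.shape)
    e2 = rng.standard_normal(mean.shape)
    return mean + a * e1 + 1j * (b * e1 + c * e2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = np.full(100_000, 1.0 + 2.0j)
    z = sample_complex_gaussian(mu, gamma=2.0, delta=0.6 + 0.8j, rng=rng)
    d = z - mu
    print("empirical variance:       ", np.mean(np.abs(d) ** 2))
    print("empirical pseudo-variance:", np.mean(d ** 2))
```

Setting `delta` to zero gives the circular case, in which the real and imaginary parts are independent and have equal variance. A non-zero `delta` makes the two parts correlated or unequally scaled, which is the kind of dependence a model with real-imaginary connections can express.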

doi: 10.21437/Interspeech.2017-584

Cite as: Nakashika, T., Takaki, S., Yamagishi, J. (2017) Complex-Valued Restricted Boltzmann Machine for Direct Learning of Frequency Spectra. Proc. Interspeech 2017, 4021-4025, doi: 10.21437/Interspeech.2017-584

  author={Toru Nakashika and Shinji Takaki and Junichi Yamagishi},
  title={{Complex-Valued Restricted Boltzmann Machine for Direct Learning of Frequency Spectra}},
  booktitle={Proc. Interspeech 2017},