DNN-Based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification

Zeyan Oo, Yuta Kawakami, Longbiao Wang, Seiichi Nakagawa, Xiong Xiao, Masahiro Iwahashi


The importance of phase information in speech signals is attracting growing attention. Many studies indicate that system-level combination of amplitude and phase features is effective for improving speaker recognition performance in noisy environments. Speech enhancement, on the other hand, is commonly used to reduce the influence of noise. However, this approach enhances only the amplitude spectrum, so the noisy phase spectrum is still used to reconstruct the estimated signal. In recent years, DNN-based feature enhancement has been studied intensively for robust speech processing, and this approach is expected to be effective for phase-based features as well. In this paper, we propose feature-space enhancement of amplitude and phase features using a deep neural network (DNN) for speaker identification. We used mel-frequency cepstral coefficients as the amplitude feature and modified group delay cepstral coefficients as the phase feature. Simultaneous enhancement of the amplitude- and phase-based features was effective, achieving about 24% relative error reduction compared with individual feature enhancement.


DOI: 10.21437/Interspeech.2016-717

Cite as

Oo, Z., Kawakami, Y., Wang, L., Nakagawa, S., Xiao, X., Iwahashi, M. (2016) DNN-Based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification. Proc. Interspeech 2016, 2204-2208.

BibTeX
@inproceedings{Oo+2016,
author={Zeyan Oo and Yuta Kawakami and Longbiao Wang and Seiichi Nakagawa and Xiong Xiao and Masahiro Iwahashi},
title={DNN-Based Amplitude and Phase Feature Enhancement for Noise Robust Speaker Identification},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-717},
url={http://dx.doi.org/10.21437/Interspeech.2016-717},
pages={2204--2208}
}