ISCA Archive Interspeech 2013

Improvement of distant-talking speaker identification using bottleneck features of DNN

Takanori Yamada, Longbiao Wang, Atsuhiko Kai

In this paper, we propose bottleneck features of a deep neural network for distant-talking speaker identification. The accuracy of distant-talking speaker recognition degrades significantly in reverberant environments. Feature mapping or feature transformation has been shown to be effective in channel-mismatched speaker recognition. Bottleneck features derived from a multi-layer network, a nonlinear feature transformation method, have been shown to be effective in automatic speech recognition (ASR) systems. In this study, bottleneck features extracted from deep neural networks (DNNs) that employ an unsupervised pre-training method are used as a nonlinear feature transformation for distant-talking speech. The speaker identification experiment was performed on a large-scale distant-talking speech set, with reverberant environments different from the training environments. The proposed bottleneck features achieved a relative error reduction of 46.3% compared with conventional MFCC. Moreover, a combination of likelihoods of bottleneck
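
As a rough illustration of what extracting bottleneck features can look like, the following minimal sketch runs acoustic frames through a feed-forward network and returns the activations of a narrow hidden layer. It is not the authors' implementation: the layer sizes, the sigmoid activation, the random weights (standing in for weights that would come from pre-training and fine-tuning), and the names `init_layers` and `bottleneck_features` are all assumptions made for this example.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def init_layers(sizes, seed=0):
    """Create (weights, bias) pairs for consecutive layer sizes.

    Random weights are placeholders; a real system would load trained ones.
    """
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def bottleneck_features(frames, layers, bottleneck_index):
    """Forward frames through the network and stop at the bottleneck layer.

    frames: array of shape (num_frames, input_dim)
    bottleneck_index: index into `layers` of the narrow layer
    """
    h = frames
    for i, (weights, bias) in enumerate(layers):
        h = sigmoid(h @ weights + bias)
        if i == bottleneck_index:
            return h  # activations of the narrow layer are the features
    raise ValueError("bottleneck_index is out of range")


# Hypothetical dimensions: 39-dim input frames, 25-unit bottleneck.
sizes = [39, 256, 25, 256, 39]
layers = init_layers(sizes)
frames = np.random.default_rng(1).normal(size=(100, 39))
feats = bottleneck_features(frames, layers, bottleneck_index=1)
print(feats.shape)
```

In a pipeline of this kind, the resulting low-dimensional `feats` would replace or complement the original acoustic features as input to the speaker model.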


doi: 10.21437/Interspeech.2013-686

Cite as: Yamada, T., Wang, L., Kai, A. (2013) Improvement of distant-talking speaker identification using bottleneck features of DNN. Proc. Interspeech 2013, 3661-3664, doi: 10.21437/Interspeech.2013-686

@inproceedings{yamada13_interspeech,
  author={Takanori Yamada and Longbiao Wang and Atsuhiko Kai},
  title={{Improvement of distant-talking speaker identification using bottleneck features of DNN}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={3661--3664},
  doi={10.21437/Interspeech.2013-686}
}