14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Improvement of Distant-Talking Speaker Identification Using Bottleneck Features of DNN

Takanori Yamada (1), Longbiao Wang (2), Atsuhiko Kai (1)

(1) Shizuoka University, Japan
(2) Nagaoka University of Technology, Japan

In this paper, we propose bottleneck features of a deep neural network for distant-talking speaker identification. The accuracy of distant-talking speaker recognition degrades significantly in reverberant environments. Feature mapping and feature transformation have been shown to be effective for speaker recognition under channel mismatch. Bottleneck features derived from a multi-layer network, a nonlinear feature transformation method, have been shown to be effective in automatic speech recognition (ASR) systems. In this study, bottleneck features extracted from deep neural networks (DNNs) that employ an unsupervised pre-training method are used as a nonlinear feature transformation for distant-talking speech. The speaker identification experiment was performed on a large-scale distant-talking speech set, with reverberant environments different from the training environments. The proposed bottleneck features achieved a relative error reduction of 46.3% compared with conventional MFCC. Moreover, a combination of likelihoods of bottleneck
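
The idea of a bottleneck feature can be illustrated with a minimal sketch: a frame of acoustic features is passed forward through a multi-layer network, and the activations of a deliberately narrow hidden layer are read out as the transformed feature vector. Everything specific below is an assumption made for illustration and is not taken from the paper: the layer sizes, the sigmoid activation, the helper names (`init_layer`, `dense`, `bottleneck_features`), and above all the weights, which are random stand-ins for a network that would in practice be pre-trained and fine-tuned.

```python
import math
import random


def sigmoid(x):
    """Logistic activation; an assumed choice for this sketch."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Create random weights and zero biases (stand-ins for trained values)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases):
    """Apply one fully connected sigmoid layer to a single input vector."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def bottleneck_features(frame, layers, bottleneck_index):
    """Run the forward pass and stop at the bottleneck layer's activations."""
    activation = frame
    for index, (weights, biases) in enumerate(layers):
        activation = dense(activation, weights, biases)
        if index == bottleneck_index:
            return activation
    raise ValueError("bottleneck_index is beyond the last layer")


rng = random.Random(0)

# Hypothetical topology: 39-dim input, 64 hidden units, a 13-unit bottleneck,
# 64 hidden units, and 10 output classes. None of these sizes are from the paper.
layer_sizes = [39, 64, 13, 64, 10]
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

# One synthetic input frame standing in for an acoustic feature vector.
frame = [rng.gauss(0.0, 1.0) for _ in range(layer_sizes[0])]

# Layer index 1 maps 64 units down to 13, so it is the bottleneck here.
features = bottleneck_features(frame, layers, bottleneck_index=1)
print(len(features))
```

The layers after the bottleneck are only needed while training the network; at feature-extraction time the forward pass can stop at the narrow layer, as `bottleneck_features` does.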


Bibliographic reference.  Yamada, Takanori / Wang, Longbiao / Kai, Atsuhiko (2013): "Improvement of distant-talking speaker identification using bottleneck features of DNN", In INTERSPEECH-2013, 3661-3664.