16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Multilingual Tandem Bottleneck Feature for Language Identification

Wang Geng, Jie Li, Shanshan Zhang, Xinyuan Cai, Bo Xu

Chinese Academy of Sciences, China

The deep bottleneck (BN) feature based i-vector pipeline has recently become a popular approach to language identification (LID). However, how to extract more effective BN features and how to fully exploit the features extracted from deep neural networks (DNNs) have not yet been well investigated. This paper addresses these issues empirically in three steps. First, two novel types of deep features, phone-discriminant and triphone-discriminant, are extracted. Then, DNNs are trained both separately and jointly on multilingual corpora to produce different BN features. Finally, the deep BN features are combined in tandem fashion to build enhanced deep features. Experimental results on NIST LRE 2007 show that systems built on the tandem deep features achieve, on average, relative equal error rate reductions of 19% over the counterpart built on traditional deep BN features and 42% over the cepstral feature based LID system.
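
The abstract does not spell out how the tandem combination is formed or how the relative reductions are computed. As a minimal sketch, assuming "tandem" means frame-wise concatenation of time-aligned BN feature streams (a common reading, not confirmed by the abstract), the two operations could look as follows. The function names, feature dimensions, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Sequence


def tandem(streams: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Concatenate time-aligned feature streams frame by frame.

    Assumption: each stream is a list of frames, each frame a list of floats,
    and every stream covers the same number of frames.
    """
    if not streams:
        raise ValueError("at least one feature stream is required")
    n_frames = len(streams[0])
    if any(len(s) != n_frames for s in streams):
        raise ValueError("all streams must have the same number of frames")
    # For each frame index t, join the vectors of all streams end to end.
    return [[v for s in streams for v in s[t]] for t in range(n_frames)]


def relative_reduction(baseline_eer: float, new_eer: float) -> float:
    """Relative equal error rate reduction: (baseline - new) / baseline."""
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive")
    return (baseline_eer - new_eer) / baseline_eer


# Hypothetical toy features: 2 frames, a 2-dim and a 3-dim BN stream.
phone_bn = [[0.1, 0.2], [0.3, 0.4]]
triphone_bn = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
combined = tandem([phone_bn, triphone_bn])

# Hypothetical EER values (in percent), not results from the paper.
reduction = relative_reduction(5.0, 4.0)
```

With these toy inputs, each combined frame has 2 + 3 = 5 dimensions, and an EER drop from 5.0% to 4.0% corresponds to a 20% relative reduction. The 19% and 42% figures in the abstract are averages of relative reductions of this kind.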

Bibliographic reference.  Geng, Wang / Li, Jie / Zhang, Shanshan / Cai, Xinyuan / Xu, Bo (2015): "Multilingual tandem bottleneck feature for language identification", In INTERSPEECH-2015, 413-417.