16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Attribute Knowledge Integration for Speech Recognition Based on Multi-Task Learning Neural Networks

Hao Zheng (1), Zhanlei Yang (1), Liwei Qiao (2), Jianping Li (2), Wenju Liu (1)

(1) Chinese Academy of Sciences, China
(2) SGCC, China

Adding extra articulatory information has been shown to improve speech recognition performance, but using that information effectively remains a challenging problem. In this paper, we propose an attribute-based knowledge integration architecture that models and learns acoustic and articulatory cues simultaneously in a unified framework. The framework improves performance by supplying attribute-based knowledge in both the feature and model domains. In the model domain, attribute classification serves as the secondary task of a multi-task learning deep neural network (MTL-DNN) used for speech recognition, improving its ability to discriminate between pronunciations. In the feature domain, an attribute-based feature is extracted from an MTL-DNN trained with attribute classification as its primary task and phonetic/tri-phone state classification as its secondary task. Experiments on the TIMIT and WSJ corpora show that the proposed framework achieves significant performance improvements over the baseline DNN-HMM systems.
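
The multi-task arrangement described in the abstract can be illustrated with a minimal sketch: one shared hidden layer feeds two softmax output layers, one for tri-phone states and one for attribute classes, and the training loss is the primary-task cross-entropy plus a weighted secondary-task cross-entropy. This is not the authors' implementation. The class name `MultiTaskNet`, the layer sizes, the single tanh hidden layer, and the secondary-task weight of 0.3 are all assumptions chosen for illustration; the abstract does not specify them.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def linear(W, b, x):
    """Affine map W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def init_matrix(rows, cols, rng):
    """Small random weights; the range is an arbitrary illustrative choice."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class MultiTaskNet:
    """Hypothetical network: shared hidden layer plus two task-specific heads."""

    def __init__(self, n_in, n_hidden, n_states, n_attrs, seed=0):
        rng = random.Random(seed)
        self.W_h = init_matrix(n_hidden, n_in, rng)      # shared layer
        self.b_h = [0.0] * n_hidden
        self.W_s = init_matrix(n_states, n_hidden, rng)  # tri-phone state head
        self.b_s = [0.0] * n_states
        self.W_a = init_matrix(n_attrs, n_hidden, rng)   # attribute head
        self.b_a = [0.0] * n_attrs

    def forward(self, x):
        # Both heads read the same hidden activations, so gradients from
        # either task would update the shared layer during training.
        h = [math.tanh(v) for v in linear(self.W_h, self.b_h, x)]
        p_state = softmax(linear(self.W_s, self.b_s, h))
        p_attr = softmax(linear(self.W_a, self.b_a, h))
        return h, p_state, p_attr


def multitask_loss(p_primary, y_primary, p_secondary, y_secondary, weight=0.3):
    """Primary cross-entropy plus weighted secondary cross-entropy.

    The weight 0.3 is an assumed value, not one taken from the paper.
    """
    return -math.log(p_primary[y_primary]) - weight * math.log(p_secondary[y_secondary])


# Model-domain setting: state classification is primary, attributes secondary.
net = MultiTaskNet(n_in=4, n_hidden=8, n_states=5, n_attrs=3)
hidden, p_state, p_attr = net.forward([0.1, -0.2, 0.3, 0.0])
loss = multitask_loss(p_state, 2, p_attr, 1)
```

For the feature-domain setting, the same structure applies with the task roles swapped, so that attribute classification is passed as the primary task to `multitask_loss`. The abstract does not say which layer the attribute-based feature is taken from; the sketch returns both the hidden activations and the posteriors so that either could be used.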

Bibliographic reference. Zheng, Hao / Yang, Zhanlei / Qiao, Liwei / Li, Jianping / Liu, Wenju (2015): "Attribute knowledge integration for speech recognition based on multi-task learning neural networks", in INTERSPEECH-2015, 543-547.