Recently, Deep Belief Networks (DBNs) have been proposed for phone recognition and were found to achieve highly competitive performance. The original DBNs used only frame-level information to train the DBN weights, although it has long been known that sequential, or full-sequence, information can improve speech recognition accuracy. In this paper we investigate approaches to optimizing the DBN weights, state-to-state transition parameters, and language model scores using a sequential discriminative training criterion. We describe and analyze the proposed training algorithm and strategy, and discuss practical issues and how they affect the final results. Evaluated on TIMIT, we show that DBNs trained with the sequence-based criterion outperform those trained with the frame-based criterion for three-layer DBNs, and we explain why the gain vanishes for six-layer DBNs.
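The contrast between the two criteria can be made concrete with a toy sketch. This is not the paper's implementation: the function names, the two-state model, and the MMI-style formulation of the sequence criterion are illustrative assumptions. The frame-based criterion scores each frame's state posterior independently, while the sequence-based criterion scores the whole reference path against a forward-algorithm sum over all competing state sequences, which is where the transition parameters enter.

```python
import numpy as np

def logsumexp(x, axis=None):
    """Numerically stable log-sum-exp (toy helper)."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def frame_ce_loss(log_probs, labels):
    """Frame-level criterion: per-frame cross-entropy, frames independent.

    log_probs: (T, S) log state posteriors from the network.
    labels:    length-T reference state sequence.
    """
    return -sum(log_probs[t, labels[t]] for t in range(len(labels)))

def sequence_mmi_loss(log_probs, labels, log_trans):
    """Sequence-level (MMI-style, illustrative) criterion:
    log-score of the reference path minus the log-sum over ALL
    state sequences, computed with a forward recursion in log space.

    log_trans: (S, S) log transition scores, log_trans[i, j] = i -> j.
    """
    T, S = log_probs.shape
    # Numerator: reference path, including transition scores.
    num = log_probs[0, labels[0]]
    for t in range(1, T):
        num += log_trans[labels[t - 1], labels[t]] + log_probs[t, labels[t]]
    # Denominator: forward algorithm over all paths.
    alpha = log_probs[0].copy()
    for t in range(1, T):
        alpha = log_probs[t] + logsumexp(alpha[:, None] + log_trans, axis=0)
    den = logsumexp(alpha)
    return -(num - den)
```

A sanity check on this toy: with normalized per-frame posteriors and uniform transitions, the denominator factorizes over frames and the two losses coincide; only with non-uniform transition scores (or a language model rescoring the denominator lattice, as in the paper) does the sequence criterion differ from the frame criterion.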
Bibliographic reference. Mohamed, Abdel-rahman / Yu, Dong / Deng, L. (2010): "Investigation of full-sequence training of deep belief networks for speech recognition", in INTERSPEECH-2010, pp. 2846-2849.