We propose an algorithm that allows online training of a context-dependent DNN model. The algorithm designs a state inventory based on DNN features and jointly optimizes the DNN parameters and the alignment of the training data. The process allows flat-starting a model from scratch and avoids any dependency on a GMM acoustic model to bootstrap training. A 15k-state model trained with the proposed algorithm reduced the error rate on a mobile speech task by 24% compared to a system bootstrapped from a CI GMM and by 16% compared to a system bootstrapped from a CD GMM system.
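As an illustration of what flat-starting usually means, the sketch below builds an initial alignment by dividing an utterance's frames evenly among its states, with no acoustic model involved. The abstract does not specify the paper's exact initialization, so uniform segmentation is an assumption here, and the function name `flat_start_alignment` and its arguments are hypothetical.

```python
from typing import List, Sequence


def flat_start_alignment(num_frames: int, states: Sequence[str]) -> List[str]:
    """Assign each frame to a state by uniform segmentation.

    Assumption: a generic flat-start initialization, not necessarily
    the procedure used in the paper.
    """
    num_states = len(states)
    if num_states == 0:
        raise ValueError("states must not be empty")
    if num_frames < num_states:
        raise ValueError("need at least one frame per state")
    # Frame t maps to state index floor(t * num_states / num_frames),
    # which keeps the state order and gives every state at least one frame.
    return [states[t * num_states // num_frames] for t in range(num_frames)]


if __name__ == "__main__":
    # Seven frames spread over three hypothetical states.
    print(flat_start_alignment(7, ["s1", "s2", "s3"]))
```

In a joint-optimization setting, an alignment like this would only serve as a starting point: the model trained on it would be used to realign the data, and the two steps would alternate or interleave.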
Bibliographic reference. Bacchiani, Michiel / Senior, Andrew / Heigold, Georg (2014): "Asynchronous, online, GMM-free training of a context dependent acoustic model for speech recognition", In INTERSPEECH-2014, 1900-1904.