INTERSPEECH 2011

We describe a set of novel, batch-mode algorithms that we recently developed as a key component of scalable, deep-neural-network-based speech recognition. The essence of these algorithms is to structure the single-hidden-layer neural network so that the upper-layer weights can be written as a deterministic function of the lower-layer weights. Training exploits this structure by substituting the deterministic function into the least-squares error objective before calculating the gradients. Acceleration techniques are further used to move the weight updates along the most promising directions. Experiments on TIMIT frame-level phone and phone-state classification show strong results. In particular, the error rate drops strictly monotonically as the mini-batch size increases. This demonstrates the potential of the proposed batch-mode algorithms for large-scale speech recognition, since they are easily parallelizable across computers.
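The core idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes sigmoid hidden units, a small ridge term for numerical stability, and illustrative names (`loss_and_grad`, `W`, `U`, `X`, `T`) that do not come from the paper. The upper-layer weights `U` are obtained in closed form from the hidden outputs, and because `U` minimizes the (ridge-regularized) least-squares objective, the gradient with respect to the lower-layer weights `W` can be evaluated with `U` held at that optimum. The acceleration techniques mentioned in the abstract are not reproduced here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(W, X, T, ridge=1e-3):
    """Least-squares loss with the upper-layer weights solved in closed form.

    W: (n_in, n_hidden) lower-layer weights
    X: (n_in, n_samples) input batch
    T: (n_classes, n_samples) one-hot targets
    ridge: assumed regularizer, not taken from the paper
    """
    H = sigmoid(W.T @ X)                          # hidden outputs, (n_hidden, n_samples)
    A = H @ H.T + ridge * np.eye(H.shape[0])
    U = np.linalg.solve(A, H @ T.T)               # U as a function of W, (n_hidden, n_classes)
    E = U.T @ H - T                               # residual, (n_classes, n_samples)
    loss = 0.5 * np.sum(E ** 2) + 0.5 * ridge * np.sum(U ** 2)
    # At the optimal U the derivative of the loss w.r.t. U vanishes,
    # so only the path through H contributes to the gradient w.r.t. W.
    dZ = (U @ E) * H * (1.0 - H)                  # back through the sigmoid
    grad_W = X @ dZ.T                             # (n_in, n_hidden)
    return loss, grad_W, U

# Toy batch: 4 input features, 5 hidden units, 3 classes, 60 samples.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 60))
labels = rng.integers(0, 3, size=60)
T = np.eye(3)[:, labels]                          # one-hot targets, (3, 60)
W = 0.1 * rng.standard_normal((4, 5))

loss, grad, U = loss_and_grad(W, X, T)
```

A plain batch update would then be `W -= learning_rate * grad`. Since the loss and gradient are sums over the samples in the batch, the matrix products `H @ H.T`, `H @ T.T`, and `X @ dZ.T` can in principle be accumulated over shards of the data on separate machines, which is one way to read the parallelizability claim.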
Bibliographic reference. Yu, Dong / Deng, Li (2011): "Accelerated parallelizable neural network learning algorithm for speech recognition", In INTERSPEECH-2011, 2281-2284.