Annealed f-Smoothing as a Mechanism to Speed up Neural Network Training

Tara N. Sainath, Vijayaditya Peddinti, Olivier Siohan, Arun Narayanan


In this paper, we describe a method to reduce the overall number of neural network training steps across both the cross-entropy (CE) and sequence training stages. This is achieved by interpolating the frame-level CE and sequence-level state-level minimum Bayes risk (sMBR) criteria during the sequence training stage. This interpolation is known as f-smoothing and has previously been used only to prevent overfitting during sequence training. In this paper, however, we investigate its application to reducing training time. We explore different interpolation strategies to reduce the overall number of training steps, achieving a reduction of up to 25% with almost no degradation in word error rate (WER). Finally, we explore the generalization of f-smoothing to other tasks.
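The interpolation the abstract describes can be sketched as a weighted sum of the two criteria, with the CE weight annealed over the course of sequence training. The function names, the linear schedule, and the specific weight values below are illustrative assumptions, not details taken from the paper:

```python
def annealed_f(step, total_steps, f_start=0.1, f_end=0.0):
    """Illustrative linear annealing schedule for the CE interpolation
    weight f: starts at f_start and decays to f_end over training."""
    frac = min(step / total_steps, 1.0)
    return f_start + frac * (f_end - f_start)

def f_smoothed_loss(ce_loss, smbr_loss, f):
    """f-smoothing: interpolate the frame-level CE criterion with the
    sequence-level sMBR criterion using weight f on the CE term."""
    return (1.0 - f) * smbr_loss + f * ce_loss
```

With annealing, early sequence-training steps retain a stabilizing CE component, while later steps converge to the pure sequence-level objective.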


 DOI: 10.21437/Interspeech.2017-231

Cite as: Sainath, T.N., Peddinti, V., Siohan, O., Narayanan, A. (2017) Annealed f-Smoothing as a Mechanism to Speed up Neural Network Training. Proc. Interspeech 2017, 3542-3546, DOI: 10.21437/Interspeech.2017-231.


@inproceedings{Sainath2017,
  author={Tara N. Sainath and Vijayaditya Peddinti and Olivier Siohan and Arun Narayanan},
  title={Annealed f-Smoothing as a Mechanism to Speed up Neural Network Training},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={3542--3546},
  doi={10.21437/Interspeech.2017-231},
  url={http://dx.doi.org/10.21437/Interspeech.2017-231}
}