Training Recurrent Neural Network through Moment Matching for NLP Applications

Yue Deng, Yilin Shen, KaWai Chen, Hongxia Jin


A recurrent neural network (RNN) is conventionally trained in the supervised mode but used in the free-running mode for inference on test samples. The supervised mode takes ground-truth token values as RNN inputs, whereas the free-running mode can only feed back self-predicted token values as surrogate inputs. This inconsistency inevitably results in poor generalization of the RNN on out-of-sample data. We propose a moment matching (MM) training strategy that alleviates the inconsistency by simultaneously taking these two distinct modes and their corresponding dynamics into consideration. Our MM-RNN shows significant performance improvements over existing approaches when tested on practical NLP applications including logic form generation and image captioning.
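The train/inference mismatch the abstract describes can be made concrete with a small sketch. The code below (a minimal illustration, not the paper's implementation; the toy RNN, its weights, and the mean/variance penalty are all assumptions for exposition) unrolls the same RNN in the supervised mode (feeding ground-truth tokens) and the free-running mode (feeding its own argmax predictions), then compares the first and second moments of the hidden-state trajectories, in the spirit of a moment-matching objective:

```python
import numpy as np

rng = np.random.default_rng(0)
V, H = 5, 8  # toy vocabulary size and hidden size (assumed for illustration)
Wx = rng.normal(0, 0.1, (V, H))   # input (embedding) weights
Wh = rng.normal(0, 0.1, (H, H))   # recurrent weights
Wo = rng.normal(0, 0.1, (H, V))   # output projection

def step(h, tok):
    """One RNN step: consume token id `tok`, return new hidden state and logits."""
    h_next = np.tanh(Wx[tok] + h @ Wh)
    return h_next, h_next @ Wo

def unroll(tokens, free_running):
    """Collect hidden states in supervised or free-running mode."""
    h, tok, states = np.zeros(H), tokens[0], []
    for t in range(1, len(tokens)):
        h, logits = step(h, tok)
        states.append(h)
        # supervised mode feeds the ground-truth next token;
        # free-running mode feeds back the model's own prediction
        tok = int(np.argmax(logits)) if free_running else tokens[t]
    return np.array(states)

def moment_matching_penalty(hs_sup, hs_free):
    """Squared gap between per-dimension means and variances of the two trajectories."""
    m1 = np.sum((hs_sup.mean(axis=0) - hs_free.mean(axis=0)) ** 2)
    m2 = np.sum((hs_sup.var(axis=0) - hs_free.var(axis=0)) ** 2)
    return m1 + m2

seq = [1, 3, 2, 4, 0, 1]  # a made-up ground-truth token sequence
penalty = moment_matching_penalty(unroll(seq, free_running=False),
                                  unroll(seq, free_running=True))
```

Minimizing such a penalty alongside the usual supervised loss pushes the free-running dynamics toward the teacher-forced dynamics, which is the intuition behind training with both modes simultaneously.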


DOI: 10.21437/Interspeech.2018-1369

Cite as: Deng, Y., Shen, Y., Chen, K., Jin, H. (2018) Training Recurrent Neural Network through Moment Matching for NLP Applications. Proc. Interspeech 2018, 3353-3357, DOI: 10.21437/Interspeech.2018-1369.


@inproceedings{Deng2018,
  author={Yue Deng and Yilin Shen and KaWai Chen and Hongxia Jin},
  title={Training Recurrent Neural Network through Moment Matching for NLP Applications},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3353--3357},
  doi={10.21437/Interspeech.2018-1369},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1369}
}