Layer Trajectory BLSTM

Eric Sun, Jinyu Li, Yifan Gong

Recently, we proposed the layer trajectory (LT) LSTM (ltLSTM), which significantly outperforms LSTM by decoupling the functions of senone classification and temporal modeling with separate depth and time LSTMs. We further improved ltLSTM with the contextual layer trajectory LSTM (cltLSTM), which uses future context frames to predict target labels. Given that the bidirectional LSTM (BLSTM) also uses future context frames to improve its modeling power, in this study we first compare the performance of cltLSTM and BLSTM. We then apply the layer trajectory idea to further improve BLSTM models: the BLSTM is in charge of modeling the temporal information, while a depth-LSTM takes care of senone classification. In addition, we investigate how different LT component designs affect the performance of BLSTM models. Trained with 30 thousand hours of EN-US Microsoft internal data, the proposed layer trajectory BLSTM (ltBLSTM) model achieves up to 14.5% relative word error rate (WER) reduction over the baseline BLSTM across different tasks.
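
For readers unfamiliar with the metric, relative WER reduction is conventionally the drop in WER divided by the baseline WER. The sketch below illustrates that arithmetic only; the function name and the WER values are hypothetical and are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are absolute WERs expressed in percent.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs chosen for illustration, not taken from the paper.
example = relative_wer_reduction(baseline_wer=10.0, new_wer=8.55)
print(f"{example:.1f}% relative WER reduction")
```

Under this definition, a relative reduction says nothing about the absolute WER: the same relative figure can come from very different baselines.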

DOI: 10.21437/Interspeech.2019-2971

Cite as: Sun, E., Li, J., Gong, Y. (2019) Layer Trajectory BLSTM. Proc. Interspeech 2019, 1403-1407, DOI: 10.21437/Interspeech.2019-2971.

  author={Eric Sun and Jinyu Li and Yifan Gong},
  title={{Layer Trajectory BLSTM}},
  booktitle={Proc. Interspeech 2019},