Character-Aware Sub-Word Level Language Modeling for Uyghur and Turkish ASR

Chang Liu, Zhen Zhang, Pengyuan Zhang, Yonghong Yan

Uyghur and Turkish are two typical agglutinative languages, both of which suffer heavily from data sparsity. To address this, we first apply statistical morphological segmentation and vary the number of morphs to obtain a better sub-word level automatic speech recognition (ASR) system. The best systems, which yield absolute WER reductions of 2.03% and 1.65% over the word-level systems for Uyghur and Turkish respectively, are used for further n-best rescoring. To further alleviate the data sparsity problem, we apply both convolutional neural network (CNN) based and bi-directional long short-term memory (BLSTM) based character-aware language models to the two languages. To reduce the loss of information from the intermediate time steps of the BLSTM-based character-aware language model, we propose using a weighted average of the outputs at every time step. The proposed weighting methods fall into three categories: decay-based, position-based and attention-based. Results show that the decay-based weighting method leads to the most significant WER reductions, 2.38% and 1.96% for Uyghur and Turkish respectively, compared with the sub-word level 1-pass ASR system.
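As a rough illustration of the weighted-average idea, the sketch below combines the per-time-step outputs of one recurrent direction into a single vector using decay-based weights. The abstract does not give the exact weighting formula, so the geometric form used here (weight `decay ** (T - 1 - t)`, normalized to sum to 1), the function names `decay_weights` and `weighted_average`, and the toy vectors are all assumptions made for illustration, not the authors' implementation.

```python
def decay_weights(num_steps, decay=0.5):
    """Assumed geometric decay: the last time step gets the largest weight,
    and earlier steps are scaled down by `decay` per step. Normalized to sum to 1."""
    raw = [decay ** (num_steps - 1 - t) for t in range(num_steps)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_average(outputs, weights):
    """Combine per-time-step output vectors into one vector.

    `outputs` is a list of equal-length vectors (one per character position);
    `weights` holds one scalar per time step.
    """
    dim = len(outputs[0])
    return [sum(w * h[d] for w, h in zip(weights, outputs)) for d in range(dim)]


# Toy stand-in for the hidden outputs of a 3-character sub-word (hypothetical values).
outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = decay_weights(len(outputs), decay=0.5)
embedding = weighted_average(outputs, weights)
```

Using only the final time step corresponds to the limit where `decay` approaches 0, while `decay=1.0` gives a plain mean, so under this assumed form the decay factor interpolates between the two.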

 DOI: 10.21437/Interspeech.2019-1484

Cite as: Liu, C., Zhang, Z., Zhang, P., Yan, Y. (2019) Character-Aware Sub-Word Level Language Modeling for Uyghur and Turkish ASR. Proc. Interspeech 2019, 3495-3499, DOI: 10.21437/Interspeech.2019-1484.

  author={Chang Liu and Zhen Zhang and Pengyuan Zhang and Yonghong Yan},
  title={{Character-Aware Sub-Word Level Language Modeling for Uyghur and Turkish ASR}},
  booktitle={Proc. Interspeech 2019},