ISCA Archive Eurospeech 2001

Improved context-dependent acoustic modeling for continuous Chinese speech recognition

Jiyong Zhang, Fang Zheng, Jing Li, Chunhua Luo, Guoliang Zhang

This paper describes a new framework for context-dependent (CD) Initial/Final (IF) acoustic modeling that uses decision-tree-based state tying for continuous Chinese speech recognition. The Extended Initial/Final (XIF) set, chosen as the basic speech recognition unit (SRU) set to reflect the characteristics of the Chinese language, outperforms the standard IF set. After the decision tree has been constructed, an adaptive mixture-increasing strategy is applied when splitting the single Gaussian in each tied state into a mixture of Gaussians. Our experimental results show that both improvements benefit acoustic modeling for Chinese speech recognition and that the CD XIF model outperforms the baseline syllable model by more than 30%.


doi: 10.21437/Eurospeech.2001-196

Cite as: Zhang, J., Zheng, F., Li, J., Luo, C., Zhang, G. (2001) Improved context-dependent acoustic modeling for continuous Chinese speech recognition. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 1617-1620, doi: 10.21437/Eurospeech.2001-196

@inproceedings{zhang01_eurospeech,
  author={Jiyong Zhang and Fang Zheng and Jing Li and Chunhua Luo and Guoliang Zhang},
  title={{Improved context-dependent acoustic modeling for continuous Chinese speech recognition}},
  year=2001,
  booktitle={Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages={1617--1620},
  doi={10.21437/Eurospeech.2001-196}
}