ISCA Archive ICSLP 2000

Real-time multilingual HMM training robust to channel variations

E. E. Jan, Jaime Botella Ordinas, George Saon, Salim Roukos

This paper describes our efforts towards a real-time, multilingual Large Vocabulary Continuous Speech Recognition (LVCSR) system for telephony. A trilingual (English, French and Spanish) hybrid landline/cellular system is compared to each of our best monolingual systems. The results are comparable: the degradation is less than approximately 10%. An HMM state quality measurement technique is explored to improve the performance of multilingual acoustic models. A pilot experiment on an English/Spanish bilingual system demonstrates very good results, with improvements of between 5% and 20% across different test conditions. To extend the system to speakerphone applications, we employed different front-end processing techniques, mainly CDCN prior to HDA and MLLT, which reduce the error rate of the trilingual system by as much as 30%. These results suggest that trilingual acoustic models can be used for real telephony applications.


Cite as: Jan, E.E., Botella Ordinas, J., Saon, G., Roukos, S. (2000) Real-time multilingual HMM training robust to channel variations. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 925-928

@inproceedings{jan00_icslp,
  author={E. E. Jan and Jaime {Botella Ordinas} and George Saon and Salim Roukos},
  title={{Real-time multilingual HMM training robust to channel variations}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={3},
  pages={925--928}
}