ISCA Archive ICSLP 2000

Cross-domain robust acoustic training

Ea-Ee Jan, Jaime Botella Ordinas

This paper describes our efforts towards cross-domain acoustic training for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We used weighted multi-style training, pooling the insufficient telephony landline and cellular data with downsampled wideband clean data, to develop better hybrid acoustic models. We explored the effect of decision tree size on accuracy, which changed by approximately 10%. The results show that, with the number of parameters fixed, a system with fewer context-dependent HMM states yields better accuracy. This motivates a smaller phone set design. We then investigated the performance degradation on two reduced phone sets for Spanish. Based on these studies, we were able to develop a hybrid system for 8 kHz close-talking microphone, telephony landline, and cellular phone environments. The acoustic model is evaluated on both flat-grammar tasks (digits and name-at-department) and language-model tasks (ATIS and general dictation), using the IBM ViaVoice product engine.
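As a rough illustration of the weighted pooling idea in the abstract, the sketch below downsamples a toy wideband signal by a factor of two and then estimates a single Gaussian from several domains, scaling each domain's statistics by a weight so that scarce telephony data is not swamped by the larger clean set. The names `Domain`, `decimate_by_two`, and `weighted_mean_variance`, the factor of two, the one-dimensional features, and the weight values are all assumptions made for illustration; the abstract does not specify the weighting scheme, the source sampling rate, or the filter used.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Domain:
    """Hypothetical container for one training condition."""
    name: str
    frames: Sequence[float]  # toy 1-D features; real systems use vectors
    weight: float            # assumed per-domain weight


def decimate_by_two(samples: Sequence[float]) -> List[float]:
    """Halve the sampling rate by averaging adjacent pairs.

    Pair averaging is only a crude low-pass filter; a real front end
    would use a proper anti-aliasing filter before decimation.
    """
    return [(samples[i] + samples[i + 1]) / 2.0
            for i in range(0, len(samples) - 1, 2)]


def weighted_mean_variance(domains: Sequence[Domain]) -> Tuple[float, float]:
    """Pool all domains into one Gaussian, scaling each domain's
    sufficient statistics by its weight."""
    total = sum(d.weight * len(d.frames) for d in domains)
    if total <= 0:
        raise ValueError("total weighted frame count must be positive")
    mean = sum(d.weight * sum(d.frames) for d in domains) / total
    var = sum(d.weight * sum((x - mean) ** 2 for x in d.frames)
              for d in domains) / total
    return mean, var


if __name__ == "__main__":
    wideband = Domain("wideband_clean", [0.0, 0.0, 0.0, 0.0], weight=1.0)
    cellular = Domain("cellular", [4.0], weight=4.0)
    print(decimate_by_two([1.0, 3.0, 5.0, 7.0]))
    print(weighted_mean_variance([wideband, cellular]))
```

With a weight of 4 on the single cellular frame, that frame counts as much as the four wideband frames combined, which is the balancing effect the weighting is meant to achieve in this sketch.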

Cite as: Jan, E.-E., Botella Ordinas, J. (2000) Cross-domain robust acoustic training. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, 644-647

  author={Ea-Ee Jan and Jaime {Botella Ordinas}},
  title={{Cross-domain robust acoustic training}},
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={4},
  pages={644--647}