Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Cross-Domain Robust Acoustic Training

Ea-Ee Jan (1), Jaime Botella Ordinas (2)

(1) Human Language Technologies, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
(2) European Speech Research, IBM Voice Systems, Seville, Spain

This paper describes our efforts towards cross-domain acoustic training for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We used weighted multi-style training, pooling the limited telephony landline and cellular data available with downsampled wideband clean data, to develop better hybrid acoustic models. We explored the effect of decision tree size on accuracy, which changed by approximately 10%. The results show that, for a fixed number of parameters, a system with fewer context-dependent HMM states yields better accuracy. This finding leads to a smaller phone set design. We then investigated the performance degradation with two reduced phone sets for Spanish. Based on these studies, we were able to develop a hybrid system for 8 kHz close-talking microphone, telephony landline, and cellular phone environments. The acoustic model is evaluated on both flat-grammar tasks (digits and name at department) and language-model tasks (ATIS and general dictation), using the IBM ViaVoice product engine.
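
As an illustration of the weighted pooling idea mentioned in the abstract, the following is a minimal sketch, not the authors' procedure. It assumes that each domain's data is scaled by a single multiplier so that its weighted share of the pool matches a chosen target. The function name `domain_weights`, the domain labels, the hour counts, and the target shares are all hypothetical; the abstract does not state how the weights were chosen or applied.

```python
def domain_weights(hours_by_domain, target_share):
    """Return one multiplier per domain (hypothetical scheme).

    After scaling, each domain's weighted hours equal its target share
    of the total pooled hours, so the total amount of weighted data is
    unchanged.
    """
    total_hours = sum(hours_by_domain.values())
    total_share = sum(target_share.values())
    weights = {}
    for domain, hours in hours_by_domain.items():
        if hours <= 0:
            raise ValueError(f"no data for domain {domain!r}")
        desired_hours = total_hours * target_share[domain] / total_share
        weights[domain] = desired_hours / hours
    return weights


# Hypothetical corpus sizes in hours; these are NOT figures from the paper.
hours = {"wideband_downsampled": 80.0, "landline": 15.0, "cellular": 5.0}

# Assumed target: every domain contributes equally to training.
target = {"wideband_downsampled": 1.0, "landline": 1.0, "cellular": 1.0}

weights = domain_weights(hours, target)
for domain, w in weights.items():
    print(f"{domain}: weight {w:.3f}, weighted hours {w * hours[domain]:.2f}")
```

In a sketch like this, the multiplier could be applied by scaling the statistics accumulated from each utterance during training, so the scarce landline and cellular data are not swamped by the larger clean corpus. Whether the paper applies its weights this way is an assumption here.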



Bibliographic reference.  Jan, Ea-Ee / Botella Ordinas, Jaime (2000): "Cross-domain robust acoustic training", In ICSLP-2000, vol.4, 644-647.