11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Acoustic Modeling with Bootstrap and Restructuring for Low-Resourced Languages

Xiaodong Cui, Jian Xue, Pierre L. Dognin, Upendra V. Chaudhari, Bowen Zhou

IBM T.J. Watson Research Center, USA

This paper investigates an acoustic modeling approach for low-resourced languages based on bootstrap and model restructuring. The approach first creates an acoustic model with redundancy by averaging over bootstrapped models trained on resampled subsets of the sparse training data. Model restructuring then scales this model down to a desired cardinality. A variety of techniques for Gaussian clustering and model refinement are discussed for the model restructuring. LVCSR experiments are carried out on the Pashto language with up to 105 hours of training data. The proposed approach is shown to yield more robust acoustic models given sparse training data and to outperform the traditional training procedure.
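
The two stages described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: it uses 1-D Gaussian mixtures fitted by plain EM instead of an HMM-based LVCSR acoustic model, it resamples with replacement, and it restructures by greedily merging the pair of Gaussians whose moment-matched merge loses the least likelihood. The function names (`fit_gmm_1d`, `bootstrap_aggregate`, `merge_pair`, `restructure`) are hypothetical.

```python
import numpy as np


def fit_gmm_1d(x, k, rng, iters=25):
    """Fit a k-component 1-D Gaussian mixture with plain EM (toy stand-in)."""
    means = rng.choice(x, size=k, replace=False)
    variances = np.full(k, x.var() + 1e-6)
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        log_p = (np.log(weights)
                 - 0.5 * np.log(2.0 * np.pi * variances)
                 - 0.5 * (x[:, None] - means) ** 2 / variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters, with small floors for stability.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return weights, means, variances


def bootstrap_aggregate(x, n_boot, k, rng):
    """Average n_boot bootstrapped mixtures into one redundant mixture."""
    parts = []
    for _ in range(n_boot):
        # Assumption: classic bootstrap, i.e. sampling with replacement.
        sample = rng.choice(x, size=x.size, replace=True)
        parts.append(fit_gmm_1d(sample, k, rng))
    # Averaging the densities pools all Gaussians and divides weights by n_boot.
    weights = np.concatenate([p[0] for p in parts]) / n_boot
    means = np.concatenate([p[1] for p in parts])
    variances = np.concatenate([p[2] for p in parts])
    return weights, means, variances


def merge_pair(w, m, v, i, j):
    """Moment-matched merge of Gaussians i and j, plus its likelihood-loss cost."""
    ws = w[i] + w[j]
    mm = (w[i] * m[i] + w[j] * m[j]) / ws
    vm = (w[i] * (v[i] + (m[i] - mm) ** 2)
          + w[j] * (v[j] + (m[j] - mm) ** 2)) / ws
    cost = 0.5 * (ws * np.log(vm) - w[i] * np.log(v[i]) - w[j] * np.log(v[j]))
    return cost, ws, mm, vm


def restructure(weights, means, variances, target):
    """Greedily merge the cheapest pair until `target` Gaussians remain."""
    w, m, v = list(weights), list(means), list(variances)
    while len(w) > target:
        best = None
        for i in range(len(w)):
            for j in range(i + 1, len(w)):
                cand = merge_pair(w, m, v, i, j)
                if best is None or cand[0] < best[0][0]:
                    best = (cand, i, j)
        (_, ws, mm, vm), i, j = best
        # Store the merged Gaussian at index i and drop index j (j > i).
        for seq, val in ((w, ws), (m, mm), (v, vm)):
            seq[i] = val
            del seq[j]
    return np.array(w), np.array(m), np.array(v)


# Synthetic "sparse" data: 120 samples from two Gaussians.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 0.5, 60), rng.normal(3.0, 1.0, 60)])
big = bootstrap_aggregate(data, n_boot=5, k=3, rng=rng)  # 15 Gaussians
small = restructure(*big, target=3)                      # scaled down to 3
```

In this sketch the aggregated mixture is deliberately larger than the model one would train directly, and `target` plays the role of the desired cardinality. Because each merge is moment-matched, the restructured mixture keeps the overall mean of the aggregated one. The sketch includes no refinement step after clustering.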

Bibliographic reference. Cui, Xiaodong / Xue, Jian / Dognin, Pierre L. / Chaudhari, Upendra V. / Zhou, Bowen (2010): "Acoustic modeling with bootstrap and restructuring for low-resourced languages", in INTERSPEECH-2010, 2974-2977.