INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Acoustic Modeling with Bootstrap and Restructuring Based on Full Covariance

Xiaodong Cui (1), Xin Chen (2), Jian Xue (1), Peder A. Olsen (1), John R. Hershey (3), Bowen Zhou (1)

(1) IBM T.J. Watson Research Center, USA
(2) University of Missouri, USA
(3) MERL, USA

Bootstrap and restructuring (BSRS) has been shown in our previous work to outperform the conventional acoustic modeling approach for low-resourced languages. This paper presents a full covariance based BSRS scheme, which extends our previous work on diagonal covariance based BSRS acoustic modeling. Because full covariance provides richer structural information about the acoustic model than its diagonal counterpart, it is advantageous for both model clustering and refinement. In this work, full covariance is therefore employed in BSRS to retain the structural information until the last step, where it is converted to diagonal covariance for practical applications. We show that, at the same model size, using full covariance further improves performance over diagonal covariance in the BSRS acoustic modeling framework without increasing the computational cost of decoding.
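
The abstract does not say how the final conversion from full to diagonal covariance is carried out. As a minimal sketch of one plausible reading, the Python below assumes the simplest option: keep each Gaussian's mean and retain only the main diagonal of its covariance matrix, which is the diagonal Gaussian closest to the full one in the sense of KL divergence from the full-covariance model. The function names and the toy numbers are hypothetical and are not taken from the paper. The sketch also shows why decoding cost is unaffected: the final model is scored with a diagonal Gaussian, whose log-density is a sum over dimensions.

```python
import math

def diagonalize(full_cov):
    """Keep only the variances (main diagonal) of a full covariance matrix.

    Assumption: plain truncation of off-diagonal terms; the paper may use
    a different conversion.
    """
    return [full_cov[i][i] for i in range(len(full_cov))]

def diag_gaussian_logpdf(x, mean, variances):
    """Log-density of a diagonal-covariance Gaussian (linear in dimension)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - mi) ** 2 / v)
        for xi, mi, v in zip(x, mean, variances)
    )

# Toy 2-D Gaussian with correlated dimensions (made-up values).
mean = [0.0, 1.0]
full_cov = [[2.0, 0.6],
            [0.6, 1.0]]

variances = diagonalize(full_cov)          # off-diagonal 0.6 is discarded
score = diag_gaussian_logpdf([0.5, 0.5], mean, variances)
```

Under this assumption, the correlation information carried by the off-diagonal terms is available during clustering and refinement and is discarded only once, at the end.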


Bibliographic reference.  Cui, Xiaodong / Chen, Xin / Xue, Jian / Olsen, Peder A. / Hershey, John R. / Zhou, Bowen (2011): "Acoustic modeling with bootstrap and restructuring based on full covariance", In INTERSPEECH-2011, 1697-1700.