Third Workshop on Spoken Language Technologies for Under-resourced Languages
Cape Town, South Africa
Acoustic model parameter estimation is hampered by a lack of data. To reduce the number of parameters to be estimated, we propose sub-GMM modelling, which constrains the acoustic models to a low-dimensional manifold embedded in the space of Gaussian mixture weights. The manifold model is obtained through non-negative matrix factorization with sparsity constraints. Our preliminary monolingual experiments show that the proposed model is as efficient as clustering the distributions to a smaller set, while opening perspectives for a new parameter tying technique. In the example, the number of parameters to be estimated per distribution is reduced by more than an order of magnitude.
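The factorization idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the cost function or the update rules, so the code below assumes a generic KL-divergence NMF with multiplicative updates and an L1 penalty on the activations. The function name `sparse_nmf_kl`, the matrix sizes, the penalty weight and the synthetic data are all hypothetical. Each column of `V` holds the mixture weights of one distribution; after factorization each distribution is described by `rank` activations instead of one weight per Gaussian.

```python
import numpy as np

def sparse_nmf_kl(V, rank, sparsity=0.1, n_iter=200, seed=0, eps=1e-12):
    """Factorize V (n_gauss x n_dists) as B @ A with B, A >= 0.

    Assumed setup (not taken from the paper): KL-divergence multiplicative
    updates with an L1 penalty of weight `sparsity` on the activations A.
    Columns of B are kept normalized so that they remain weight vectors.
    """
    rng = np.random.default_rng(seed)
    n_gauss, n_dists = V.shape
    B = rng.random((n_gauss, rank)) + eps
    B /= B.sum(axis=0, keepdims=True)
    A = rng.random((rank, n_dists)) + eps
    for _ in range(n_iter):
        # Update activations; the penalty enters the denominator.
        R = V / (B @ A + eps)
        A *= (B.T @ R) / (B.sum(axis=0)[:, None] + sparsity)
        # Update bases.
        R = V / (B @ A + eps)
        B *= (R @ A.T) / (A.sum(axis=1)[None, :] + eps)
        # Renormalize basis columns and move the scale into A,
        # which leaves the product B @ A unchanged.
        scale = B.sum(axis=0) + eps
        B /= scale
        A *= scale[:, None]
    return B, A

# Hypothetical toy data: 200 distributions over 50 shared Gaussians,
# generated from 5 underlying weight patterns.
rng = np.random.default_rng(1)
n_gauss, n_dists, rank = 50, 200, 5
B_true = rng.dirichlet(np.ones(n_gauss), size=rank).T       # (50, 5)
A_true = rng.dirichlet(0.5 * np.ones(rank), size=n_dists).T  # (5, 200)
V = B_true @ A_true                                          # columns sum to 1

B, A = sparse_nmf_kl(V, rank=rank)
print("weights per distribution before:", n_gauss)
print("activations per distribution after:", rank)
```

In this toy setting each distribution needs 5 estimated activations rather than 50 mixture weights, with the basis `B` shared across all distributions. The sizes are chosen only for illustration and do not reproduce the paper's experiments.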
Index Terms: under-resourced languages, manifold, sparsity, non-negative matrix factorization, substructure
Bibliographic reference. Zhang, Xueru / Demuynck, Kris / Van Compernolle, Dirk / Van hamme, Hugo (2012): "Subspace-GMM acoustic models for under-resourced languages: feasibility study", In SLTU-2012, 1-4.