14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Parameter Clustering for Temporally Varying Weight Regression for Automatic Speech Recognition

Shilin Liu, Khe Chai Sim

National University of Singapore, Singapore

Recently, an implicit trajectory model using temporally varying weight regression (TVWR) was proposed and achieved promising gains under the maximum likelihood (ML) training criterion. In the original TVWR, each component weight is modelled as a constrained linear regression function of the monophone posterior feature. Because the posterior feature is high-dimensional, TVWR introduces many free parameters. Compared with a standard HMM system, this increase in system complexity can cause two problems: over-training and slow decoding. To avoid them, this paper proposes parameter clustering for TVWR, which estimates cluster-specific regression parameters instead of the original component-specific ones. Both knowledge-driven and data-driven approaches are introduced to define the clusters, and the re-estimation of the clustered regression parameters is also derived. Experiments are conducted on the clean data of the Aurora 4 corpus, and systems are evaluated on the Nov'92 5k closed-vocabulary recognition task. Results show that comparable performance can be obtained, and that decoding time improves by more than 20%, after a significant reduction of system complexity.
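The effect of clustering on the number of regression parameters can be illustrated with a minimal sketch. It is not the paper's formulation: the weight form (a static weight scaled by a non-negative linear function of the posterior, then renormalised), the function names, and all numeric values are assumptions chosen only for illustration.

```python
def regression_param_count(num_vectors, feature_dim):
    """Number of free regression parameters: one vector per component or cluster."""
    return num_vectors * feature_dim


def time_varying_weights(static_weights, reg_vectors, cluster_of, posterior):
    """Illustrative time-varying weights for a single frame.

    Assumed form (hypothetical, not taken from the paper): each static weight
    is scaled by a linear function of the posterior feature, using the
    regression vector of the cluster the component belongs to, and the result
    is renormalised to sum to one.
    """
    scores = []
    for m, c_m in enumerate(static_weights):
        a = reg_vectors[cluster_of[m]]  # shared, cluster-specific vector
        scores.append(c_m * sum(a_i * p_i for a_i, p_i in zip(a, posterior)))
    total = sum(scores)
    return [s / total for s in scores]


# Toy setup: 4 Gaussian components, 3-dimensional posterior, 2 clusters.
static_weights = [0.4, 0.3, 0.2, 0.1]
cluster_of = [0, 0, 1, 1]                            # component -> cluster
reg_vectors = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]     # one vector per cluster
posterior = [0.8, 0.1, 0.1]                          # posterior for one frame

weights = time_varying_weights(static_weights, reg_vectors, cluster_of, posterior)
unclustered = regression_param_count(len(static_weights), len(posterior))
clustered = regression_param_count(len(reg_vectors), len(posterior))
print(weights, unclustered, clustered)
```

In this toy setup, sharing one regression vector per cluster halves the regression parameter count (12 to 6), while the weights still vary from frame to frame with the posterior.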


Bibliographic reference. Liu, Shilin / Sim, Khe Chai (2013): "Parameter clustering for temporally varying weight regression for automatic speech recognition", in INTERSPEECH-2013, 1796-1800.