Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Maximal Rank Likelihood as an Optimization Function for Speech Recognition

Yuqing Gao, Yongxin Li, Michael Picheny

IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA

Research has shown that rank statistics derived from context-dependent state likelihoods can provide robust speech recognition. In previous work, empirical distributions were used to characterize the rank statistics. We present parametric models of the state rank and the rank likelihood and, based on them, a new objective function, Maximal Rank Likelihood (MRL), for estimating parameters in an HMM-based speech recognition system. The objective function optimizes the average logarithm of the rank likelihood of the training or adaptation data. Because it is a discriminative estimation process, it brings the training criterion closer to the decoding criterion. Three applications of MRL are discussed. The first is a Linear Discriminative Projection, which optimizes the objective function using all training data and projects feature vectors into a discriminative space of reduced dimension. The second and third applications are a feature space transformation and a model space transformation, respectively, both for adaptation. These transformations are optimized to maximize the rank likelihood of the adaptation data. The experimental results show that the MRL adaptation algorithms outperform MLLR adaptation.
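
As a rough illustration of the objective described in the abstract (the average logarithm of the rank likelihood), the following minimal sketch ranks the correct state among all state scores for each frame and averages a log rank likelihood over the frames. The abstract does not give the paper's parametric models, so the truncated geometric distribution over ranks, the `decay` parameter, and all function names here are assumptions made only for illustration, not the authors' formulation.

```python
import math
from typing import Sequence


def state_rank(scores: Sequence[float], correct: int) -> int:
    """Rank of the correct state: how many states score strictly higher (0 = top)."""
    target = scores[correct]
    return sum(1 for s in scores if s > target)


def log_rank_likelihood(rank: int, num_states: int, decay: float = 0.5) -> float:
    """Log pmf of a truncated geometric distribution over ranks 0..num_states-1.

    ASSUMPTION: this parametric form is a stand-in; the paper's actual
    rank-likelihood model is not stated in the abstract.
    """
    norm = (1.0 - decay) / (1.0 - decay ** num_states)
    return math.log(norm) + rank * math.log(decay)


def average_log_rank_likelihood(
    frames: Sequence[Sequence[float]],
    labels: Sequence[int],
    decay: float = 0.5,
) -> float:
    """Average log rank likelihood of the correct states over all frames."""
    if len(frames) != len(labels) or not labels:
        raise ValueError("frames and labels must be non-empty and equal in length")
    total = 0.0
    for scores, correct in zip(frames, labels):
        rank = state_rank(scores, correct)
        total += log_rank_likelihood(rank, len(scores), decay)
    return total / len(labels)


# Hypothetical per-frame state log-likelihood scores and correct-state labels.
frames = [[-1.0, -2.0, -3.0], [-2.5, -0.5, -1.5]]
labels = [0, 2]
objective = average_log_rank_likelihood(frames, labels)
```

Under this assumed model, parameters that push the correct state toward rank 0 raise the objective, which is the sense in which maximizing it ties training to the ranking used at decoding time.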



Bibliographic reference. Gao, Yuqing / Li, Yongxin / Picheny, Michael (2000): "Maximal rank likelihood as an optimization function for speech recognition", in ICSLP-2000, vol. 4, 125-128.