5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Nonreciprocal Data Sharing in Estimating HMM Parameters

Xiaoqiang Luo, Frederick Jelinek

CLSP, The Johns Hopkins University, USA

Parameter tying is often used in large vocabulary continuous speech recognition (LVCSR) systems to balance model resolution and generalizability. However, one consequence of tying is that the differences among tied constructs are ignored. Parameter tying can alternatively be viewed as reciprocal data sharing, in that a tied construct uses the data associated with all other constructs in its tied class. To capture the fine differences among tied constructs, we propose to use nonreciprocal data sharing (NRDS) when estimating HMM parameters. In particular, when estimating the Gaussian parameters for an HMM state, contributions from other acoustically similar HMM states are weighted, thus allowing different statistics to govern different states. The data sharing weights are optimized using cross-validation. It can be shown that the objective function for cross-validation is a sum of rational functions and can be efficiently optimized by the growth transform. Our results on Switchboard show that NRDS reduces the word error rate (WER) significantly compared with a state-of-the-art baseline system using HMM state tying.
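
The weighted sharing described in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes scalar features, pools per-state sufficient statistics linearly, and uses hand-picked weights, whereas the paper optimizes the weights by cross-validation. The names `sufficient_stats`, `nrds_gaussian`, and `weights` are hypothetical.

```python
# Illustrative sketch only (an assumption, not the paper's estimator):
# each state pools the sufficient statistics of other states with its
# own row of weights, so the sharing need not be symmetric.

def sufficient_stats(samples):
    """Return (count, sum of x, sum of x^2) for scalar samples."""
    return (len(samples), sum(samples), sum(x * x for x in samples))

def nrds_gaussian(stats, weights):
    """Estimate (mean, variance) per state from weighted pooled statistics.

    stats[j]      -- (count, sum_x, sum_x2) collected for state j
    weights[i][j] -- hypothetical weight with which state i uses the
                     data of state j; weights[i][j] may differ from
                     weights[j][i], which makes the sharing nonreciprocal
    """
    params = []
    for w_row in weights:
        n = sum(w * c for w, (c, _, _) in zip(w_row, stats))
        sx = sum(w * s for w, (_, s, _) in zip(w_row, stats))
        sxx = sum(w * q for w, (_, _, q) in zip(w_row, stats))
        mean = sx / n
        var = sxx / n - mean * mean
        params.append((mean, var))
    return params

stats = [sufficient_stats([0.0, 2.0]), sufficient_stats([4.0, 6.0])]
# State 0 borrows from state 1 at half weight; state 1 borrows nothing.
weights = [[1.0, 0.5],
           [0.0, 1.0]]
params = nrds_gaussian(stats, weights)
```

With all weights inside a tied class set to 1, every row pools the same statistics and the sketch reduces to ordinary tying, where the states share identical parameters. Asymmetric weights let each state keep its own estimate.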

Bibliographic reference. Luo, Xiaoqiang / Jelinek, Frederick (1998): "Nonreciprocal data sharing in estimating HMM parameters", in ICSLP-1998, paper 0365.