5th International Conference on Spoken Language Processing
In this paper, we evaluate the performance of model adaptation by the previously proposed HMM decomposition method on telephone speech recognition. The HMM decomposition method separates a composed HMM into a known phoneme HMM and an unknown noise and channel HMM by maximum likelihood (ML) estimation of the HMM parameters. A transfer function (telephone channel) HMM is estimated from adaptation speech data by applying the HMM decomposition twice: once in the linear spectral domain for the noise, and once in the cepstral domain for the channel. The telephone speech data used for evaluation were recorded through 10 kinds of ordinary analog and cordless telephone handsets. The test results show that the average phrase accuracy with the clean speech HMMs is 60.9% for the ordinary analog telephone handsets and 19.6% for the cordless telephone handsets. With the HMM decomposition method, the average phrase accuracy improves to 78.1% for the ordinary analog telephone handsets and to 50.5% for the cordless telephone handsets.
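The choice of two domains can be illustrated with a minimal sketch. It is not the authors' implementation: it works on fixed, made-up power spectra rather than on HMM parameters estimated by ML, and it assumes the common signal model observed = speech × channel + noise per frequency bin, which the abstract does not state explicitly. Under that assumption the noise is additive in the linear spectral domain and the channel is additive in the log-spectral domain (the cepstrum is a linear transform of the log spectrum, so the same holds there). The function name `estimate_channel` and all values are hypothetical.

```python
import math


def estimate_channel(observed, speech, noise):
    """Recover a per-bin channel gain in two steps (illustrative only).

    Assumes observed[k] = speech[k] * channel[k] + noise[k] for each bin k.
    """
    # Step 1, linear spectral domain: noise is additive, so subtract it.
    denoised = [o - n for o, n in zip(observed, noise)]

    # Step 2, log-spectral domain: the channel is additive there,
    # so subtract the log of the known speech spectrum.
    log_channel = [math.log(d) - math.log(s) for d, s in zip(denoised, speech)]

    # Map back to a linear gain for easy inspection.
    return [math.exp(c) for c in log_channel]


# Hypothetical 4-bin power spectra; the numbers are made up.
speech = [4.0, 2.0, 1.0, 0.5]
channel = [0.5, 0.8, 1.0, 0.25]
noise = [0.1, 0.1, 0.2, 0.1]

# Build the observation from the assumed signal model.
observed = [s * h + n for s, h, n in zip(speech, channel, noise)]

channel_est = estimate_channel(observed, speech, noise)
```

In this toy setting `channel_est` matches `channel` up to floating-point error, because the speech and noise spectra are known exactly. In the method described above these quantities are HMM distributions, and the unknown components are estimated by ML rather than by direct subtraction.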
Bibliographic reference. Takiguchi, Tetsuya / Nakamura, Satoshi / Shikano, Kiyohiro / Morishima, Masatoshi / Isobe, Toshihiro (1998): "Evaluation of model adaptation by HMM decomposition on telephone speech recognition", In ICSLP-1998, paper 0698.