In this paper, we introduce HMM-state sequence confusion characteristics as prior knowledge into the MLLR framework, in order to relax the transformation and reduce the risk of over-training when the amount of adaptation data is small. Two issues must be addressed: first, how to estimate such confusion information reliably; second, how to use this information to refine the estimation of the MLLR adaptation. We use pronunciation modeling techniques to build a state sequence confusion table, and then calculate the correlation between states from this table. The proposed algorithm uses these correlations to relax the MLLR adaptation process when very little adaptation data is available. Our experiment on a Mandarin state-tied triphone toneless LVCSR system showed an error rate reduction of 9.5% over standard MLLR with about 10 utterances (less than 30 seconds) of adaptation data.
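The abstract does not specify how state correlations are derived from the confusion table. As a minimal illustrative sketch only, assuming the table holds raw confusion counts and that correlation is taken as the row-normalized count (both are assumptions, not the paper's stated method; the function name `state_correlation` and the toy counts are hypothetical), the step could look like this:

```python
def state_correlation(confusion_counts):
    """Row-normalize a square table of state confusion counts.

    Assumption: confusion_counts[i][j] is how often reference state i
    was recognized as state j. The result is one plausible notion of
    "correlation", not necessarily the one used in the paper.
    """
    correlation = []
    for row in confusion_counts:
        total = sum(row)
        if total == 0:
            # State never observed: no evidence, left as all zeros here.
            correlation.append([0.0] * len(row))
        else:
            correlation.append([count / total for count in row])
    return correlation


# Toy 3-state table (made-up counts, for illustration only).
counts = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 0],
]
corr = state_correlation(counts)
```

Each non-empty row of `corr` sums to one, so an entry can be read as the share of a state's observations that were confused with another state.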
Cite as: Zhao, B., Xu, B. (2000) Incorporating HMM-state sequence confusion for rapid MLLR adaptation to new speakers. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 690-693, doi: 10.21437/ICSLP.2000-629
@inproceedings{zhao00d_icslp, author={Bing Zhao and Bo Xu}, title={{Incorporating HMM-state sequence confusion for rapid MLLR adaptation to new speakers}}, year=2000, booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)}, pages={vol. 3, 690-693}, doi={10.21437/ICSLP.2000-629} }