Gaussian mixture models are the most popular probability density models used in automatic speech recognition. During decoding, many Gaussians are often evaluated, yet only a small number of them contribute significantly to the probability. Several promising methods for selecting the relevant Gaussians are known. These methods differ in required memory, computational overhead, and the quality of the selected Gaussians. Projection search, bucket box intersection, and Gaussian clustering are investigated in a broadcast news system, with a focus on adaptation (MLLR).
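To make the idea of Gaussian selection concrete, the following is a minimal sketch of a clustering-style shortlist, not the configuration studied in the paper. It assumes diagonal covariances and a precomputed assignment of Gaussians to cluster centroids. All names (`centroids`, `shortlists`, `selected_loglik`) and the toy parameters are hypothetical and chosen only for illustration.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def logsumexp(values):
    """Numerically stable log of a sum of exponentials."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def full_loglik(x, weights, means, variances):
    """Mixture log-likelihood using every Gaussian."""
    return logsumexp([
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ])

def nearest_centroid(x, centroids):
    """Index of the centroid with the smallest squared Euclidean distance to x."""
    dists = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centroids]
    return dists.index(min(dists))

def selected_loglik(x, weights, means, variances, centroids, shortlists):
    """Mixture log-likelihood using only the shortlist of the nearest cluster."""
    shortlist = shortlists[nearest_centroid(x, centroids)]
    return logsumexp([
        math.log(weights[i]) + log_gauss_diag(x, means[i], variances[i])
        for i in shortlist
    ])

# Toy mixture (assumed values): two Gaussians near the origin, two far away.
means = [[0.0, 0.0], [0.5, 0.5], [10.0, 10.0], [10.5, 9.5]]
variances = [[1.0, 1.0] for _ in means]
weights = [0.25, 0.25, 0.25, 0.25]

# Hypothetical precomputed clustering: one centroid and one shortlist per cluster.
centroids = [[0.25, 0.25], [10.25, 9.75]]
shortlists = [[0, 1], [2, 3]]

x = [0.2, 0.1]
full = full_loglik(x, weights, means, variances)
approx = selected_loglik(x, weights, means, variances, centroids, shortlists)
print(f"full={full:.6f} selected={approx:.6f}")
```

In this toy case the shortlist evaluates half of the Gaussians, and the two distant components it skips contribute a negligible amount to the sum. The memory cost of such a scheme lies in storing the centroids and shortlists, and the overhead lies in the nearest-centroid search, which are the kinds of trade-offs the abstract refers to.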
Cite as: Gehrig, D., Schaaf, T. (2006) A comparative study of Gaussian selection methods in large vocabulary continuous speech recognition. Proc. Interspeech 2006, paper 1954-Mon3BuP.10, doi: 10.21437/Interspeech.2006-225
@inproceedings{gehrig06_interspeech,
  author={Dirk Gehrig and Thomas Schaaf},
  title={{A comparative study of Gaussian selection methods in large vocabulary continuous speech recognition}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1954-Mon3BuP.10},
  doi={10.21437/Interspeech.2006-225}
}