In this paper we present a highly optimized implementation of the Gaussian mixture acoustic model evaluation algorithm. Evaluation of the acoustic likelihoods is one of the most computationally intensive parts of automatic speech recognizers, but it parallelizes well and can be offloaded to GPU devices. Our approach offers a significant speed-up over recently published approaches because it exploits the GPU architecture better. All recent implementations were programmed in either the CUDA or the OpenCL GPU programming framework. We present results for both CUDA and OpenCL.
The results suggest that even very large acoustic models can be used in real-time speech recognition engines on computers and laptops equipped with a low-end GPU. Optimizing the computation of acoustic likelihoods on the GPU frees the remaining GPU resources for offloading other compute-intensive parts of an LVCSR (large-vocabulary continuous speech recognition) decoder. Another possible use of the freed GPU resources is to evaluate several acoustic models at the same time and apply fusion or model selection techniques to improve the quality of the resulting conditional likelihoods under diverse conditions.
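To make the computation discussed above concrete, the following is a minimal CPU reference sketch of Gaussian mixture likelihood evaluation in Python. It is not the GPU kernel described in the paper. It assumes diagonal covariance matrices and log-domain scoring, which are common in acoustic modeling but are not stated in this abstract, and all function and parameter names are hypothetical. It illustrates why the task parallelizes well: every (frame, state) score is independent of the others.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x.

    Diagonal covariance is an assumption made for this sketch.
    """
    acc = 0.0
    for xi, mi, vi in zip(x, mean, var):
        acc += math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi
    return -0.5 * acc

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of x under one mixture, using log-sum-exp for stability."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def evaluate_frames(frames, states):
    """Score every (frame, state) pair.

    `states` is a list of (weights, means, variances) tuples, one per
    acoustic state. No pair depends on any other, so on a GPU each pair
    (or each mixture component) could be assigned to its own thread.
    """
    return [[gmm_log_likelihood(x, *gmm) for gmm in states] for x in frames]
```

For example, `evaluate_frames([[0.0], [1.0]], [([1.0], [[0.0]], [[1.0]])])` returns one row per frame and one column per state, each entry being a log-likelihood.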
Bibliographic reference. Vaněk, Jan / Trmal, Jan / Psutka, Josef V. / Psutka, Josef (2011): "Optimization of the Gaussian mixture model evaluation on GPU", In INTERSPEECH-2011, 1737-1740.