## Third International Conference on Spoken Language Processing (ICSLP 94)
## Yokohama, Japan

Log-likelihood computation is a computationally very expensive task in speech recognition systems that use continuous mixture density HMMs. In the Philips large-vocabulary continuous-speech recognition system, it consumes 50%-75% of the computing resources. In our system, the log-likelihood computation amounts to a nearest-neighbor search, i.e., a search for the component density of a mixture density whose mean vector has minimal distance to the observed feature vector. Fast nearest-neighbor search techniques based on the triangle inequality are very powerful if the dimension of the feature space is lower than about 10. However, a direct application of these techniques is prohibitive in our framework, which is characterized by a high-dimensional feature space and a small number of component densities per mixture density. In a typical setup we have 120 component densities per mixture density and a dimension of 63. This paper

- introduces an efficient nearest-neighbor search algorithm adapted to the conditions of a high-dimensional feature space and sparse data,
- gives a theoretical explanation and an experimental validation for the constraints of fast nearest-neighbor search techniques in a high-dimensional space.
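
As an illustration of the generic triangle-inequality technique referred to in the abstract (not the algorithm introduced in the paper), the following minimal sketch compares an exhaustive nearest-neighbor search over the mean vectors with a pruned search. It assumes a Euclidean distance and randomly generated mean vectors; the function names and the pruning rule shown here are illustrative assumptions, and only the setup of 120 component densities in 63 dimensions is taken from the abstract.

```python
import math
import random


def euclidean(x, y):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def nearest_brute_force(x, means):
    """Exhaustive search: return (index, distance) of the mean closest to x."""
    best_k, best_d = 0, euclidean(x, means[0])
    for k in range(1, len(means)):
        d = euclidean(x, means[k])
        if d < best_d:
            best_k, best_d = k, d
    return best_k, best_d


def precompute_mean_distances(means):
    """Pairwise distances between all mean vectors, computed once offline."""
    return [[euclidean(mi, mj) for mj in means] for mi in means]


def nearest_triangle(x, means, mean_dist):
    """Search with triangle-inequality pruning.

    If d(m_best, m_k) >= 2 * d(x, m_best), the triangle inequality gives
    d(x, m_k) >= d(m_best, m_k) - d(x, m_best) >= d(x, m_best),
    so candidate k cannot improve on the current best and is skipped.
    Returns (index, distance, number of distances actually evaluated).
    """
    best_k, best_d = 0, euclidean(x, means[0])
    evaluated = 1
    for k in range(1, len(means)):
        if mean_dist[best_k][k] >= 2.0 * best_d:
            continue  # pruned without computing d(x, m_k)
        d = euclidean(x, means[k])
        evaluated += 1
        if d < best_d:
            best_k, best_d = k, d
    return best_k, best_d, evaluated


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data; sizes follow the typical setup in the abstract.
    for dim in (4, 63):
        means = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(120)]
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        table = precompute_mean_distances(means)
        k, d, n_eval = nearest_triangle(x, means, table)
        print(f"dim={dim}: nearest={k}, distances evaluated={n_eval} of 120")
```

Running the script for a low and a high dimension lets one observe how many distance evaluations the pruning rule actually saves in each case; the counts depend on the random data and are not results from the paper.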

__Bibliographic reference.__
Beyerlein, Peter (1994):
"Fast log-likelihood computation for mixture densities in a high-dimensional feature space",
In *ICSLP-1994*, 271-274.