EUROSPEECH 2003 - INTERSPEECH 2003
Better statistical language models can reduce the error rate of speech recognition. These models can be improved by grouping words into word equivalence classes, and clustering algorithms can perform this grouping automatically. We present one incremental clustering algorithm and two iterative clustering algorithms, and compare them with previous algorithms. The experimental results show that the two iterative algorithms perform as well as previous ones. One of them, which uses the leaving-one-out technique, can automatically determine the optimum number of classes. The incremental algorithm, in turn, makes use of these iterative algorithms. It achieves the best results of all the algorithms compared, behaves the most regularly as the number of classes varies, and can also determine the optimum number of classes automatically.
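As an illustration of what iterative word clustering for a class-based language model can look like, the following is a minimal sketch and not the algorithms of this paper. It assumes a class bigram model, p(w_i | w_{i-1}) = p(w_i | c(w_i)) p(c(w_i) | c(w_{i-1})), a plain maximum-likelihood criterion, and a greedy exchange procedure that moves one word at a time to the class that most increases the training log-likelihood. The function names `class_bigram_loglik` and `exchange_clustering` and the parameters `num_classes` and `max_iters` are hypothetical. Because the criterion here is maximum likelihood rather than leaving-one-out, the number of classes has to be fixed in advance.

```python
import math
from collections import Counter


def class_bigram_loglik(tokens, word2class):
    """Training log-likelihood under a maximum-likelihood class bigram model.

    Assumed model: p(w_i | w_{i-1}) = p(w_i | c(w_i)) * p(c(w_i) | c(w_{i-1})).
    The first token is used only as context, never predicted.
    """
    classes = [word2class[w] for w in tokens]
    pair = Counter(zip(classes, classes[1:]))  # N(c_prev, c)
    prev = Counter(classes[:-1])               # times a class occurs as context
    word = Counter(tokens[1:])                 # N(w) over predicted positions
    cls = Counter(classes[1:])                 # N(c) over predicted positions
    ll = sum(n * math.log(n / prev[a]) for (a, _), n in pair.items())
    ll += sum(n * math.log(n / cls[word2class[w]]) for w, n in word.items())
    return ll


def exchange_clustering(tokens, num_classes, max_iters=20):
    """Greedy exchange: move each word to the class that raises the likelihood."""
    vocab = sorted(set(tokens))
    # Arbitrary deterministic starting point (an assumption of this sketch).
    word2class = {w: i % num_classes for i, w in enumerate(vocab)}
    best = class_bigram_loglik(tokens, word2class)
    for _ in range(max_iters):
        moved = False
        for w in vocab:
            home = word2class[w]
            for c in range(num_classes):
                if c == home:
                    continue
                word2class[w] = c  # tentative move
                ll = class_bigram_loglik(tokens, word2class)
                if ll > best + 1e-12:
                    best, home, moved = ll, c, True
            word2class[w] = home  # keep the best class found for w
        if not moved:  # a full sweep without improvement: local optimum
            break
    return word2class, best


if __name__ == "__main__":
    corpus = "the cat sat on the mat the dog sat on the rug".split()
    mapping, loglik = exchange_clustering(corpus, num_classes=3)
    print(mapping, loglik)
```

This sketch recomputes the full likelihood for every tentative move, which is only practical for toy corpora; a realistic implementation would update the counts incrementally.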
Bibliographic reference. Barrachina, Sergio / Vilar, Juan Miguel (2003): "Incremental and iterative monolingual clustering algorithms", In EUROSPEECH-2003, 241-244.