Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Fast Very Large Vocabulary Recognition Based on Compact DAWG-Structured Language Models

Kalirroi Georgila, Kyriakos Sgarbas, Nikos Fanotakis, George Kokkinakis

Wire Communications Lab, University of Patras, Greece

In this paper we present a method for building compact lattices for very large vocabularies, which has been applied to surname recognition in an interactive telephone-based Directory Assistance Services system. The method involves the construction of a non-deterministic directed acyclic word graph (DAWG), which is eventually transformed into a phoneme lattice in Entropic's HTK Application Programming Interface (HAPI) format. Incremental construction functions are used to create and update the DAWG, and an algorithm for converting the DAWG into the HAPI format is presented. Furthermore, trees, graphs, and full-forms (whole words with no merging of nodes) are compared in a straightforward way under the same conditions, using the same decoder (HAPI MVX) and the same vocabularies. Experimental results showed that moving from full-form lexicons to trees and then to graphs reduces the size of the recognition network and, with it, the recognition time. Recognition accuracy is nevertheless retained, since the same phoneme combinations are involved.
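
The size reduction from full-forms to trees to graphs can be illustrated with a toy sketch. It is not the paper's algorithm: the paper builds a non-deterministic DAWG incrementally, whereas this sketch builds a prefix tree and then merges identical subtrees in one pass, which yields a deterministic graph. The word list, the "#" end-of-word marker, and all function names are assumptions made for illustration, and letters stand in for phonemes.

```python
END = "#"  # hypothetical end-of-word marker, assumed not to occur as a symbol

def build_trie(words):
    """Prefix tree as nested dicts: shared prefixes are stored once."""
    root = {}
    for word in words:
        node = root
        for symbol in word:
            node = node.setdefault(symbol, {})
        node[END] = {}
    return root

def count_full_form_arcs(words):
    """No merging at all: every word is its own chain of arcs plus END."""
    return sum(len(word) + 1 for word in words)

def count_trie_arcs(node):
    """Each child contributes one arc plus the arcs below it."""
    return sum(1 + count_trie_arcs(child) for child in node.values())

def count_dawg_arcs(root):
    """Merge subtrees with identical outgoing structure, then count arcs.

    Two nodes are treated as the same state when they have the same
    labelled arcs leading to the same (already merged) states.
    """
    registry = {}  # state signature -> state id

    def canonical(node):
        signature = tuple(sorted(
            (symbol, canonical(child)) for symbol, child in node.items()
        ))
        if signature not in registry:
            registry[signature] = len(registry)
        return registry[signature]

    canonical(root)
    # Each distinct signature lists the outgoing arcs of one merged state.
    return sum(len(signature) for signature in registry)

words = ["tap", "taps", "top", "tops"]  # toy vocabulary, not from the paper
trie = build_trie(words)
full_form_arcs = count_full_form_arcs(words)
trie_arcs = count_trie_arcs(trie)
dawg_arcs = count_dawg_arcs(trie)
print(full_form_arcs, trie_arcs, dawg_arcs)
```

On this toy vocabulary the arc count falls at each step, because the tree shares the common prefix and the graph additionally shares the common suffixes. All three structures accept exactly the same symbol sequences, which is consistent with the observation that accuracy is retained.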


Bibliographic reference.  Georgila, Kalirroi / Sgarbas, Kyriakos / Fanotakis, Nikos / Kokkinakis, George (2000): "Fast very large vocabulary recognition based on compact DAWG-structured language models", In ICSLP-2000, vol.2, 987-990.