Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Nparse - A Shallow N-Gram-Based Grammatical-Phrase Parser

Alice Carlberger

Centre for Speech Technology, Department of Speech, Music and Hearing, KTH, Stockholm, Sweden

Nparse is a shallow probabilistic unification-based parser for N-best list resorting and the finding of simple grammatical phrases. It is data-driven and robust, allowing both domain-specific and unrestricted-language training. We believe it can be an interesting alternative for use in a synthesis or recognition front end. This parser has been trained for Swedish on a fine-grained set of grammatical-phrase nodes and grammatical features and evaluated on three language domains. A tree bank database has been built and a detailed linguistic assessment performed. Later, these results will be compared with evaluation on a simplified node-and-feature system. Our aim is to find the optimal system complexity for accurately establishing phrase boundaries and phrase types in newspaper text and, ultimately, unrestricted language. For this, a combination of iterative manual training and unsupervised training will be used.


Bibliographic reference.  Carlberger, Alice (1999): "Nparse - a shallow n-gram-based grammatical-phrase parser", In EUROSPEECH'99, 2067-2070.