Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Learning of Stochastic Context-Free Grammars by Means of Estimation Algorithms

Joan-Andreu Sánchez, José-Miguel Benedí

Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Valencia, Spain

The Inside-Outside (IO) algorithm estimates the probability distributions of Stochastic Context-Free Grammars using all the derivations in the learning process. However, its application to real Language Modeling tasks is limited by its time complexity per iteration and by the large number of iterations it needs to converge. Alternatively, several estimation algorithms that consider only a certain subset of derivations in the estimation process have been proposed elsewhere. This subset can be chosen according to structural criteria or by selecting the k-best derivations. These alternatives are studied in this paper and tested on the Wall Street Journal corpus processed in the Penn Treebank project.
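To illustrate the distinction the abstract draws between using all derivations and using only a subset, the following minimal sketch computes two quantities for one sentence under a toy grammar in Chomsky Normal Form: the inside probability, which sums over every derivation, and the score of the single best derivation, which corresponds to the k-best case with k = 1. This is not the authors' implementation and performs no re-estimation. The grammar, its probabilities, and the function name `inside_and_best` are assumptions made up for this example.

```python
from collections import defaultdict


def inside_and_best(words, binary, lexical, start="S"):
    """Return (inside probability, best-derivation probability) for `words`.

    binary:  {(A, B, C): p} for rules A -> B C
    lexical: {(A, w): p}    for rules A -> w
    Both charts are indexed by (i, j, A): nonterminal A spanning words[i:j].
    """
    n = len(words)
    inside = defaultdict(float)  # sum over all derivations
    best = defaultdict(float)    # max over derivations (k-best with k = 1)

    # Spans of width 1 come from the lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                inside[i, i + 1, a] += p
                best[i, i + 1, a] = max(best[i, i + 1, a], p)

    # Wider spans combine two adjacent sub-spans through a binary rule.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for (a, b, c), p in binary.items():
                for k in range(i + 1, j):
                    inside[i, j, a] += p * inside[i, k, b] * inside[k, j, c]
                    best[i, j, a] = max(
                        best[i, j, a], p * best[i, k, b] * best[k, j, c]
                    )

    return inside[0, n, start], best[0, n, start]


# Hypothetical ambiguous toy grammar: S -> S S (0.4) | a (0.6).
binary = {("S", "S", "S"): 0.4}
lexical = {("S", "a"): 0.6}

total, best = inside_and_best(["a", "a", "a"], binary, lexical)
print(f"{total:.5f} {best:.5f}")  # prints: 0.06912 0.03456
```

The string "a a a" has two derivations under this toy grammar, each with probability 0.4^2 * 0.6^3, so the inside probability is exactly twice the best-derivation score. Both charts are filled in the same cubic-time pass here; the per-iteration cost and slow convergence mentioned in the abstract concern the full re-estimation procedure, which this sketch does not cover.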

Bibliographic reference.  Sánchez, Joan-Andreu / Benedí, José-Miguel (1999): "Learning of stochastic context-free grammars by means of estimation algorithms", In EUROSPEECH'99, 1799-1802.