Sixth European Conference on Speech Communication and Technology

The Inside-Outside (IO) algorithm estimates the probability distributions of Stochastic Context-Free Grammars using all the derivations of each sentence in the learning process. However, its application to real Language Modeling tasks is restricted by its time complexity per iteration and by the large number of iterations it needs to converge. Alternatively, several estimation algorithms that consider only a certain subset of derivations in the estimation process have been proposed elsewhere. This set of derivations can be chosen according to structural criteria, or by selecting the k-best derivations. These alternatives are studied in this paper, and they are tested on the Wall Street Journal corpus processed in the Penn Treebank project.
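As a rough illustration of the difference between using all derivations and using only the k-best ones, the following minimal sketch performs one reestimation step over explicitly enumerated derivations. It is not the authors' implementation: the function names (`derivation_prob`, `reestimate`), the toy grammar, and the representation of a derivation as a list of `(lhs, rhs)` rules are assumptions made for this example, and a practical system would compute the expected counts with a chart parser rather than by enumeration.

```python
from collections import defaultdict
from math import prod


def derivation_prob(derivation, probs):
    """Probability of a derivation: the product of its rule probabilities."""
    return prod(probs[rule] for rule in derivation)


def reestimate(corpus, probs, k=None):
    """One reestimation step (illustrative sketch).

    corpus: one entry per sentence, each a list of derivations;
            a derivation is a list of rules (lhs, rhs).
    probs:  dict mapping each rule to its current probability.
    k:      None uses every listed derivation (as Inside-Outside does);
            an integer keeps only the k most probable derivations.
    """
    rule_counts = defaultdict(float)
    for derivations in corpus:
        scored = sorted(
            ((derivation_prob(d, probs), d) for d in derivations),
            key=lambda pair: pair[0],
            reverse=True,
        )
        if k is not None:
            scored = scored[:k]
        total = sum(p for p, _ in scored)
        if total == 0.0:
            continue  # sentence has no usable derivation under this model
        for p, d in scored:
            for rule in d:
                # Each rule use is weighted by the derivation's share of
                # the probability mass of the selected derivations.
                rule_counts[rule] += p / total

    lhs_totals = defaultdict(float)
    for (lhs, _), count in rule_counts.items():
        lhs_totals[lhs] += count

    new_probs = {}
    for rule, old in probs.items():
        lhs = rule[0]
        if lhs_totals[lhs] > 0.0:
            new_probs[rule] = rule_counts[rule] / lhs_totals[lhs]
        else:
            new_probs[rule] = old  # left-hand side unseen: keep old value
    return new_probs


# Hypothetical toy grammar in which the sentence "a" is ambiguous.
S_A, S_B = ("S", ("A",)), ("S", ("B",))
A_a, B_a = ("A", ("a",)), ("B", ("a",))
probs = {S_A: 0.6, S_B: 0.4, A_a: 1.0, B_a: 1.0}
corpus = [[[S_A, A_a], [S_B, B_a]]]

all_derivations = reestimate(corpus, probs)    # every derivation contributes
best_only = reestimate(corpus, probs, k=1)     # only the best derivation
```

With `k=1` all the probability mass for `S` moves to the rule used in the single best derivation, whereas using every derivation spreads the counts in proportion to the derivation probabilities. This is the trade-off the abstract refers to: fewer derivations cost less per iteration but discard part of the evidence.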
Bibliographic reference. Sánchez, Joan-Andreu / Benedí, José-Miguel (1999): "Learning of stochastic context-free grammars by means of estimation algorithms", In EUROSPEECH'99, 1799-1802.