ISCA Archive ICSLP 1994

Computer assisted grammar construction

H.-H. Shih, Steve J. Young

This paper presents a system for computer assisted grammar construction (CAGC) and its application in speech processing. The CAGC system is designed to infer linguistically motivated, broad-coverage stochastic context-free grammars (SCFGs) for large corpora without requiring significant manual contributions. Our approach uses an extended inside-outside learning algorithm [1] to train a hybrid SCFG [2] from a bracketed training set. The bracketing information is derived by an automatic surface bracketing system (AUTO) specifically designed for this purpose [3]. Experimental results, evaluated using the Parseval metrics [4], demonstrate that the CAGC system is capable of inferring a grammar from a subset of the Wall Street Journal (WSJ) tagged text corpus and that the inferred grammar achieves high coverage and good precision. As an application, the inferred grammar acts as a language model for rescoring N-best outputs from a speech recognizer [5].
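For readers unfamiliar with bracket-based parser evaluation, the following is a minimal sketch of how Parseval-style scores are commonly computed. It is not the authors' evaluation code: the function name `parseval_scores`, the representation of a parse as a set of `(start, end)` word spans, and the choice of unlabeled brackets are all assumptions made for illustration, and the abstract does not state the exact scoring configuration used in the paper.

```python
def parseval_scores(gold_spans, predicted_spans):
    """Unlabeled bracket precision, recall, and crossing-bracket count.

    Both arguments are iterables of (start, end) word-index pairs with
    `end` exclusive. This span representation is an assumption of the
    sketch, not something specified by the paper.
    """
    gold = set(gold_spans)
    predicted = set(predicted_spans)

    # A predicted bracket is correct if the same span occurs in the gold parse.
    matched = len(gold & predicted)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0

    # A predicted bracket "crosses" a gold bracket when the two overlap
    # without either one containing the other.
    crossing = sum(
        1
        for (ps, pe) in predicted
        if any(ps < gs < pe < ge or gs < ps < ge < pe for (gs, ge) in gold)
    )
    return precision, recall, crossing


# Hypothetical five-word sentence: the parses differ in one bracket.
gold = [(0, 5), (0, 2), (2, 5), (3, 5)]
predicted = [(0, 5), (0, 2), (2, 5), (2, 4)]
precision, recall, crossing = parseval_scores(gold, predicted)
```

In this hypothetical example three of the four predicted brackets match the gold parse, and the remaining bracket `(2, 4)` overlaps the gold bracket `(3, 5)` without nesting, so it counts as a crossing bracket.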

Cite as: Shih, H.-H., Young, S.J. (1994) Computer assisted grammar construction. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 855-858

  author={H.-H. Shih and Steve J. Young},
  title={{Computer assisted grammar construction}},
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},