ISCA Archive ISCSLP 2004

Exploiting Syntactic, Semantic and Lexical Regularities in Language Modeling Via Directed Markov Random Fields

Shaojun Wang, Shaomin Wang, Russell Greiner, Dale Schuurmans, Li Cheng

We present a directed Markov random field (MRF) model that combines n-gram models, probabilistic context-free grammars (PCFGs) and probabilistic latent semantic analysis (PLSA) for the purpose of statistical language modeling. The composite directed MRF model has a potentially exponential number of loops and becomes a context-sensitive grammar; nevertheless, we are able to estimate its parameters in cubic time using an efficient modified EM method, the generalized inside-outside algorithm, which extends the inside-outside algorithm to incorporate the effects of the n-gram and PLSA language models.
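
For orientation, the cubic-time claim can be related to the classical inside pass for a PCFG in Chomsky normal form, which the generalized algorithm is said to extend. The following is a minimal sketch of that classical pass only, not the paper's generalized inside-outside algorithm; the function name `inside_probability`, the dictionary-based grammar encoding, and the toy grammar in the usage note are assumptions made for illustration.

```python
from collections import defaultdict


def inside_probability(words, lexical, binary, start="S"):
    """Classical inside pass for a PCFG in Chomsky normal form.

    lexical: dict mapping (A, word) -> P(A -> word)   (hypothetical encoding)
    binary:  dict mapping (A, B, C) -> P(A -> B C)    (hypothetical encoding)
    Returns the total probability of `words` summed over all parses.
    """
    n = len(words)
    # beta[(i, j)][A] = probability that nonterminal A derives words[i:j]
    beta = defaultdict(lambda: defaultdict(float))

    # Base case: spans of length 1 are covered by lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                beta[(i, i + 1)][a] += p

    # Recursive case: three nested loops over (span, start, split point)
    # give the cubic dependence on sentence length.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                left = beta[(i, k)]
                right = beta[(k, j)]
                for (a, b, c), p in binary.items():
                    if b in left and c in right:
                        beta[(i, j)][a] += p * left[b] * right[c]

    return beta[(0, n)][start]
```

With the toy grammar `S -> S S` (probability 0.5) and `S -> a` (probability 0.5), calling `inside_probability(["a", "a", "a"], {("S", "a"): 0.5}, {("S", "S", "S"): 0.5})` sums over both binary bracketings of the three-word string. The model in the abstract additionally conditions on n-gram and PLSA components, which this sketch does not attempt to reproduce.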


Cite as: Wang, S., Wang, S., Greiner, R., Schuurmans, D., Cheng, L. (2004) Exploiting Syntactic, Semantic and Lexical Regularities in Language Modeling Via Directed Markov Random Fields. Proc. International Symposium on Chinese Spoken Language Processing, 305-308

@inproceedings{wang04e_iscslp,
  author={Shaojun Wang and Shaomin Wang and Russell Greiner and Dale Schuurmans and Li Cheng},
  title={{Exploiting Syntactic, Semantic and Lexical Regularities in Language Modeling Via Directed Markov Random Fields}},
  year=2004,
  booktitle={Proc. International Symposium on Chinese Spoken Language Processing},
  pages={305--308}
}