ISCA Archive ICSLP 1998

A language modeling based on a hierarchical approach: m_n^v

Imed Zitouni

In contrast to conventional n-gram approaches, which are the most widely used language models in continuous speech recognition systems, the multigram approach models a stream of variable-length sequences. To overcome the independence assumption of the classical multigram, we propose in this paper a hierarchical model that successively relaxes this assumption. We call this model M_n^v. The estimation of the model parameters can be formulated as a Maximum Likelihood estimation problem from incomplete data at different levels (j in 1...v). We show that estimates of the model parameters can be computed through an iterative Expectation-Maximization algorithm. A few experimental tests were carried out on a corpus extracted from the French newspaper "Le Monde". Results show that M_n^v outperforms the classical multigram and the interpolated bigram, and is comparable to the interpolated trigram model.
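As a rough illustration of the independence assumption mentioned in the abstract, the sketch below computes the likelihood of a word string under a classical multigram, assuming the standard formulation in which the likelihood is the sum, over all segmentations into sequences of at most n words, of the product of independent sequence probabilities. The function name `multigram_likelihood`, the `seq_prob` table, and the toy probabilities are hypothetical and are not taken from the paper; the hierarchical M_n^v model and its EM training are not reproduced here.

```python
def multigram_likelihood(words, seq_prob, n):
    """Likelihood of `words` under a classical multigram (illustrative sketch).

    words    : list of word strings
    seq_prob : dict mapping a tuple of words to its probability (hypothetical table)
    n        : maximum sequence length
    """
    # alpha[t] = total probability of all segmentations of the first t words
    alpha = [0.0] * (len(words) + 1)
    alpha[0] = 1.0
    for t in range(1, len(words) + 1):
        # The last sequence of a segmentation ending at t has length 1..n
        for length in range(1, min(n, t) + 1):
            seq = tuple(words[t - length:t])
            # Independence assumption: the sequence probability does not
            # depend on the sequences that precede it.
            alpha[t] += alpha[t - length] * seq_prob.get(seq, 0.0)
    return alpha[-1]


# Toy probabilities, invented for illustration only
toy_prob = {("a",): 0.5, ("b",): 0.3, ("a", "b"): 0.2}
likelihood_n2 = multigram_likelihood(["a", "b"], toy_prob, n=2)
likelihood_n1 = multigram_likelihood(["a", "b"], toy_prob, n=1)
```

With n=2 the string "a b" has two segmentations, [a][b] and [a b], and their probabilities are summed; with n=1 only [a][b] remains. The same forward recursion is the kind of quantity an Expectation-Maximization procedure would accumulate when re-estimating sequence probabilities from unsegmented text.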


doi: 10.21437/ICSLP.1998-841

Cite as: Zitouni, I. (1998) A language modeling based on a hierarchical approach: m_n^v. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0727, doi: 10.21437/ICSLP.1998-841

@inproceedings{zitouni98b_icslp,
  author={Imed Zitouni},
  title={{A language modeling based on a hierarchical approach: m_n^v}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0727},
  doi={10.21437/ICSLP.1998-841}
}