ISCA Archive Interspeech 2013

Fitting long-range information using interpolated distanced n-grams and cache models into a latent dirichlet language model for speech recognition

Md. Akmal Haidar, Douglas O'Shaughnessy

We propose a language modeling (LM) approach that incorporates interpolated distanced n-grams into a latent Dirichlet language model (LDLM) for speech recognition. The LDLM relaxes the bag-of-words assumption and document topic extraction of latent Dirichlet allocation (LDA). It uses default background n-grams, where topic information is extracted from the (n-1) history words through a Dirichlet distribution when calculating n-gram probabilities. However, the model does not capture long-range information from outside the n-gram events, which could improve language modeling performance. In this paper, we present an interpolated LDLM (ILDLM) that uses different distanced n-grams. Here, the topic information is extracted from the (n-1) history words through the Dirichlet distribution using interpolated distanced n-grams. The n-gram probabilities of the model are computed from the distanced word probabilities for the topics and the interpolated topic information for the histories. In addition, we incorporate a cache-based LM, which models re-occurring words, through unigram scaling to adapt the LDLM and ILDLM, which model topical words. On the Wall Street Journal (WSJ) corpus, our approaches yield significant reductions in perplexity and word error rate (WER) over the probabilistic latent semantic analysis (PLSA) and LDLM approaches.
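The two generic ingredients named in the abstract, interpolating distanced n-grams and adapting a model with a cache through unigram scaling, can be illustrated with a minimal sketch. This is not the LDLM or ILDLM formulation from the paper: it assumes plain add-alpha smoothed distanced bigrams in place of the topic model, and the function names, the interpolation weights, and the `beta` and `eps` parameters are all hypothetical choices made for illustration.

```python
from collections import Counter


def distanced_bigram_probs(tokens, distance, vocab, alpha=1.0):
    """Return prob(w, v): add-alpha estimate of P_d(w | v), where v occurs
    `distance` positions before w (distance=1 is an ordinary bigram)."""
    pair_counts = Counter(zip(tokens[:-distance], tokens[distance:]))
    hist_counts = Counter(tokens[:-distance])
    vocab_size = len(vocab)

    def prob(w, v):
        return (pair_counts[(v, w)] + alpha) / (hist_counts[v] + alpha * vocab_size)

    return prob


def interpolated_prob(w, context, models, weights):
    """Linearly interpolate distanced models; models[d-1] conditions on
    context[-d]. The weights are assumed to sum to 1."""
    return sum(
        lam * model(w, context[-d])
        for d, (lam, model) in enumerate(zip(weights, models), start=1)
    )


def unigram_scale(base_probs, background_unigram, cache_counts, beta=0.5, eps=1e-3):
    """Rescale P(w | h) by (P_cache(w) / P_background(w)) ** beta, then
    renormalize over the vocabulary. `eps` smooths the cache estimate so
    that unseen words keep nonzero probability (an assumption of this sketch)."""
    total = sum(cache_counts.values())
    vocab_size = len(base_probs)
    scaled = {}
    for w, p in base_probs.items():
        p_cache = (cache_counts.get(w, 0) + eps) / (total + eps * vocab_size)
        scaled[w] = p * (p_cache / background_unigram[w]) ** beta
    norm = sum(scaled.values())
    return {w: s / norm for w, s in scaled.items()}


# Toy usage: two distanced models, interpolated, then adapted by a cache.
tokens = "a b a b a c".split()
vocab = sorted(set(tokens))
models = [distanced_bigram_probs(tokens, d, vocab) for d in (1, 2)]
weights = [0.6, 0.4]            # hypothetical interpolation weights
context = ["b", "a"]            # most recent word is last
base = {w: interpolated_prob(w, context, models, weights) for w in vocab}
background = {w: 1.0 / len(vocab) for w in vocab}   # uniform, for illustration
adapted = unigram_scale(base, background, Counter(["c", "c"]))
```

In this toy setup the cache contains only the word `c`, so the adapted distribution shifts probability mass toward `c` relative to the interpolated baseline while still summing to one.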


doi: 10.21437/Interspeech.2013-616

Cite as: Haidar, M.A., O'Shaughnessy, D. (2013) Fitting long-range information using interpolated distanced n-grams and cache models into a latent dirichlet language model for speech recognition. Proc. Interspeech 2013, 2678-2682, doi: 10.21437/Interspeech.2013-616

@inproceedings{haidar13_interspeech,
  author={Md. Akmal Haidar and Douglas O'Shaughnessy},
  title={{Fitting long-range information using interpolated distanced n-grams and cache models into a latent dirichlet language model for speech recognition}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={2678--2682},
  doi={10.21437/Interspeech.2013-616}
}