ISCA Archive Interspeech 2013

Viterbi decoding for latent words language models using Gibbs sampling

Ryo Masumura, Hirokazu Masataki, Takanobu Oba, Osamu Yoshioka, Satoshi Takahashi

This paper introduces a new approach that directly uses latent words language models (LWLMs) in automatic speech recognition (ASR). LWLMs are robust to data sparseness because of their soft-decision clustering structure and Bayesian modeling, so they can be expected to perform robustly across multiple ASR tasks. Unfortunately, applying an LWLM to ASR is difficult because of its computational complexity. In our previous work, we approximated an LWLM for ASR by sampling words according to its stochastic process and training a word n-gram LM on the sampled data. However, that approach cannot take into account the latent variable sequence behind the recognition hypothesis. To solve this problem, we propose a method based on Viterbi decoding that simultaneously decodes the recognition hypothesis and its latent variable sequence. The proposed method uses Gibbs sampling for rapid decoding. Our experiments show the effectiveness of the proposed Viterbi decoding in n-best rescoring. Moreover, we also investigate the effects of combining the previous approximate LWLM with the proposed Viterbi decoding.
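To make the core idea concrete, the sketch below shows how Gibbs sampling can search for the most probable latent word sequence behind a fixed recognition hypothesis, which is the role it plays in the Viterbi-style decoding described above. This is a minimal toy illustration, not the authors' implementation: the latent vocabulary, transition table, and emission probabilities are invented for the example, and a real LWLM would be learned with Bayesian inference over a large latent vocabulary.

```python
import math
import random

# Toy LWLM: a tiny latent vocabulary, bigram transitions over latent
# words P(h_t | h_{t-1}), and emissions P(w_t | h_t). All numbers are
# illustrative assumptions, not values from the paper.
LATENT = ["<func>", "<noun>"]
TRANS = {
    "<s>":    {"<func>": 0.6, "<noun>": 0.4},   # "<s>" = sentence start
    "<func>": {"<func>": 0.3, "<noun>": 0.7},
    "<noun>": {"<func>": 0.8, "<noun>": 0.2},
}
EMIT = {
    "<func>": {"the": 0.5, "a": 0.4, "dog": 0.05, "cat": 0.05},
    "<noun>": {"the": 0.05, "a": 0.05, "dog": 0.5, "cat": 0.4},
}

def log_joint(words, latents):
    """log P(w, h) = sum_t log P(h_t | h_{t-1}) + log P(w_t | h_t)."""
    lp, prev = 0.0, "<s>"
    for w, h in zip(words, latents):
        lp += math.log(TRANS[prev][h]) + math.log(EMIT[h][w])
        prev = h
    return lp

def gibbs_viterbi(words, iters=200, seed=0):
    """Approximate Viterbi search by Gibbs sampling: resample each h_t
    from its conditional given its neighbours and the observed word,
    and keep the best joint assignment seen so far."""
    rng = random.Random(seed)
    h = [rng.choice(LATENT) for _ in words]
    best, best_lp = list(h), log_joint(words, h)
    for _ in range(iters):
        for t in range(len(words)):
            prev = h[t - 1] if t > 0 else "<s>"
            # Unnormalized conditional for each candidate latent word:
            # P(cand | prev) * P(w_t | cand) * P(h_{t+1} | cand).
            scores = []
            for cand in LATENT:
                s = TRANS[prev][cand] * EMIT[cand][words[t]]
                if t + 1 < len(words):
                    s *= TRANS[cand][h[t + 1]]
                scores.append(s)
            # Draw h_t from the normalized conditional.
            r, acc = rng.random() * sum(scores), 0.0
            for cand, s in zip(LATENT, scores):
                acc += s
                if r <= acc:
                    h[t] = cand
                    break
        lp = log_joint(words, h)
        if lp > best_lp:
            best, best_lp = list(h), lp
    return best, best_lp

latents, lp = gibbs_viterbi(["the", "dog"])
print(latents, round(lp, 3))
```

In n-best rescoring, a score like `best_lp` (combined with the acoustic score) would be computed for each hypothesis in the list, and the highest-scoring hypothesis selected; sampling avoids the cost of exact dynamic programming over a large latent vocabulary.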


doi: 10.21437/Interspeech.2013-751

Cite as: Masumura, R., Masataki, H., Oba, T., Yoshioka, O., Takahashi, S. (2013) Viterbi decoding for latent words language models using Gibbs sampling. Proc. Interspeech 2013, 3429-3433, doi: 10.21437/Interspeech.2013-751

@inproceedings{masumura13_interspeech,
  author={Ryo Masumura and Hirokazu Masataki and Takanobu Oba and Osamu Yoshioka and Satoshi Takahashi},
  title={{Viterbi decoding for latent words language models using Gibbs sampling}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={3429--3433},
  doi={10.21437/Interspeech.2013-751}
}