ISCA Archive Interspeech 2005

Optimizing the structure of partly-hidden Markov models using weighted likelihood-ratio maximization criterion

Tetsuji Ogawa, Tetsunori Kobayashi

The structure of the Partly-Hidden Markov Model (PHMM) is optimized.

The PHMM was proposed in our previous work to deal with complicated temporal changes in acoustic features. It can model observation-dependent behavior in both the observations and the state transitions. In the previous formulation of the PHMM, we used a common structure for all model categories. However, it is well known that the optimal structure, the one that gives the best performance, differs from category to category.

In this paper, we design a new structure optimization method in which the state-observation dependences in the PHMM are defined optimally for each category using the Weighted Likelihood-Ratio Maximization (WLRM) criterion. The WLRM criterion induces sparse and discriminative structures and therefore yields structurally discriminative models. We define the optimal structures as the combination of model structures that gives the maximum weighted likelihood ratio over all possible structure patterns, and a genetic algorithm is applied to approximate the search for this optimum.
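As a rough illustration of the search step only, the following minimal sketch runs a generic genetic algorithm over binary structure masks. It is not the authors' implementation: the function names, the encoding (one bit per hypothetical on/off dependence choice), the GA settings, and in particular `toy_fitness` are assumptions made for this example. The abstract does not give the WLRM formula, so `toy_fitness` is a placeholder that merely rewards agreement with an arbitrary target mask and penalizes dense masks.

```python
import random

# Hypothetical target pattern, used only by the placeholder objective below.
TARGET = [1, 0, 0, 1, 0, 0, 0, 1]


def toy_fitness(mask):
    """Placeholder objective (NOT the WLRM criterion from the paper).

    Rewards bits that agree with TARGET and subtracts a small penalty
    per active bit, so sparser masks score slightly higher.
    """
    matches = sum(1 for m, t in zip(mask, TARGET) if m == t)
    return matches - 0.1 * sum(mask)


def genetic_search(fitness, n_bits, pop_size=20, generations=40,
                   mutation_rate=0.05, seed=0):
    """Approximately maximize `fitness` over binary lists of length n_bits."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]      # truncation selection
        children = [scored[0][:]]              # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)     # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
        candidate = max(pop, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


best_mask = genetic_search(toy_fitness, n_bits=len(TARGET))
print(best_mask, toy_fitness(best_mask))
```

In a real system, the placeholder would be replaced by a score computed from trained models for each candidate structure, which is far more expensive to evaluate and is the usual motivation for an approximate search rather than an exhaustive one.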

Continuous speech recognition experiments on lecture talks show the effectiveness of the proposed structure optimization: it reduced word errors compared with both the HMM and the PHMM with a common structure for all categories.


doi: 10.21437/Interspeech.2005-861

Cite as: Ogawa, T., Kobayashi, T. (2005) Optimizing the structure of partly-hidden Markov models using weighted likelihood-ratio maximization criterion. Proc. Interspeech 2005, 3353-3356, doi: 10.21437/Interspeech.2005-861

@inproceedings{ogawa05_interspeech,
  author={Tetsuji Ogawa and Tetsunori Kobayashi},
  title={{Optimizing the structure of partly-hidden Markov models using weighted likelihood-ratio maximization criterion}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={3353--3356},
  doi={10.21437/Interspeech.2005-861}
}