ISCA Archive ICSLP 1994

A comparison study of output probability functions in HMMs through spoken digit recognition

Li Zhao, Hideyuki Suzuki, Seiichi Nakagawa

In speech recognition, the HMM is one of the most effective methods for modeling speech statistically. Traditionally, the discrete distribution HMM and the continuous distribution HMM have been widely used in various applications. In recent years, HMMs with various output probability functions have been proposed to further improve recognition performance; representative models are the mixture continuous distribution HMM and the semi-continuous distribution HMM. Recently, we have also proposed the RBF (radial basis function) based HMM and the VQ-distortion based HMM, which combine an RBF and a VQ-distortion measure with the HMM, respectively. At each state, these models use an RBF or a VQ-distortion measure, respectively, in place of the output probability density function used by traditional HMMs.

In this paper, we first describe the RBF based HMM and the VQ-distortion based HMM. To confirm their performance, we then compare them with the discrete distribution HMM, the mixture continuous distribution HMM (with full and diagonal covariance matrices), and the semi-continuous distribution HMM in terms of recognition rate in speaker-independent spoken digit recognition experiments. From these comparisons, we confirmed that the RBF based HMM and the VQ-distortion based HMM are robust and superior to the conventional HMMs.
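
As a rough illustration of what replacing a state's output probability density with a VQ distortion or an RBF score could look like, the following Python sketch may help. It is not taken from the paper: the function names, the squared Euclidean distortion, the Gaussian form of the basis function, the single shared width, and the omission of transition costs in the Viterbi pass are all assumptions made for this example.

```python
import math

def sq_dist(x, c):
    """Squared Euclidean distance between two feature vectors (assumed distortion)."""
    return sum((a - b) ** 2 for a, b in zip(x, c))

def vq_distortion(x, codebook):
    """State-level VQ distortion: distance from frame x to the nearest codeword."""
    return min(sq_dist(x, c) for c in codebook)

def rbf_score(x, centers, weights, width):
    """State-level RBF score: weighted sum of Gaussian kernels (assumed kernel form)."""
    return sum(
        w * math.exp(-sq_dist(x, c) / (2.0 * width ** 2))
        for w, c in zip(weights, centers)
    )

def viterbi_min_cost(frames, state_codebooks):
    """Left-to-right Viterbi pass that accumulates VQ distortion as the local cost.

    Each state may loop on itself or advance to the next state. Transition
    costs are left out here for brevity, which is an assumption of this sketch.
    """
    n = len(state_codebooks)
    inf = float("inf")
    cost = [inf] * n
    cost[0] = vq_distortion(frames[0], state_codebooks[0])
    for x in frames[1:]:
        new_cost = [inf] * n
        for s in range(n):
            # Best predecessor: stay in state s, or arrive from state s - 1.
            best_prev = cost[s] if s == 0 else min(cost[s], cost[s - 1])
            if best_prev < inf:
                new_cost[s] = best_prev + vq_distortion(x, state_codebooks[s])
        cost = new_cost
    return cost[-1]  # accumulated distortion of the best path ending in the last state

# Hypothetical two-state word models with one codeword per state.
frames = [[0, 0], [0, 0], [1, 1], [1, 1]]
model_a = [[[0, 0]], [[1, 1]]]
model_b = [[[1, 1]], [[0, 0]]]
scores = {"a": viterbi_min_cost(frames, model_a), "b": viterbi_min_cost(frames, model_b)}
best = min(scores, key=scores.get)
```

In this toy setup, recognition picks the word model with the smallest accumulated distortion. An RBF based variant would instead combine per-state `rbf_score` values along the path, where larger values indicate a better match.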


Cite as: Zhao, L., Suzuki, H., Nakagawa, S. (1994) A comparison study of output probability functions in HMMs through spoken digit recognition. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 231-234

@inproceedings{zhao94_icslp,
  author={Li Zhao and Hideyuki Suzuki and Seiichi Nakagawa},
  title={{A comparison study of output probability functions in HMMs through spoken digit recognition}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={231--234}
}