ISCA Archive Interspeech 2009

Rapid unsupervised adaptation using frame independent output probabilities of gender and context independent phoneme models

Satoshi Kobashikawa, Atsunori Ogawa, Yoshikazu Yamaguchi, Satoshi Takahashi

Commercial deployments demand higher recognition accuracy with no increase in computation time over previously adopted baseline speech recognition systems. Accuracy can be improved by adding gender-dependent acoustic models and unsupervised adaptation based on CMLLR (Constrained Maximum Likelihood Linear Regression). CMLLR-based batch-type unsupervised adaptation estimates a single global transformation matrix from labels produced by a prior unsupervised labeling pass, which unfortunately increases computation time. Our proposed technique shortens the prior gender selection and labeling steps by using only the frame-independent output probabilities of gender-dependent speech GMMs (Gaussian Mixture Models) and context-independent phoneme (monophone) HMMs (Hidden Markov Models) in dual-gender acoustic models. The proposed technique further raises accuracy by employing a power term after adaptation. Simulations using spontaneous speech show that, compared to the baseline without prior gender selection and unsupervised adaptation, the proposed technique reduces computation time by 17.9% and reduces the error in correct rate by 13.7% relative.
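The following is a minimal sketch, not the authors' implementation, of what scoring with frame-independent output probabilities can look like. Each frame is scored on its own, with no Viterbi search or transition probabilities. It assumes diagonal-covariance GMMs and reduces each monophone to a single GMM. The function names (`gmm_frame_loglik`, `select_gender`, `label_frames`, `apply_cmllr`) and the model layout are hypothetical. The CMLLR step shows only how an already estimated global transform x' = Ax + b is applied to features. Estimating A and b, and the power term mentioned in the abstract, are not covered.

```python
import numpy as np

def gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (M,); means, variances: (M, D).
    Returns an array of shape (T,).
    """
    diff = frames[:, None, :] - means[None, :, :]                 # (T, M, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    log_comp = log_norm[None, :] - 0.5 * np.sum(
        diff ** 2 / variances[None, :, :], axis=2)                # (T, M)
    log_weighted = log_comp + np.log(weights)[None, :]
    # Numerically stable log-sum-exp over mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_weighted - peak).sum(
        axis=1, keepdims=True)))[:, 0]

def select_gender(frames, gender_gmms):
    """Pick the gender whose speech GMM gives the highest summed
    per-frame log-likelihood (frames are treated independently)."""
    scores = {g: float(gmm_frame_loglik(frames, *params).sum())
              for g, params in gender_gmms.items()}
    return max(scores, key=scores.get), scores

def label_frames(frames, phone_gmms):
    """Label each frame with the monophone model that scores it best,
    independently of neighbouring frames (no decoding pass)."""
    names = list(phone_gmms)
    loglik = np.stack(
        [gmm_frame_loglik(frames, *phone_gmms[n]) for n in names], axis=1)
    return [names[i] for i in loglik.argmax(axis=1)]

def apply_cmllr(frames, A, b):
    """Apply a global feature-space transform x' = A x + b to every frame."""
    return frames @ A.T + b
```

In this sketch the per-frame labels from `label_frames` would serve as the unsupervised labels from which a global transform is estimated. How the paper combines the gender and monophone scores is not stated in the abstract.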


doi: 10.21437/Interspeech.2009-211

Cite as: Kobashikawa, S., Ogawa, A., Yamaguchi, Y., Takahashi, S. (2009) Rapid unsupervised adaptation using frame independent output probabilities of gender and context independent phoneme models. Proc. Interspeech 2009, 1615-1618, doi: 10.21437/Interspeech.2009-211

@inproceedings{kobashikawa09_interspeech,
  author={Satoshi Kobashikawa and Atsunori Ogawa and Yoshikazu Yamaguchi and Satoshi Takahashi},
  title={{Rapid unsupervised adaptation using frame independent output probabilities of gender and context independent phoneme models}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1615--1618},
  doi={10.21437/Interspeech.2009-211}
}