ISCA Archive ICSLP 2000

Decision tree based rate of speech modeling for speech recognition

Bhuvana Ramabhadran, Yuqing Gao

A real-world speech recognition system encounters a variety of speaking styles and speaking rates, and its accuracy depends strongly on the speaking rate: it degrades sharply with very fast or very slow speech (including hyperarticulated speech). In this paper, we propose a generic modeling scheme that uses decision trees to capture a range of speaking rates, from very slow to very fast. This approach improves recognition performance on fast and slow speech without degrading performance on normal speech. The main idea behind this scheme is to model the context-dependent HMM state likelihoods differently for different speaking rates, as the joint probability of observing the sequence of durations given the sequence of acoustic states, without relying on any explicit duration computation at run time.


Cite as: Ramabhadran, B., Gao, Y. (2000) Decision tree based rate of speech modeling for speech recognition. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, 600-603

@inproceedings{ramabhadran00b_icslp,
  author={Bhuvana Ramabhadran and Yuqing Gao},
  title={{Decision tree based rate of speech modeling for speech recognition}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={4},
  pages={600--603}
}