To make prosody useful in spontaneous speech recognition, we adopt a quasi-continuous prosodic annotation and design a prosody-dependent acoustic model to improve ASR performance. We propose a variable-parameter hidden Markov model (HMM) in which the mean vector is modeled as a function of the prosody variable through polynomial regression. The prosodically adapted acoustic models are used to re-score the N-best output of a standard ASR system, according to the prosody variable assigned by an automatic prosody detector. Experiments on the Buckeye corpus demonstrate the effectiveness of our approach.
Index Terms: Prosody-dependent ASR, variable parameter HMM, re-scoring
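The mean-vector regression and N-best re-scoring described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single diagonal-covariance Gaussian per state, a fixed frame-to-state alignment for each hypothesis, and acoustic scores only (no language-model term). All function names, argument layouts, and the `models` dictionary are hypothetical.

```python
import numpy as np


def polynomial_mean(coeffs, v):
    """Mean vector as a polynomial in the scalar prosody variable v.

    coeffs has shape (K + 1, D): row k holds the coefficient vector of v**k,
    so the result is sum_k coeffs[k] * v**k, a vector of length D.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    powers = float(v) ** np.arange(coeffs.shape[0])  # [1, v, v**2, ...]
    return powers @ coeffs


def diag_gaussian_loglik(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance var."""
    x, mean, var = (np.asarray(a, dtype=float) for a in (x, mean, var))
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def rescore_nbest(hypotheses, frames, prosody, models):
    """Return the label of the hypothesis with the highest acoustic score.

    hypotheses: list of (label, states), where states gives one state id
                per frame (assumed: alignment is already known).
    frames:     sequence of D-dimensional feature vectors.
    prosody:    one prosody value per frame (assumed to come from a detector).
    models:     dict mapping state id -> (coeffs, var) (hypothetical layout).
    """
    best_label, best_score = None, -np.inf
    for label, states in hypotheses:
        score = 0.0
        for x, v, s in zip(frames, prosody, states):
            coeffs, var = models[s]
            # The state mean shifts with the prosody value of this frame.
            score += diag_gaussian_loglik(x, polynomial_mean(coeffs, v), var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

With polynomial order zero (a single row in `coeffs`), the mean no longer depends on the prosody variable and the sketch reduces to a conventional fixed-mean Gaussian state model.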
Cite as: Huang, J.-T., Huang, P.-S., Mo, Y., Hasegawa-Johnson, M., Cole, J. (2010) Prosody-dependent acoustic modeling using variable-parameter hidden Markov models. Proc. Speech Prosody 2010, paper 623
@inproceedings{huang10_speechprosody,
  author={Jui-Ting Huang and Po-Sen Huang and Yoonsook Mo and Mark Hasegawa-Johnson and Jennifer Cole},
  title={{Prosody-dependent acoustic modeling using variable-parameter hidden Markov models}},
  year=2010,
  booktitle={Proc. Speech Prosody 2010},
  pages={paper 623}
}