A significant source of variation in spontaneous speech is intra-speaker pronunciation change. Previous work in automatic speech recognition has identified several factors that affect pronunciation variability, such as phonetic context and speaking rate. This work examines new higher-level information sources, namely syntax and discourse structure, and specifically the relationship between these factors and pronunciation variation as seen in reduction and hyper-articulation. Analyses of hand-labeled data are used to determine features for phone-independent variables characterizing pronunciation changes, which in turn are used in a decision-tree-based dynamic pronunciation model. Pronunciation prediction experiments show a reduction in phone error rate of 10% over a baseline model using only phonetic context.
Cite as: Bates, R., Ostendorf, M. (2001) Modeling pronunciation variation in conversational speech using syntax and discourse. Proc. ITRW on Prosody in Speech Recognition and Understanding, paper 3
@inproceedings{bates01_prosody,
  author={Rebecca Bates and Mari Ostendorf},
  title={{Modeling pronunciation variation in conversational speech using syntax and discourse}},
  year=2001,
  booktitle={Proc. ITRW on Prosody in Speech Recognition and Understanding},
  pages={paper 3}
}