This work explores ways of incorporating prosodic modules to efficiently improve the performance of Large Vocabulary Continuous Speech Recognition (LVCSR). Prosodic-syntactic boundaries, as an information source, can improve LVCSR in both efficiency and accuracy. In this paper, we address the effect of the language model score on setting the pruning beam width and how to control Cross-word Context Dependent (CCD) models using prosodic boundary information. In the first decoding pass, a dynamic beam search strategy with respect to inner-word and cross-word paths is proposed to reduce the search space efficiently; in the second decoding pass, the cross-word context dependent models are then optimized using boundary information. The experimental evaluation demonstrates the efficiency gained by incorporating prosodic modules and shows the effect of syntactic and prosodic boundaries in LVCSR.
Cite as: Lee, S.-w., Hirose, K., Minematsu, N. (2001) Incorporation of prosodic modules for large vocabulary continuous speech recognition. Proc. ITRW on Prosody in Speech Recognition and Understanding, paper 18
@inproceedings{lee01_prosody, author={Shi-wook Lee and Keikichi Hirose and Nobuaki Minematsu}, title={{Incorporation of prosodic modules for large vocabulary continuous speech recognition}}, year=2001, booktitle={Proc. ITRW on Prosody in Speech Recognition and Understanding}, pages={paper 18} }