Prosody in Speech Recognition and Understanding

October 22-24, 2001
Molly Pitcher Inn, Red Bank, NJ, USA

Prosodic Scoring of Recognition Outputs in the JUPITER Domain

Chao Wang, Stephanie Seneff

Spoken Language Systems Group, MIT Laboratory for Computer Science, Cambridge, MA, USA

JUPITER is a conversational system that allows users to access weather information over the telephone using natural speech. This work examines the use of prosodic information to predict speech recognition errors more accurately and thereby improve system robustness. Two approaches were explored. The first is based on a probabilistic confidence scoring framework, which uses prosodic cues as additional features to improve both utterance-level and word-level confidence scoring. The second aims at scoring part of the prosodic space, focusing on phrases that bear important communicative functions. We explored the feasibility of directly characterizing the F0 contours of some carefully selected English phrase patterns. We envision that these models can be applied to re-sort recognizer N-best outputs or to support rejection.
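As a rough illustration of the envisioned application, the sketch below re-sorts an N-best list by a combined score and rejects the utterance when the best combined score falls below a threshold. It is a minimal sketch and not the authors' method: the `Hypothesis` class, the linear combination, the `weight` and `threshold` parameters, and all score values are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    """One entry of a recognizer N-best list (hypothetical structure)."""
    words: str
    recognizer_score: float  # assumed log-domain score, higher is better
    prosodic_score: float    # assumed log-domain score from a prosodic model


def combined_score(hyp: Hypothesis, weight: float) -> float:
    """Linear combination of the two scores (an assumed scheme)."""
    return hyp.recognizer_score + weight * hyp.prosodic_score


def rescore_nbest(nbest: List[Hypothesis], weight: float = 0.5) -> List[Hypothesis]:
    """Return the N-best list re-sorted by combined score, best first."""
    return sorted(nbest, key=lambda h: combined_score(h, weight), reverse=True)


def best_or_reject(
    nbest: List[Hypothesis], weight: float = 0.5, threshold: float = -6.0
) -> Optional[Hypothesis]:
    """Return the top re-sorted hypothesis, or None to signal rejection."""
    if not nbest:
        return None
    top = rescore_nbest(nbest, weight)[0]
    return top if combined_score(top, weight) >= threshold else None


# Made-up scores: the second hypothesis has a lower recognizer score
# but a much better prosodic score, so it moves to the top.
nbest = [
    Hypothesis("weather in austin", recognizer_score=-4.0, prosodic_score=-6.0),
    Hypothesis("weather in boston", recognizer_score=-4.5, prosodic_score=-1.0),
]
reranked = rescore_nbest(nbest)
accepted = best_or_reject(nbest, threshold=-6.0)
rejected = best_or_reject(nbest, threshold=-4.0)
```

In practice the weight and threshold would have to be tuned on held-out data, and the prosodic score would come from a trained model rather than hand-picked numbers.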


Bibliographic reference. Wang, Chao / Seneff, Stephanie (2001): "Prosodic scoring of recognition outputs in the JUPITER domain", in Prosody-2001, paper 28.