ISCA Archive SpeechProsody 2010

Prosodically-based automatic segmentation and punctuation

Helena Moniz, Fernando Batista, Hugo Meinedo, Alberto Abad, Isabel Trancoso, Ana Isabel Mata, Nuno Mamede

This work explores prosodic/acoustic cues for improving a baseline phone segmentation module. The baseline version is provided by a large vocabulary continuous speech recognition system. An analysis of the baseline results revealed problems in word boundary detection, which we tried to solve with post-processing rules based on prosodic features (pitch, energy and duration). These rules achieved better results in the detection of inter-word pauses, in the durations of previously detected silent pauses, and in the durations of phones at the beginning and end of sentence-like units. These improvements may be relevant not only for retraining acoustic models, but also for the automatic punctuation task. Both tasks were evaluated, and the results based on the more reliable boundaries are promising. This work allows us to tackle more challenging problems, combining prosodic and lexical features for the identification of sentence-like units.
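
The abstract does not spell out the post-processing rules, so the following is only a minimal sketch of what an energy- and duration-based rule for inter-word pause detection could look like. It is not the authors' method: the function name `detect_pauses`, the frame length, and both thresholds are hypothetical values chosen for illustration.

```python
from typing import List, Tuple


def detect_pauses(
    energy_db: List[float],
    frame_ms: int = 10,            # assumed analysis frame length
    threshold_db: float = -40.0,   # assumed silence threshold (hypothetical)
    min_pause_ms: int = 50,        # assumed minimum pause duration (hypothetical)
) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) spans where energy stays below the threshold
    for at least min_pause_ms."""
    pauses: List[Tuple[int, int]] = []
    run_start = None
    # A sentinel frame above any threshold closes a run that reaches the end.
    for i, energy in enumerate(energy_db + [float("inf")]):
        if energy < threshold_db:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            duration_ms = (i - run_start) * frame_ms
            if duration_ms >= min_pause_ms:
                pauses.append((run_start * frame_ms, i * frame_ms))
            run_start = None
    return pauses


if __name__ == "__main__":
    # Three speech frames, six silent frames, then a two-frame dip that is
    # too short to count as a pause under the assumed minimum duration.
    frames = [-20.0] * 3 + [-60.0] * 6 + [-20.0] * 2 + [-60.0] * 2 + [-20.0]
    print(detect_pauses(frames))
```

A rule of this kind only covers the energy and duration cues; a pitch-based condition would need an additional per-frame F0 track, which is left out here.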

Index Terms: prosody, automatic phone segmentation, punctuation.

Cite as: Moniz, H., Batista, F., Meinedo, H., Abad, A., Trancoso, I., Mata, A.I., Mamede, N. (2010) Prosodically-based automatic segmentation and punctuation. Proc. Speech Prosody 2010, paper 910

  author={Helena Moniz and Fernando Batista and Hugo Meinedo and Alberto Abad and Isabel Trancoso and Ana Isabel Mata and Nuno Mamede},
  title={{Prosodically-based automatic segmentation and punctuation}},
  booktitle={Proc. Speech Prosody 2010},
  pages={paper 910}