ISCA Archive ICSLP 1994

Pitch-based emphasis detection for segmenting speech recordings

Barry Arons

This paper describes a technique to automatically locate emphasized segments of a speech recording based on pitch. These salient portions can be used in a variety of applications, but were originally designed to be used in an interactive system that enables high-speed skimming and browsing of speech recordings. Previous techniques to detect emphasis have used Hidden Markov Models; emphasized regions in close temporal proximity were found to successfully create useful summaries of the recordings. The new research described herein presents a simpler technique to detect salient segments and summarize a recording without using statistical models that require large amounts of training data. The algorithm adapts to the pitch range of a speaker, then automatically selects the regions of highest pitch activity as a measure of emphasis.
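The last sentence of the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, only one plausible reading of it under stated assumptions: the input is a per-frame pitch track in Hz with 0 marking unvoiced frames, "adapting to the pitch range" is taken to mean a threshold set at the top fraction of the speaker's own voiced pitch values, and "pitch activity" is taken to mean the number of frames in a fixed window that reach that threshold. The function name `emphasis_windows` and all parameters (`frame_s`, `window_s`, `top_fraction`, `n_regions`) are hypothetical.

```python
from typing import List, Sequence, Tuple


def emphasis_windows(
    f0: Sequence[float],
    frame_s: float = 0.01,      # assumed frame step in seconds
    window_s: float = 1.0,      # assumed analysis window in seconds
    top_fraction: float = 0.1,  # assumed share of voiced frames counted as "high"
    n_regions: int = 3,         # how many windows to return
) -> List[Tuple[float, float]]:
    """Return (start, end) times of the windows with the most high-pitch frames."""
    # Unvoiced frames (0 Hz) carry no pitch information, so ignore them.
    voiced = sorted(v for v in f0 if v > 0)
    if not voiced:
        return []

    # Speaker adaptation: the threshold is taken from this speaker's own
    # pitch distribution rather than from a fixed value in Hz.
    idx = min(int(len(voiced) * (1.0 - top_fraction)), len(voiced) - 1)
    threshold = voiced[idx]

    frames_per_window = max(1, int(round(window_s / frame_s)))

    # Score each non-overlapping window by its count of high-pitch frames.
    scores = []
    for start in range(0, len(f0), frames_per_window):
        window = f0[start:start + frames_per_window]
        count = sum(1 for v in window if v >= threshold)
        scores.append((count, start))

    # Keep the highest-scoring windows (earlier window wins a tie),
    # dropping any window with no high-pitch frames at all.
    scores.sort(key=lambda s: (-s[0], s[1]))
    best = [start for count, start in scores[:n_regions] if count > 0]

    # Report the selected windows in chronological order.
    return sorted(
        (start * frame_s, min(start + frames_per_window, len(f0)) * frame_s)
        for start in best
    )


# Synthetic pitch track: flat speech, a pause, a raised-pitch burst, flat speech.
track = [100.0] * 100 + [0.0] * 50 + [200.0] * 50 + [100.0] * 100
regions = emphasis_windows(track, n_regions=1)
```

Because the threshold comes from the recording itself, a sketch like this needs no training data, which is the property the abstract contrasts with model-based detection. The window length and the top fraction are tuning choices made here for illustration only.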


Cite as: Arons, B. (1994). Pitch-based emphasis detection for segmenting speech recordings. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), pp. 1931-1934.

@inproceedings{arons94_icslp,
  author={Barry Arons},
  title={{Pitch-based emphasis detection for segmenting speech recordings}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={1931--1934}
}