ISCA Archive Eurospeech 1999

A real-time filled pause detection system for spontaneous speech recognition

Masataka Goto, Katunobu Itou, Satoru Hayamizu

This paper describes a method for automatically detecting filled (vocalized) pauses, which are one of the hesitation phenomena that current speech recognizers typically cannot handle. The detection of these pauses is important in spontaneous speech dialogue systems because they play valuable roles, such as helping a speaker keep a conversational turn, in oral communication. Although a few speech recognition systems have processed filled pauses within subword-based connected word recognition or word-spotting frameworks, they did not detect the pauses individually and consequently could not consider their roles. In this paper we propose a method that detects filled pauses and word lengthening on the basis of small fundamental frequency transition and small spectral envelope deformation under the assumption that speakers do not change articulator parameters during filled pauses. Experimental results for a Japanese spoken dialogue corpus show that our real-time filled-pause-detection system yielded a recall rate of 84.9% and a precision rate of 91.5%.
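The two cues named in the abstract, small fundamental frequency transition and small spectral envelope deformation, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `detect_sustained_segments`, the per-frame input format, and all three thresholds are hypothetical choices made for illustration only. The sketch assumes that F0 (in Hz, with 0 marking unvoiced frames) and a spectral envelope vector have already been estimated for each frame.

```python
import math


def detect_sustained_segments(f0_hz, envelopes,
                              max_f0_step_cents=20.0,
                              max_env_dist=0.1,
                              min_frames=5):
    """Return (start, end) frame ranges, end exclusive, where both the F0
    and the spectral envelope change little from frame to frame.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(f0_hz)
    # stable[i] is True when the transition from frame i-1 to frame i is small.
    stable = [False] * n
    for i in range(1, n):
        prev, cur = f0_hz[i - 1], f0_hz[i]
        if prev <= 0 or cur <= 0:
            continue  # unvoiced frame: no F0 transition to measure
        # F0 change in cents (1200 cents per octave).
        f0_step = abs(1200.0 * math.log2(cur / prev))
        # Euclidean distance between consecutive envelope vectors.
        env_dist = math.dist(envelopes[i - 1], envelopes[i])
        stable[i] = f0_step <= max_f0_step_cents and env_dist <= max_env_dist

    # Collect runs of stable frames that last at least min_frames.
    segments = []
    start = None
    for i, flag in enumerate(stable + [False]):  # sentinel closes a final run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_frames:
                segments.append((start, i))
            start = None
    return segments


# Toy input: 2 unvoiced frames, 10 steady voiced frames, then a rising glide.
f0 = [0.0, 0.0] + [120.0] * 10 + [150.0, 190.0, 240.0, 0.0]
env = ([[0.0, 0.0]] * 2 + [[1.0, 0.5]] * 10
       + [[1.5, 0.2], [2.0, 0.9], [0.3, 0.3], [0.0, 0.0]])
candidates = detect_sustained_segments(f0, env)
```

Requiring a minimum run length reflects the idea that a filled pause or lengthened word is a sustained sound, so a single frame with little change should not count as a candidate on its own.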


doi: 10.21437/Eurospeech.1999-60

Cite as: Goto, M., Itou, K., Hayamizu, S. (1999) A real-time filled pause detection system for spontaneous speech recognition. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 227-230, doi: 10.21437/Eurospeech.1999-60

@inproceedings{goto99_eurospeech,
  author={Masataka Goto and Katunobu Itou and Satoru Hayamizu},
  title={{A real-time filled pause detection system for spontaneous speech recognition}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={227--230},
  doi={10.21437/Eurospeech.1999-60}
}