Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Focus Detection by Comparison of Speech Waveforms

Satoshi Kitagawa, Nick Campbell

ATR Interpreting Telecommunications Research Labs., Seika-cho, Soraku-gun, Kyoto, Japan

For the efficient translation of speech by machine, the word sequence alone is not always sufficient to convey the intended meaning. Prosodic information can be lost in the speech recognition process. This paper presents methods by which focus can be detected in the input speech using timing and pitch information. By comparing the prosodic characteristics of an input utterance against profiles generated by components of a speech synthesiser for a default rendition of the same sequence of words, we are able to detect areas in the signal where prominence has been added.
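The comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `detect_focus`, the per-word (duration, mean F0) representation, the ratio-based score, and the 0.2 threshold are all assumptions made for illustration. The sketch assumes that the observed utterance and the synthesiser's default rendition have already been aligned word by word.

```python
from statistics import mean


def detect_focus(words, observed, default, threshold=0.2):
    """Flag words that are longer and/or higher-pitched than the default.

    words:     list of word strings
    observed:  list of (duration_sec, mean_f0_hz) measured from the input
    default:   list of (duration_sec, mean_f0_hz) predicted for a neutral
               rendition of the same words (hypothetical synthesiser output)
    threshold: assumed cut-off on the combined prominence score
    """
    # Ratio of observed to default value for each word.
    dur_ratio = [o[0] / d[0] for o, d in zip(observed, default)]
    f0_ratio = [o[1] / d[1] for o, d in zip(observed, default)]

    # Dividing by the utterance mean removes overall differences in
    # speaking rate and pitch range between speaker and synthesiser.
    dur_mean = mean(dur_ratio)
    f0_mean = mean(f0_ratio)

    focused = []
    for word, dr, fr in zip(words, dur_ratio, f0_ratio):
        # Combined relative lengthening and pitch raising for this word.
        score = (dr / dur_mean - 1.0) + (fr / f0_mean - 1.0)
        if score > threshold:
            focused.append(word)
    return focused


words = ["I", "want", "the", "red", "one"]
default = [(0.10, 120), (0.25, 120), (0.08, 120), (0.25, 120), (0.20, 120)]
observed = [(0.10, 120), (0.25, 120), (0.08, 120), (0.40, 150), (0.20, 120)]
print(detect_focus(words, observed, default))
```

In this made-up example only "red" is lengthened and raised in pitch relative to the default rendition, so it is the only word flagged. A real system would need a principled way to weight timing against pitch and to set the threshold.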


Bibliographic reference.  Kitagawa, Satoshi / Campbell, Nick (1999): "Focus detection by comparison of speech waveforms", In EUROSPEECH'99, 1867-1870.