Speech Prosody 2004

Nara, Japan
March 23-26, 2004

Improvement of Speech Summarization Using Prosodic Information

Akira Inoue, Takayoshi Mikami, Yoichi Yamashita

Department of Computer Science, Ritsumeikan University, Japan

Speech summarization is a technique for extracting important sentences from spoken documents. It provides useful information for finding the spoken documents that we want. Spoken documents contain non-linguistic information, which is mainly expressed by prosody, whereas written text conveys only linguistic information. This paper describes a summarization method that uses prosodic information as well as linguistic information. The linguistic information is derived from text transcribed by a continuous speech recognition system. In this paper, speech summarization is defined as the extraction of important sentences from transcribed text. The importance of a sentence is predicted from prosodic parameters and linguistic information, which are combined by multiple regression analysis. The proposed methods are evaluated both on the correlation between the predicted sentence importance scores and the preference scores given by subjects, and on the accuracy of important-sentence extraction. Prosodic information improved the quality of the speech summary, and it is more effective when the speech is transcribed by automatic speech recognition, because speech recognition errors damage the linguistic information.
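The scoring and extraction steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the three feature columns (a prosodic F0 measure, a prosodic power measure, and a linguistic score), the weights, and all numeric values are hypothetical and synthetic, and the function names are invented for illustration. The sketch fits a multiple regression by ordinary least squares, predicts sentence importance, computes the correlation with the target scores, and extracts the top-ranked sentences.

```python
import numpy as np


def fit_importance_model(features, scores):
    """Fit multiple regression weights (intercept first) by least squares."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    return coef


def predict_importance(coef, features):
    """Predict an importance score for each sentence."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef


def extract_top_sentences(predicted, k):
    """Return the indices of the k highest-scoring sentences in document order."""
    ranked = np.argsort(-predicted)[:k]
    return sorted(ranked.tolist())


# Hypothetical per-sentence features (assumed, not from the paper):
# column 0: prosodic F0 measure, column 1: prosodic power measure,
# column 2: linguistic score.
features = np.array([
    [0.2, 1.0, 0.1],
    [0.8, 0.5, 0.3],
    [0.5, 0.2, 0.9],
    [0.9, 0.9, 0.2],
    [0.1, 0.4, 0.6],
    [0.6, 0.7, 0.4],
])

# Synthetic "subject preference" scores generated from made-up weights,
# so that the fit can be checked against a known answer.
true_weights = np.array([1.0, 2.0, 0.5, -1.0])  # intercept, then one per feature
scores = true_weights[0] + features @ true_weights[1:]

coef = fit_importance_model(features, scores)
predicted = predict_importance(coef, features)

# Evaluation 1: correlation between predicted and target scores.
correlation = np.corrcoef(predicted, scores)[0, 1]

# Evaluation 2: which sentences would be extracted as the summary.
summary_indices = extract_top_sentences(predicted, k=2)
```

Because the synthetic scores are an exact linear function of the features, the fit recovers the made-up weights and the correlation is essentially 1; with real subject ratings the fit would be approximate, and the correlation would measure how well the combined features explain the ratings.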

