9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Using Prosody for the Improvement of ASR - Sentence Modality Recognition

Klára Vicsi, György Szaszák

Laboratory of Speech Acoustics, Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Hungary

ASR research carried out in the Laboratory of Speech Acoustics has explored how acoustic preprocessing of supra-segmental (prosodic) features can contribute to the higher linguistic processing levels of ASR, namely the syntactic and semantic levels. This article addresses semantic-level processing built on supra-segmental parameters. HMMs of sentence modality types were built by training the recognizer on speech databases processed according to modality type, and a simple set of rules for connecting modalities was used as the linguistic model. The best recognition results were obtained when the HMM clause-type models had 11 states, each with 2 Gaussian components. With these settings, modality types were recognized with an accuracy of 71% for Hungarian and 78% for German, even though the database was small for both languages.
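To make the reported model size concrete, the following is a minimal sketch of what an 11-state model with 2 Gaussian components per state might look like. The abstract does not state the topology, covariance type, or feature dimension, so the left-to-right structure, the diagonal covariances, the `self_loop` value, and `feature_dim` are all assumptions made for illustration, and both function names are hypothetical.

```python
def left_to_right_transitions(n_states=11, self_loop=0.5):
    """Row-stochastic transition matrix for a left-to-right HMM.

    Assumption: each state either stays put (self_loop) or advances to
    the next state; the last state only loops on itself.
    """
    if not 0.0 < self_loop < 1.0:
        raise ValueError("self_loop must be strictly between 0 and 1")
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        matrix[i][i] = self_loop
        matrix[i][i + 1] = 1.0 - self_loop
    matrix[-1][-1] = 1.0
    return matrix


def emission_parameter_count(n_states=11, n_mixtures=2, feature_dim=1):
    """Free emission parameters, assuming diagonal-covariance GMM states."""
    # One mean and one variance per feature dimension for each component.
    per_component = 2 * feature_dim
    # Mixture weights sum to one, so only n_mixtures - 1 of them are free.
    per_state = n_mixtures * per_component + (n_mixtures - 1)
    return n_states * per_state


transitions = left_to_right_transitions()
n_params = emission_parameter_count()
```

Under these assumptions the emission model stays very small, which is consistent with the abstract's remark that the training databases were small for both languages.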


Bibliographic reference. Vicsi, Klára / Szaszák, György (2008): "Using prosody for the improvement of ASR - sentence modality recognition", In INTERSPEECH-2008, 2877-2880.