In the Laboratory of Speech Acoustics we have carried out ASR research investigating whether acoustic preprocessing of supra-segmental (prosodic) features can contribute to the higher linguistic processing levels of ASR, namely the syntactic and semantic levels. The subject of this article is semantic-level processing built on supra-segmental parameters. HMM models of sentence modality types were built by training the recognizer on speech databases processed according to modality type, and a simple set of rules for connecting modalities was used as the linguistic model. The best recognition results were obtained when each HMM clause-type model had 11 states and each state had 2 Gaussian components. With these settings, the recognition accuracy for modality types was 71% for Hungarian and 78% for German, even though the database was small for both languages.
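The model configuration reported above (11 states per modality model, 2 Gaussian components per state) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a strict left-to-right topology, a single hypothetical scalar prosodic feature per frame (for example a normalized F0 value), hand-set untrained parameters, and two illustrative modality names. The connection rules between modalities are not modelled. Each utterance is scored with the forward algorithm, and the modality whose model gives the highest likelihood is chosen.

```python
import math

N_STATES = 11  # states per modality model, as reported in the abstract
N_MIX = 2      # Gaussian components per state, as reported in the abstract


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def make_model(start_mean, end_mean, stay=0.6):
    """Hypothetical untrained model: state centres move linearly from
    start_mean to end_mean; each state holds N_MIX equally weighted
    components placed slightly below and above its centre."""
    states = []
    for i in range(N_STATES):
        centre = start_mean + (end_mean - start_mean) * i / (N_STATES - 1)
        offsets = [-0.5, 0.5]  # one offset per mixture component
        states.append([(1.0 / N_MIX, centre + o, 1.0) for o in offsets])
    return {"states": states,
            "log_stay": math.log(stay),
            "log_next": math.log(1.0 - stay)}


def log_emission(mixture, x):
    """Log likelihood of x under a mixture of (weight, mean, variance)."""
    return logsumexp([math.log(w) + log_gauss(x, mu, var)
                      for w, mu, var in mixture])


def log_likelihood(model, observations):
    """Forward algorithm in the log domain. Paths must start in the
    first state and end in the last state."""
    alpha = [-math.inf] * N_STATES
    alpha[0] = log_emission(model["states"][0], observations[0])
    for x in observations[1:]:
        new_alpha = []
        for j in range(N_STATES):
            stay = alpha[j] + model["log_stay"]
            move = alpha[j - 1] + model["log_next"] if j > 0 else -math.inf
            new_alpha.append(logsumexp([stay, move])
                             + log_emission(model["states"][j], x))
        alpha = new_alpha
    return alpha[-1]


def classify(models, observations):
    """Return the name of the model with the highest log likelihood."""
    return max(models, key=lambda name: log_likelihood(models[name], observations))


# Illustrative models only: contour shapes and names are assumptions.
models = {
    "declarative": make_model(2.0, -2.0),  # falling contour
    "question": make_model(-2.0, 2.0),     # rising contour
}
rising = [-2.0 + 4.0 * t / 19 for t in range(20)]
print(classify(models, rising))
```

In a real system the state parameters would be estimated from the modality-labelled training data rather than set by hand, and the feature vector would contain several supra-segmental parameters instead of one scalar.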
Bibliographic reference. Vicsi, Klára / Szaszák, György (2008): "Using prosody for the improvement of ASR - sentence modality recognition", In INTERSPEECH-2008, 2877-2880.