EUROSPEECH 2003 - INTERSPEECH 2003
This paper describes an approach to flexible speech act identification for spontaneous speech with disfluency. In this approach, the semantic information, syntactic structure, and fragment features of an input utterance are statistically encapsulated in a proposed speech act hidden Markov model (SAHMM) that characterizes the speech act. To deal with disfluency when the training corpus is sparse, an interpolation mechanism is used to re-estimate the state transition probabilities of the SAHMM. The dialogue system then accepts the speech act with the best score and returns the corresponding response. The proposed approach was evaluated using a spoken dialogue system for an air travel information service. A test database of 480 dialogues (3038 sentences), collected from 25 speakers, was used for evaluation. The experimental results show that the proposed approach achieves a speech act correct rate (SACR) of 90.3% and a fragment correct rate (FCR) of 85.5% for fluent speech. For disfluent speech, it improves SACR by 5.7% and FCR by 6.9% over a baseline system that does not consider filled pauses.
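The abstract does not give the form of the interpolation mechanism, so the following is only a minimal sketch of one common option, assuming simple linear interpolation between the maximum-likelihood transition estimate and the overall state frequency. The function name `interpolated_transitions`, the weight `lam`, and the state labels (including `"FP"` for a filled pause) are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def interpolated_transitions(state_sequences, lam=0.7):
    """Estimate P(next | prev) as lam * ML bigram + (1 - lam) * state frequency.

    Assumption: plain linear interpolation with a fixed weight; the paper's
    actual re-estimation scheme may differ.
    """
    bigram = defaultdict(Counter)  # bigram[prev][next] = transition count
    unigram = Counter()            # unigram[state] = occurrence count
    for seq in state_sequences:
        unigram.update(seq)
        for prev, nxt in zip(seq, seq[1:]):
            bigram[prev][nxt] += 1

    total = sum(unigram.values())
    states = sorted(unigram)
    probs = {}
    for prev in states:
        row_total = sum(bigram[prev].values())
        # A state with no observed outgoing transition backs off fully
        # to the state frequencies, so each row still sums to 1.
        weight = lam if row_total else 0.0
        probs[prev] = {}
        for nxt in states:
            ml = bigram[prev][nxt] / row_total if row_total else 0.0
            prior = unigram[nxt] / total
            probs[prev][nxt] = weight * ml + (1 - weight) * prior
    return probs


# Toy state sequences; "FP" stands in for a filled-pause state.
sequences = [["A", "B", "A"], ["A", "FP", "B"]]
transitions = interpolated_transitions(sequences, lam=0.7)
```

In this sketch a transition never seen in training, such as `B` to `FP`, still receives a small non-zero probability, which is the property that matters when disfluent utterances are rare in the training data.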
Bibliographic reference. Wu, Chung-Hsien / Yan, Gwo-Lang (2003): "Flexible speech act identification of spontaneous speech with disfluency", In EUROSPEECH-2003, 653-656.