The goal of the work presented here is to predict the type of an utterance in spoken dialogue automatically, using automatically extracted suprasegmental information. For this task we present and compare three stochastic algorithms: hidden Markov models, artificial neural nets, and classification and regression trees. These models are easily trainable, reasonably robust, and fit into the probabilistic framework required for speech recognition. Utterance type detection rests on the assumption that different types of utterances have different suprasegmental characteristics. The categorisation of utterance types is based on the theory of conversation games and consists of 12 move types (e.g. reply to a question, wh-question, acknowledgement). The system is speaker independent and is trained on spontaneous goal-directed dialogue collected from Canadian speakers. The utterance type detector is used in an automatic speech recognition system to reduce word error rate.
Cite as: Wright, H. (1998) Automatic utterance type detection using suprasegmental features. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0575, doi: 10.21437/ICSLP.1998-158
@inproceedings{wright98_icslp,
  author={Helen Wright},
  title={{Automatic utterance type detection using suprasegmental features}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0575},
  doi={10.21437/ICSLP.1998-158}
}