2nd International Workshop on Speech, Language and Audio in Multimedia (SLAM2014)
We are interested in understanding speech overlaps and their function in human conversations. Previous studies of speech overlaps have relied on supervised methods, small corpora and controlled conversations. Characterizing overlaps by timing, semantics and discourse function requires analysis over a very large feature space. In this study, the corpus of overlapped speech segments was extracted automatically from human-human spoken conversations using a large-vocabulary Automatic Speech Recognizer (ASR) and a turn segmenter. Each overlap instance was automatically projected onto a high-dimensional space of acoustic and lexical features. We then used unsupervised clustering to find distinct, well-separated clusters in terms of these acoustic and lexical features. We evaluated the recognition and clustering algorithms on a large set of real human-human spoken conversations. The clusters were compared in terms of their feature distributions and the features' contribution to the automatic classification of the clusters.
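The overlap-extraction step can be illustrated with a minimal sketch. It assumes that a turn segmenter has already produced speaker-labelled turns with start and end times in seconds. The names `Turn`, `Overlap` and `find_overlaps` are hypothetical and are not taken from the paper. The sketch shows only the interval-intersection idea and is not the authors' implementation.

```python
from typing import List, NamedTuple


class Turn(NamedTuple):
    """A speaker turn (hypothetical structure, times in seconds)."""
    speaker: str
    start: float
    end: float


class Overlap(NamedTuple):
    """A stretch of time during which two different speakers talk at once."""
    first: str    # speaker whose turn started earlier (or at the same time)
    second: str   # speaker whose turn started during the first speaker's turn
    start: float
    end: float


def find_overlaps(turns: List[Turn]) -> List[Overlap]:
    """Return every pairwise overlap between turns of different speakers."""
    ordered = sorted(turns, key=lambda t: (t.start, t.end))
    overlaps = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                # Turns are sorted by start time, so no later turn can overlap a.
                break
            if b.speaker != a.speaker:
                overlaps.append(
                    Overlap(a.speaker, b.speaker, b.start, min(a.end, b.end))
                )
    return overlaps


turns = [
    Turn("A", 0.0, 2.0),
    Turn("B", 1.5, 3.0),
    Turn("A", 3.2, 4.0),
]
overlaps = find_overlaps(turns)
```

With these example turns, the only overlap is between 1.5 s and 2.0 s, where speaker B starts before speaker A has finished. Each such segment would then be described by acoustic and lexical features before clustering.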
Index Terms: Overlapping Speech, Human Conversation, Discourse, Language Understanding
Bibliographic reference. Chowdhury, Shammur Absar / Riccardi, Giuseppe / Alam, Firoj (2014): "Unsupervised recognition and clustering of speech overlaps in spoken conversations", In SLAM-2014, 62-66.