11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

It Takes Two to Tango - Assessing the Impact of Delay on Conversational Interactivity on Perceived Speech Quality

Sebastian Egger (1), Raimund Schatz (1), Stefan Scherer (2)

(1) Telecommunications Research Center (FTW), Vienna, Austria
(2) Institute of Neural Information Processing, Universität Ulm, Germany

This paper analyzes the relationship between transmission delay, conversational interactivity and perceived quality of bi-directional speech. Our work is grounded in the results of subjective speech quality tests conducted in our lab and in recent studies in this field. The experiments not only quantify the impact of network delay on speech quality as perceived by untrained subjects, but also assess the mutual influence between conversational interactivity (CI) and delay using three different conversation scenarios. Our results show a clear positive correlation between the level of conversational interactivity and interlocutors' delay sensitivity. Another key finding is that even in contexts of high interactivity, one-way delay values of up to 400 ms did not have any significant impact on untrained participants' perception of overall speech quality. Furthermore, we examine the surface structure of participants' conversations across a wide range of delay conditions (up to 1600 ms). Our analysis demonstrates how additional metrics such as the unintended interruption rate (UIR) can be successfully used to determine the surface structure and delay sensitivity of a conversation.


Bibliographic reference.  Egger, Sebastian / Schatz, Raimund / Scherer, Stefan (2010): "It takes two to tango - assessing the impact of delay on conversational interactivity on perceived speech quality", In INTERSPEECH-2010, 1321-1324.