7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Automatic User-Adaptive Speaking Rate Selection for Information Delivery

Nigel Ward (1), Satoshi Nakagawa (2)

(1) University of Texas at El Paso, USA; (2) IBM Japan Ltd., Japan

Today there are many services that provide information over the phone using a prerecorded or synthesized voice. These voices are invariant in speed. Humans giving information over the telephone, however, tend to adapt the speed of their presentation to suit the needs of the listener. This paper presents a preliminary model of this adaptation. In a corpus of simulated directory assistance dialogs, the operator's speed in number-giving correlates with the speed of the user's initial response and with the user's speaking rate. Multiple regression gives a formula which predicts appropriate speaking rates, and these predictions correlate (.46) with the speeds observed in good dialogs in the corpus. An experiment with 18 subjects suggests that users prefer a system which adapts its speed to the user in this way.
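The regression step described above can be illustrated with a minimal sketch. It assumes two predictors per dialog (the user's speaking rate and the latency of the user's initial response) and one target (the operator's number-giving rate). The function names, the units, and the numbers in the demo are hypothetical placeholders chosen for illustration; they are not the corpus data or the fitted formula from this work.

```python
import math


def fit_two_predictor_regression(x1, x2, y):
    """Ordinary least squares for y ~ b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    # Build X^T X (3x3) and X^T y (3-vector).
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    # Solve the 3x3 system by Gauss-Jordan elimination with partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]


def predict_rate(coeffs, user_rate, response_latency):
    """Apply the fitted formula to one user's measurements."""
    b0, b1, b2 = coeffs
    return b0 + b1 * user_rate + b2 * response_latency


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Made-up placeholder values, NOT data from the corpus.
user_rates = [4.0, 5.0, 6.0, 5.5, 4.5]          # assumed unit: syllables/second
response_latencies = [0.3, 0.5, 0.2, 0.8, 0.6]  # assumed unit: seconds
operator_rates = [3.9, 4.1, 4.9, 4.0, 3.7]      # assumed unit: digits/second

coeffs = fit_two_predictor_regression(user_rates, response_latencies, operator_rates)
predicted = [predict_rate(coeffs, u, l) for u, l in zip(user_rates, response_latencies)]
print("coefficients:", coeffs)
print("correlation of predictions with observed rates:", pearson(predicted, operator_rates))
```

In a deployed system, the fitted coefficients would be applied to measurements taken early in the call, and the predicted value would set the playback or synthesis rate for the number being delivered.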

