Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Is Voice Quality Enough? - Study on How the Situation and User's Awareness Influence the Utterance Features

Shinya Yamada, Toshihiko Itoh, Kenji Araki

Hokkaido University, Japan

This paper presents the differences in linguistic and acoustic features observed across spoken dialogue situations and dialogue partners (human-human vs. human-machine interaction), and the influence of user awareness on those features. We compare the linguistic and acoustic features of users' speech addressed to a spoken dialogue system and to a human operator in several goal-setting and destination database search tasks for a car navigation system. Because it is unclear whether different dialogue situations and dialogue partners change the linguistic or acoustic features of a user's utterances in a speech interface system, we previously performed experiments in several dialogue situations [4]. However, those experiments did not consider conditions such as voice quality, or aspects of user awareness such as impressions of the partner and prejudices against a system. We therefore collected a set of spoken dialogues in new dialogue situations. To investigate the influence of voice quality, we also prepared recorded voice for the dialogue partner's responses and compared the influence of three voice types: natural, synthetic, and recorded. We also had users answer questionnaires before and after the experiments and investigated the differences associated with user awareness. Finally, to confirm the usefulness of the experimental results, we used the acoustic features of users' utterances to identify utterances addressed to a system.


Bibliographic reference. Yamada, Shinya / Itoh, Toshihiko / Araki, Kenji (2006): "Is voice quality enough? - study on how the situation and user's awareness influence the utterance features", In INTERSPEECH-2006, paper 1955-Mon2FoP.9.