Towards Building an Attentive Artificial Listener: On the Perception of Attentiveness in Feedback Utterances

Catharine Oertel, Joakim Gustafson, Alan W. Black


Current speech synthesizers typically lack backchannel tokens. Those synthesizers that do include backchannels typically support only a limited set of stereotypical functions. However, this does not mirror the subtleties of backchannels in spontaneous conversations. If we want to build an artificial listener that can display degrees of attentiveness, we need a speech synthesizer with more fine-grained control of the prosodic realisations of its backchannels.

In the current study, we used a corpus of three-party face-to-face discussions to sample backchannels produced under varying conversational dynamics. We wanted to understand i) which prosodic cues are relevant for the perception of varying degrees of attentiveness, ii) how large a difference is necessary for people to perceive a difference in attentiveness, and iii) whether a preliminary classifier could be trained to distinguish between more and less attentive backchannel tokens.


DOI: 10.21437/Interspeech.2016-1274

Cite as

Oertel, C., Gustafson, J., Black, A.W. (2016) Towards Building an Attentive Artificial Listener: On the Perception of Attentiveness in Feedback Utterances. Proc. Interspeech 2016, 2915-2919.

BibTeX
@inproceedings{Oertel+2016,
author={Catharine Oertel and Joakim Gustafson and Alan W. Black},
title={Towards Building an Attentive Artificial Listener: On the Perception of Attentiveness in Feedback Utterances},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1274},
url={http://dx.doi.org/10.21437/Interspeech.2016-1274},
pages={2915--2919}
}