Using Pupil Dilation to Measure Cognitive Load When Listening to Text-to-Speech in Quiet and in Noise

Avashna Govender, Anita E. Wagner, Simon King


With increased use of text-to-speech (TTS) systems in real-world applications, evaluating how such systems influence the human cognitive processing system becomes important. Particularly in situations where cognitive load is high, there may be negative implications such as fatigue. For example, noisy situations generally require the listener to exert increased mental effort. A better understanding of these effects could eventually suggest new ways of generating synthetic speech that demands low cognitive load. In our previous study, pupil dilation was used as an index of cognitive effort. Pupil dilation was shown to be sensitive to the quality of synthetic speech, but there were some uncertainties regarding exactly what was being measured. The current study resolves some of those uncertainties. Additionally, we investigate how the pupil dilates when listening to synthetic speech in the presence of speech-shaped noise. Our results show that, in quiet listening conditions, pupil dilation does not reflect listening effort but rather attention and engagement. In noisy conditions, increased pupil dilation indicates that listening effort increases as the signal-to-noise ratio decreases, under all conditions tested.


DOI: 10.21437/Interspeech.2019-1783

Cite as: Govender, A., Wagner, A.E., King, S. (2019) Using Pupil Dilation to Measure Cognitive Load When Listening to Text-to-Speech in Quiet and in Noise. Proc. Interspeech 2019, 1551-1555, DOI: 10.21437/Interspeech.2019-1783.


@inproceedings{Govender2019,
  author={Avashna Govender and Anita E. Wagner and Simon King},
  title={{Using Pupil Dilation to Measure Cognitive Load When Listening to Text-to-Speech in Quiet and in Noise}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1551--1555},
  doi={10.21437/Interspeech.2019-1783},
  url={http://dx.doi.org/10.21437/Interspeech.2019-1783}
}