Measuring the Cognitive Load of Synthetic Speech Using a Dual Task Paradigm

Avashna Govender, Simon King


We present a methodology for measuring the cognitive load (listening effort) of synthetic speech using a dual-task paradigm. Cognitive load is calculated from changes in a listener’s performance on a secondary task (e.g., reaction time to decide if a visually displayed digit is odd or even). Previous related studies have found significant differences only between the best and worst quality systems, and have failed to separate the systems that lie in between. A paradigm sensitive enough to detect differences between state-of-the-art, high-quality speech synthesizers would be very useful for advancing the field. In our work, we compared four speech synthesis systems from a previous Blizzard Challenge with the corresponding natural speech. Our results show that reaction times slow down as speech quality decreases, as we expected: lower quality speech imposes a greater cognitive load, taking resources away from the secondary task. However, natural speech did not have the fastest reaction times. This intriguing result might indicate that, as speech synthesizers attain near-perfect intelligibility, this paradigm measures something like the listener’s level of sustained attention rather than listening effort.
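As a rough illustration of how a change in secondary-task performance could be turned into a number, the sketch below computes a mean reaction-time slowdown. It assumes a comparison against a single-task baseline (digits classified without any speech playing), which the abstract does not specify; the function name `dual_task_cost` and all values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def dual_task_cost(baseline_rts_ms, dual_task_rts_ms):
    """Mean slowdown (ms) of secondary-task reaction times while listening.

    Assumption: cost is the difference between the mean reaction time in the
    dual-task condition and the mean in a single-task baseline. The paper may
    define the change in performance differently.
    """
    if not baseline_rts_ms or not dual_task_rts_ms:
        raise ValueError("both conditions need at least one reaction time")
    return mean(dual_task_rts_ms) - mean(baseline_rts_ms)


# Illustrative reaction times only, not data from the paper.
baseline = [500.0, 520.0, 480.0]          # odd/even decisions, no speech
while_listening = [560.0, 600.0, 580.0]   # same task while listening to speech
cost = dual_task_cost(baseline, while_listening)
```

Under this assumption, a larger positive cost for one system than for another would be read as that system imposing a greater cognitive load.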


DOI: 10.21437/Interspeech.2018-1199

Cite as: Govender, A., King, S. (2018) Measuring the Cognitive Load of Synthetic Speech Using a Dual Task Paradigm. Proc. Interspeech 2018, 2843-2847, DOI: 10.21437/Interspeech.2018-1199.


@inproceedings{Govender2018,
  author={Avashna Govender and Simon King},
  title={Measuring the Cognitive Load of Synthetic Speech Using a Dual Task Paradigm},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={2843--2847},
  doi={10.21437/Interspeech.2018-1199},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1199}
}