ISCA Archive SSW 2023

Investigating the Utility of Surprisal from Large Language Models for Speech Synthesis Prosody

Sofoklis Kakouros, Juraj Šimko, Martti Vainio, Antti Suni

This paper investigates the use of word surprisal, a measure of the predictability of a word in a given context, as a feature to aid speech synthesis prosody. We explore how word surprisal extracted from large language models (LLMs) correlates with word prominence, a signal-based measure of the salience of a word in a given discourse. We also examine how context length and LLM size affect the results, and how a speech synthesizer conditioned with surprisal values compares with a baseline system. To evaluate these factors, we conducted experiments using a large corpus of English text and LLMs of varying sizes. Our results show that word surprisal and word prominence are moderately correlated, suggesting that they capture related but distinct aspects of language use. We find that length of context and size of the LLM impact the correlations, but not in the direction anticipated, with longer contexts and larger LLMs generally underpredicting prominent words in a nearly linear manner. We demonstrate that, in line with these findings, a speech synthesizer conditioned with surprisal values provides a minimal improvement over the baseline, with the results suggesting a limited effect of using surprisal values for eliciting appropriate prominence patterns.
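As a rough illustration of the quantity discussed in the abstract, word surprisal is commonly defined as the negative log-probability of a word given its context. The sketch below computes that value and a plain Pearson correlation between two sequences. It is not the authors' pipeline: the function names, the use of base-2 logarithms, and the choice of Pearson correlation are assumptions, since the abstract does not specify them, and the conditional probabilities would in practice have to come from a language model.

```python
import math
from typing import Sequence


def surprisal_bits(prob: float) -> float:
    """Surprisal of a word in bits, given its conditional probability.

    Assumption: base-2 logarithm; the abstract does not state the base.
    """
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences.

    Assumption: Pearson is used only for illustration; the abstract
    does not name the correlation measure.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

Given per-word probabilities from a language model and per-word prominence scores from a signal-based estimator, `pearson([surprisal_bits(p) for p in probs], prominence)` would give one correlation value of the kind the abstract describes; `probs` and `prominence` are hypothetical names for those two inputs.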


doi: 10.21437/SSW.2023-20

Cite as: Kakouros, S., Šimko, J., Vainio, M., Suni, A. (2023) Investigating the Utility of Surprisal from Large Language Models for Speech Synthesis Prosody. Proc. 12th ISCA Speech Synthesis Workshop (SSW2023), 127-133, doi: 10.21437/SSW.2023-20

@inproceedings{kakouros23_ssw,
  author={Sofoklis Kakouros and Juraj Šimko and Martti Vainio and Antti Suni},
  title={{Investigating the Utility of Surprisal from Large Language Models for Speech Synthesis Prosody}},
  year=2023,
  booktitle={Proc. 12th ISCA Speech Synthesis Workshop (SSW2023)},
  pages={127--133},
  doi={10.21437/SSW.2023-20}
}