ISCA Archive SSW 2021

Are we truly modeling expressiveness? A study on expressive TTS in Brazilian Portuguese for real-life application styles

Lucas H. Ueda, Paula D. P. Costa, Flavio O. Simoes, Mário U. Neto

This paper presents a study of expressive speech synthesis applied to real-life application styles in Brazilian Portuguese. We explore training state-of-the-art expressive TTS architectures on data recorded under different conditions. Our results suggest that variability in the recording conditions of the same style, combined with guided training of the latent representation space of the Reference Encoder, assists in modeling non-archetypal expressivities. Additionally, we propose an alternative way to evaluate the model's ability to generate expressive speech at the preliminary-results stage, based on a classifier using GeMAPS features.
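
The abstract does not say which classifier is used or how its output is turned into a score, so the following is only a minimal sketch of one possible reading of a classifier-based evaluation. It assumes a nearest-centroid classifier fitted on feature vectors from natural speech, and it reports the fraction of synthesized utterances that are classified as their intended style. The function names, the style labels ("neutral", "lively"), and the three-dimensional toy vectors are all hypothetical; in practice the vectors would be GeMAPS functionals extracted from audio.

```python
from statistics import mean


def centroid(vectors):
    """Average a list of equal-length feature vectors column by column."""
    return [mean(column) for column in zip(*vectors)]


def fit_centroids(features_by_style):
    """Compute one centroid per style from natural-speech feature vectors."""
    return {style: centroid(vecs) for style, vecs in features_by_style.items()}


def predict(centroids, vector):
    """Return the style whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, vector))
    return min(centroids, key=lambda style: sq_dist(centroids[style]))


def style_match_rate(centroids, synthesized):
    """Fraction of synthesized utterances classified as their intended style."""
    hits = sum(predict(centroids, vec) == target for target, vec in synthesized)
    return hits / len(synthesized)


# Toy stand-ins for GeMAPS feature vectors (hypothetical values and styles).
natural = {
    "neutral": [[0.10, 0.20, 0.10], [0.20, 0.10, 0.20]],
    "lively": [[0.90, 0.80, 0.70], [0.80, 0.90, 0.90]],
}
# Each item pairs the intended style with the features of a synthesized utterance.
synthesized = [
    ("neutral", [0.15, 0.15, 0.20]),
    ("lively", [0.85, 0.80, 0.75]),
    ("lively", [0.20, 0.20, 0.10]),  # intended lively, but sounds neutral
]

centroids = fit_centroids(natural)
rate = style_match_rate(centroids, synthesized)
print(f"style match rate: {rate:.2f}")
```

A low match rate for a given style would suggest that the synthesized speech does not carry the acoustic cues the classifier associates with that style, which is the kind of quick check that can come before a listening test.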


doi: 10.21437/SSW.2021-15

Cite as: Ueda, L.H., Costa, P.D.P., Simoes, F.O., Neto, M.U. (2021) Are we truly modeling expressiveness? A study on expressive TTS in Brazilian Portuguese for real-life application styles. Proc. 11th ISCA Speech Synthesis Workshop (SSW 11), 84-89, doi: 10.21437/SSW.2021-15

@inproceedings{ueda21_ssw,
  author={Lucas H. Ueda and Paula D. P. Costa and Flavio O. Simoes and Mário U. Neto},
  title={{Are we truly modeling expressiveness? A study on expressive TTS in Brazilian Portuguese for real-life application styles}},
  year=2021,
  booktitle={Proc. 11th ISCA Speech Synthesis Workshop (SSW 11)},
  pages={84--89},
  doi={10.21437/SSW.2021-15}
}