Comparative Analysis of Prosodic Characteristics Using WaveNet Embeddings

Antti Suni, Marcin Włodarczak, Martti Vainio, Juraj Šimko

We present a methodology for assessing similarities and differences between language varieties and dialects in terms of their prosodic characteristics. A multi-speaker, multi-dialect WaveNet network is trained on a low-sample-rate signal that retains only the prosodic characteristics of the original speech. The network is conditioned on labels related to the speakers’ region or dialect. The resulting conditioning embeddings are subsequently used as multi-dimensional characterizations of the different language varieties, yielding results consistent with dialectological studies. The method and results are illustrated on the Swedia 2000 corpus of Swedish dialectal variation.
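The comparison step described in the abstract can be pictured with a minimal sketch. It assumes one learned conditioning embedding per region or dialect label and compares the embeddings by cosine distance. The abstract does not name a distance measure, so cosine distance is an assumption here, and the labels and vectors below are hypothetical placeholders rather than values from the paper.

```python
import numpy as np


def cosine_distance_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Return pairwise cosine distances between the rows of `embeddings`."""
    # Scale each row to unit length so the dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / norms
    similarity = unit @ unit.T
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return 1.0 - np.clip(similarity, -1.0, 1.0)


# Hypothetical labels and embeddings (assumptions for illustration only).
labels = ["region_a", "region_b", "region_c"]
embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
])

distances = cosine_distance_matrix(embeddings)

# Print each pair once, with its distance.
for i in range(len(labels)):
    for j in range(i + 1, len(labels)):
        print(f"{labels[i]} vs {labels[j]}: {distances[i, j]:.4f}")
```

A small distance between two rows suggests the network learned similar conditioning for those two labels. A matrix of this kind could then be passed to a clustering or dimensionality-reduction step for comparison with dialectological groupings.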

DOI: 10.21437/Interspeech.2019-2373

Cite as: Suni, A., Włodarczak, M., Vainio, M., Šimko, J. (2019) Comparative Analysis of Prosodic Characteristics Using WaveNet Embeddings. Proc. Interspeech 2019, 2538-2542, DOI: 10.21437/Interspeech.2019-2373.

  author={Antti Suni and Marcin Włodarczak and Martti Vainio and Juraj Šimko},
  title={{Comparative Analysis of Prosodic Characteristics Using WaveNet Embeddings}},
  booktitle={Proc. Interspeech 2019},