Diphthong interpolation, phone mapping, and prosody transfer for speech synthesis of similar dialect pairs

Michael Pucher, Carina Lozo, Philip Vergeiner, Dominik Wallner


Dialect synthesis is a challenging area of research. It differs from the synthesis of standard varieties not only in the non-standard nature of dialects but also in the difficulty of collecting suitable data. In this paper we describe a method based on diphthong interpolation and phone mapping that can be used to synthesize a new dialect from an existing model of a similar dialect. The method uses only transcriptions of the original dialect data, which are then mapped onto the phones in the model. We further improve the basic mapping model by transferring prosodic features such as the original duration and F0. In addition to prosody transfer, we investigate whether interpolation between the two parts of a diphthong can satisfactorily substitute for a missing phone in the target dialect. The methods are applied to two South Bavarian dialects from Tyrol in Austria.
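The two core ideas named in the abstract, phone mapping and interpolation between diphthong parts, can be illustrated with a minimal sketch. It is not the authors' implementation: the phone symbols, the `PHONE_MAP` table, the `<missing>` marker, the representation of a diphthong part as a plain feature vector, and the linear interpolation weight `alpha` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical mapping from phones in the new dialect's transcription to
# phones available in the existing dialect model (illustrative symbols only).
PHONE_MAP: Dict[str, str] = {"a": "a", "E": "e", "O": "o"}

MISSING = "<missing>"  # assumed marker for phones with no counterpart in the model


def map_phones(transcription: Sequence[str], phone_map: Dict[str, str]) -> List[str]:
    """Replace each transcribed phone by its model phone, marking unmapped ones."""
    return [phone_map.get(phone, MISSING) for phone in transcription]


def interpolate_parts(
    first_part: Sequence[float],
    second_part: Sequence[float],
    alpha: float = 0.5,
) -> List[float]:
    """Linearly interpolate between the feature vectors of two diphthong parts.

    alpha = 0.0 returns the first part, alpha = 1.0 returns the second part.
    Linear interpolation is an assumption of this sketch.
    """
    if len(first_part) != len(second_part):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(first_part, second_part)]


if __name__ == "__main__":
    # "y" has no entry in the hypothetical map, so it is marked as missing.
    print(map_phones(["a", "E", "y"], PHONE_MAP))
    # A candidate substitute for the missing phone, halfway between two parts.
    print(interpolate_parts([0.0, 2.0], [1.0, 4.0], alpha=0.5))
```

In such a setup, phones marked as missing would be the candidates for substitution by an interpolated vector, while all other phones are taken directly from the existing model.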


DOI: 10.21437/SSW.2019-36

Cite as: Pucher, M., Lozo, C., Vergeiner, P., Wallner, D. (2019) Diphthong interpolation, phone mapping, and prosody transfer for speech synthesis of similar dialect pairs. Proc. 10th ISCA Speech Synthesis Workshop, 200-204, DOI: 10.21437/SSW.2019-36.


@inproceedings{Pucher2019,
  author={Michael Pucher and Carina Lozo and Philip Vergeiner and Dominik Wallner},
  title={{Diphthong interpolation, phone mapping, and prosody transfer for speech synthesis of similar dialect pairs}},
  year=2019,
  booktitle={Proc. 10th ISCA Speech Synthesis Workshop},
  pages={200--204},
  doi={10.21437/SSW.2019-36},
  url={http://dx.doi.org/10.21437/SSW.2019-36}
}