An Investigation of Convolution Attention Based Models for Multilingual Speech Synthesis of Indian Languages

Pallavi Baljekar, SaiKrishna Rallabandi, Alan W Black


In this paper we investigate multi-speaker, multilingual speech synthesis for four Indic languages (Hindi, Marathi, Gujarati, Bengali) as well as English in a fully convolutional attention-based model. We show how factored embeddings can allow cross-lingual transfer, and investigate methods to adapt the model in a low-resource scenario for Marathi and Gujarati. We also report how effectively the model scales to a new language and how much data is required to train the system on that language.


DOI: 10.21437/Interspeech.2018-1869

Cite as: Baljekar, P., Rallabandi, S., Black, A.W. (2018) An Investigation of Convolution Attention Based Models for Multilingual Speech Synthesis of Indian Languages. Proc. Interspeech 2018, 2474-2478, DOI: 10.21437/Interspeech.2018-1869.


@inproceedings{Baljekar2018,
  author={Pallavi Baljekar and SaiKrishna Rallabandi and Alan W Black},
  title={An Investigation of Convolution Attention Based Models for Multilingual Speech Synthesis of Indian Languages},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={2474--2478},
  doi={10.21437/Interspeech.2018-1869},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1869}
}