ISCA Archive SSW 2013

Text to speech in new languages without a standardized orthography

Sunayana Sitaram, Gopala Krishna Anumanchipalli, Justin Chiu, Alok Parlikar, Alan W. Black

Many spoken languages do not have a standardized writing system. Building text-to-speech voices for them without accurate transcripts of the speech data is difficult. Our language-independent method for bootstrapping synthetic voices from speech data alone relies on cross-lingual phonetic decoding of speech. In this paper, we describe novel additions to our bootstrapping method. We present results on eight languages from different language families (English, Dari, Pashto, Iraqi, Thai, Konkani, Inupiaq and Ojibwe) and show that our phonetic voices can be made understandable with as little as an hour of speech data that was never transcribed, and without many resources available in the target language. We also present purely acoustic techniques that help induce syllable- and word-level information, which can further improve the intelligibility of these voices.

Index Terms: speech synthesis, synthesis without text, languages without an orthography


Cite as: Sitaram, S., Anumanchipalli, G.K., Chiu, J., Parlikar, A., Black, A.W. (2013) Text to speech in new languages without a standardized orthography. Proc. 8th ISCA Workshop on Speech Synthesis (SSW 8), 95-100

@inproceedings{sitaram13_ssw,
  author={Sunayana Sitaram and Gopala Krishna Anumanchipalli and Justin Chiu and Alok Parlikar and Alan W. Black},
  title={{Text to speech in new languages without a standardized orthography}},
  year=2013,
  booktitle={Proc. 8th ISCA Workshop on Speech Synthesis (SSW 8)},
  pages={95--100}
}