9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Building Sleek Synthesizers for Multi-Lingual Screen Reader

Veera Raghavendra E. (1), B. Yegnanarayana (1), Alan W. Black (2), Kishore Prahallad (2)

(1) IIIT Hyderabad, India; (2) Carnegie Mellon University, USA

In this paper, we investigate which unit size (syllable, half-phone, or quarter-phone) should be used for speech synthesis in a multi-lingual screen reader, for phonetic languages such as Telugu and for a non-phonetic language, English. Perceptual studies show that syllable-level units perform better for Telugu and that half-phone units perform better for English. While syllable-based synthesizers produce better-sounding speech, covering all syllables is a non-trivial issue. We address syllable coverage through approximate matching of syllables and show that this approximation produces intelligible speech of better quality than diphone units. We also propose a hybrid synthesizer within the unit selection framework and show that a hybrid synthesizer built from a pruned database performs as well as one built from the unpruned database.
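The abstract does not specify how approximate matching or the hybrid back-off is carried out, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a syllable is a tuple of phone labels, that "approximate" means an equal-length syllable differing in at most `max_mismatch` phone positions, and that uncovered syllables fall back to left and right half-phones. The names `phone_mismatches`, `choose_units`, and the `_L`/`_R` suffixes are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Syllable = Tuple[str, ...]  # assumption: a syllable is a tuple of phone labels
Unit = Tuple[str, Syllable]  # (unit kind, labels), hypothetical representation


def phone_mismatches(a: Syllable, b: Syllable) -> int:
    """Count differing phone positions; unequal lengths get a large penalty."""
    if len(a) != len(b):
        return len(a) + len(b)
    return sum(x != y for x, y in zip(a, b))


def choose_units(target: List[Syllable],
                 inventory: Set[Syllable],
                 max_mismatch: int = 1) -> List[Unit]:
    """Pick an exact syllable, else a close syllable, else half-phones."""
    units: List[Unit] = []
    for syl in target:
        if syl in inventory:
            units.append(("syllable", syl))
            continue
        # Sort first so that ties are broken deterministically.
        best: Optional[Syllable] = min(
            sorted(inventory),
            key=lambda cand: phone_mismatches(syl, cand),
            default=None,
        )
        if best is not None and phone_mismatches(syl, best) <= max_mismatch:
            units.append(("approx_syllable", best))
        else:
            # Assumed back-off: two half-phone units per phone.
            for phone in syl:
                units.append(("half_phone", (phone + "_L",)))
                units.append(("half_phone", (phone + "_R",)))
    return units


inventory = {("k", "a"), ("m", "a", "l")}
target = [("k", "a"), ("g", "a"), ("s", "t", "r", "i")]
units = choose_units(target, inventory)
```

With this toy inventory, the first syllable is matched exactly, the second is approximated by a recorded syllable that differs in one phone, and the third, which has no close match, is built from half-phones. A real system would rank candidates by acoustic or phonetic similarity and by join cost rather than by a plain mismatch count.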

Bibliographic reference.  Raghavendra E., Veera / Yegnanarayana, B. / Black, Alan W. / Prahallad, Kishore (2008): "Building sleek synthesizers for multi-lingual screen reader", In INTERSPEECH-2008, 1865-1868.