Fourth International Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU-2014)

St. Petersburg, Russia
May 14-16, 2014

HMM-Based Speech Synthesiser for the Urdu Language

Zeeshan Ahmed (1), Joao P. Cabral (2)

(1) CNGL, University College; (2) Trinity College; Dublin, Ireland

This work presents Hidden Markov Model (HMM) based speech synthesis for the Urdu language. Urdu is widely spoken across different regions in Asia; for example, it is the official language of Pakistan and one of the national languages of India. Unfortunately, to our knowledge, no publicly available Urdu corpus is currently appropriate for HMM-based speech synthesis. We overcame this problem by recording an Urdu speech database with word and phone labels obtained using manual and semi-automatic annotation approaches. In summary, the objective of this work is to develop an HMM-based Urdu speech synthesiser from scratch, trying to use publicly available text processing tools for this language and developing the necessary processing components.

Index Terms: HMM-based speech synthesis, Urdu speech synthesiser, Urdu speech corpus

Bibliographic reference. Ahmed, Zeeshan / Cabral, Joao P. (2014): "HMM-based speech synthesiser for the Urdu language", in SLTU-2014, 92-97.