2nd Workshop on Spoken Language Technologies for Under-Resourced Languages
Universiti Sains, Penang, Malaysia
This paper describes the development procedure of three different Bangla read speech corpora, which can be used for phonetic research and for developing speech applications. Several criteria were maintained in the corpora development process, including consideration of phonetic and prosodic features during text selection. A specification was also maintained in the recording phase, as the speaking style is a vital part of speech applications. We also concentrated on proper text normalization, pronunciation, aligning, and labeling. The labeling was done manually. In the present endeavor, sentence-level labeling (annotation) was completed by maintaining a specification so that it can be expanded in the future.
Index Terms: speech corpora, phonetic research, speech processing
Bibliographic reference. Alam, Firoj / Habib, S. M. Murtoza / Sultana, Dil Afroza / Khan, Mumit (2010): "Development of annotated Bangla speech corpora", In SLTU-2010, 35-41.