Hindi speech database

K. Samudravijaya, P. V. S. Rao, S. S. Agrawal

This paper describes the design and development of an annotated, time-aligned speech database for the Hindi language. Although this continuous speech database is intended principally for training a Hindi speech recognition system, its design specifications are general, so it can also be used for tasks such as speaker recognition and the study of acoustic-phonetic correlates of the language. The database consists of 500 sentences spoken by 50 speakers, 10 sentences per speaker. There are two sets of sentences. The first set, of 2 sentences containing most Hindi phonemes, was read by every speaker. The second set, of 8 distinct sentences per speaker, was designed so that the sentences collectively cover most phonemic contexts. The database is comprehensive enough to capture phonetic, acoustic, intra-speaker, and inter-speaker variability in Hindi speech. The speech was recorded simultaneously with a close-talking microphone and a desktop "far-field" microphone. The close-talking microphone recordings were manually segmented and labeled in terms of sub-phonetic units by trained personnel.
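The sentence counts given above can be checked with a short Python sketch. The constant and function names are illustrative and do not come from the database itself. The count of distinct sentences additionally assumes that no sentence in the per-speaker sets is shared between speakers, which the description suggests but does not state outright.

```python
# Corpus composition as stated in the description.
NUM_SPEAKERS = 50
COMMON_SENTENCES = 2    # first set: read by every speaker
SPEAKER_SENTENCES = 8   # second set: sentences assigned to each speaker


def utterances_per_speaker():
    """Number of sentences each speaker reads (common set + own set)."""
    return COMMON_SENTENCES + SPEAKER_SENTENCES


def total_utterances():
    """Total number of spoken sentences in the database."""
    return NUM_SPEAKERS * utterances_per_speaker()


def distinct_sentences():
    """Number of distinct sentence texts.

    Assumption (not stated in the source): the 8 sentences given to one
    speaker are not reused for any other speaker.
    """
    return COMMON_SENTENCES + NUM_SPEAKERS * SPEAKER_SENTENCES


if __name__ == "__main__":
    print("utterances per speaker:", utterances_per_speaker())
    print("total utterances:", total_utterances())
    print("distinct sentences (under the assumption):", distinct_sentences())
```

The total of 50 x (2 + 8) = 500 utterances matches the stated database size. If some per-speaker sentences were in fact shared, the number of distinct sentences would be lower than this sketch reports.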

The database was used to study the prosodic characteristics of Hindi vowels. Indian languages have five pairs of vowels; in each pair, one member is longer in duration than the other. It was observed that native speakers of Hindi appear to rely more on duration than non-native speakers do when contrasting the two vowels of a pair.

