Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Recognition of Continuous Persian Speech Using a Medium-Sized Vocabulary Speech Corpus

S. M. Ahadi

Electrical Eng. Dept., Amirkabir University, Tehran, Iran

Speech recognition in Persian (Farsi) has recently been addressed by a few native-speaking researchers, and some approaches to isolated word and phoneme recognition have been reported. A main bottleneck in this research field is the lack of a recognition-specific speech corpus. In this work, a phonetically balanced speech database of Persian has been modified and used for continuous speech recognition. A basic HMM-based continuous speech recognizer has been designed for this language and recognition tests have been performed. Using mixture-Gaussian monophone models, a word recognition rate of about 68% was obtained in no-grammar tests, while word-pair grammar tests increased this rate to an unexpectedly high value of 99.5%. The reason is found to be the low grammar perplexity of the database, which makes it unsuitable for recognition applications. This highlights the need for a Persian speech corpus specifically designed for such tasks.
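
To illustrate why a low-perplexity word-pair grammar can inflate recognition rates, the following minimal sketch builds a word-pair grammar from a few sentences and estimates its perplexity. Everything here is an assumption made for illustration: the toy sentences are invented (they are not from the Persian database), the function names are hypothetical, and perplexity is computed by treating every allowed successor of a word as equally likely, which may differ from how the paper measured it.

```python
import math
from collections import defaultdict

# Hypothetical sentence-boundary markers used only in this sketch.
START, END = "<s>", "</s>"


def build_word_pair_grammar(sentences):
    """Map each word to the set of words observed to follow it."""
    successors = defaultdict(set)
    for sentence in sentences:
        tokens = [START] + sentence.split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            successors[prev].add(nxt)
    return dict(successors)


def word_pair_perplexity(successors, sentences):
    """Perplexity assuming a uniform choice among the allowed successors."""
    log_sum = 0.0
    count = 0
    for sentence in sentences:
        tokens = [START] + sentence.split() + [END]
        for prev, _ in zip(tokens, tokens[1:]):
            # Each transition costs log(number of words allowed after prev).
            log_sum += math.log(len(successors[prev]))
            count += 1
    return math.exp(log_sum / count)


# Invented toy corpus: a small vocabulary with few distinct word pairs.
toy_sentences = ["a b c", "a b d", "a c d"]
grammar = build_word_pair_grammar(toy_sentences)
perplexity = word_pair_perplexity(grammar, toy_sentences)
vocabulary_size = len({w for s in toy_sentences for w in s.split()})
print(f"word-pair perplexity: {perplexity:.3f}, vocabulary size: {vocabulary_size}")
```

In this toy case the grammar leaves fewer than two choices per word on average, against four vocabulary words without a grammar. When a corpus contains few distinct word pairs, the recognizer is similarly constrained, which is consistent with the large gap between the no-grammar and word-pair results reported above.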


Bibliographic reference.  Ahadi, S. M. (1999): "Recognition of continuous Persian speech using a medium-sized vocabulary speech corpus", in EUROSPEECH'99, 863-866.