9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Mandarin Connected Digits Recognition for Whispered Speech

Tingting Ru, Xiang Xie, Hui Yin, Jingming Kuang

Beijing Institute of Technology, China

This paper discusses the acoustic characteristics and recognition of whispered speech. A Mandarin digit database is built in both normal and whispered speech. The collected speech is analyzed to verify the characteristics of the two kinds of speech and the differences between them. Cross recognition is carried out using normal and whispered speech as training and testing data respectively, and the detailed recognition results are analyzed using confusion matrices. The results show that models trained on normal speech are not suitable for recognizing whispered speech, and that the word correct rate for whispered speech is closely related to its acoustic characteristics. Some possible solutions are also suggested.
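
As a rough illustration of the confusion-matrix analysis and word correct rate mentioned in the abstract, the sketch below tallies recognized digits against reference digits and reads the correct rate off the diagonal. It assumes the reference and recognized digit strings have already been aligned into pairs, so only substitutions are counted. The abstract does not describe the paper's alignment or scoring procedure, and the function names, the label set, and the example pairs are hypothetical.

```python
from collections import Counter

# Assumed label set: the ten digits as strings "0".."9".
DIGITS = [str(d) for d in range(10)]


def confusion_matrix(pairs, labels=DIGITS):
    """Count how often each reference digit (row) was recognized as each digit (column)."""
    counts = Counter(pairs)
    return [[counts[(ref, hyp)] for hyp in labels] for ref in labels]


def word_correct_rate(matrix):
    """Fraction of reference words recognized correctly: diagonal sum over total."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical aligned (reference, recognized) pairs; not data from the paper.
example_pairs = [("1", "1"), ("1", "7"), ("7", "7"), ("2", "2")]
example_matrix = confusion_matrix(example_pairs)
example_rate = word_correct_rate(example_matrix)
```

Off-diagonal cells show which digits are confused with which. In a cross-recognition setting, one such matrix would be built for each pairing of training and testing condition.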

Bibliographic reference. Ru, Tingting / Xie, Xiang / Yin, Hui / Kuang, Jingming (2008): "Mandarin connected digits recognition for whispered speech", in INTERSPEECH-2008, 1141-1144.