This paper discusses the acoustic characteristics and recognition of whispered speech. A Mandarin digits database is built in both normal and whispered speech. The collected speech material is analyzed to verify the characteristics of the two kinds of speech and the differences between them. Cross recognition is carried out using normal and whispered speech as training data and testing data respectively, and the detailed recognition results are analyzed using confusion matrices. The results show that models trained on normal speech are not suitable for recognizing whispered speech, and that the word correct rate for whispered speech is closely related to its acoustic characteristics. Some possible solutions are also suggested.
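As an illustration of the confusion-matrix analysis mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes that reference and recognized digits have already been aligned one-to-one, so insertions and deletions are ignored and the correct rate reduces to the fraction of matching pairs. The function names and the toy label sequences are hypothetical and do not come from the paper.

```python
from collections import Counter

# Assumption: the vocabulary is the ten digits 0-9.
DIGITS = list(range(10))


def confusion_matrix(reference, hypothesis, labels=DIGITS):
    """Count (reference, hypothesis) pairs; rows are reference digits."""
    if len(reference) != len(hypothesis):
        raise ValueError("sequences must be aligned and of equal length")
    counts = Counter(zip(reference, hypothesis))
    return [[counts[(r, h)] for h in labels] for r in labels]


def correct_rate(matrix):
    """Fraction of pairs on the diagonal (substitution errors only)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_label_correct_rate(matrix, labels=DIGITS):
    """Correct rate for each reference digit that occurs at least once."""
    rates = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        if row_total:
            rates[label] = matrix[i][i] / row_total
    return rates


# Hypothetical toy data, not taken from the paper's experiments.
reference = [1, 1, 7, 7, 2, 2, 8, 8]
hypothesis = [1, 7, 7, 7, 2, 8, 8, 8]

matrix = confusion_matrix(reference, hypothesis)
overall = correct_rate(matrix)
per_digit = per_label_correct_rate(matrix)
```

Off-diagonal cells such as `matrix[1][7]` show which digit pairs are confused, which is the kind of detail that can then be compared against the acoustic differences between the two speech modes.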
Bibliographic reference. Ru, Tingting / Xie, Xiang / Yin, Hui / Kuang, Jingming (2008): "Mandarin connected digits recognition for whispered speech", In INTERSPEECH-2008, 1141-1144.