Interspeech'2005 - Eurospeech
Speakers with velopharyngeal incompetence produce hypernasal speech across voiced sounds. Acoustic analysis of hypernasal speech and of nasalized vowels produced by normal speakers has revealed that an additional formant is introduced in the low-frequency region, close to the first formant, for the phonations /a/, /i/, and /u/. Based on this observation, the current study focuses on the low-frequency region alone by low-pass filtering the speech signal. From each frame of the speech signal, a three-dimensional feature vector is extracted, comprising the locations of the two highest peaks in the group delay spectrum and the ratio of the group delay values at these two frequencies. An accumulated minimum distance classifier and a maximum likelihood classifier are trained separately for each phonation and tested on the task of distinguishing normal from hypernasal speakers. The data consist of the phonations /a/, /i/, and /u/ uttered by 45 speakers with cleft palate, who are expected to produce hypernasal speech, and by 26 normal speakers. Results show that hypernasality can be detected with 85% accuracy using statistical classifiers based on the proposed three-dimensional feature vector.
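The per-frame feature extraction described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 800 Hz cutoff, and the FFT size are hypothetical; the low-pass filter is approximated by simply discarding spectral bins above the cutoff; the plain (unsmoothed) group delay function is used; and "two highest peaks" is read as the two largest local maxima, ordered by frequency.

```python
import numpy as np

def group_delay(frame, n_fft=512):
    """Group delay spectrum (in samples), computed without phase unwrapping.

    Uses the identity tau(w) = (Xr*Yr + Xi*Yi) / |X|^2,
    where X = DFT{x[n]} and Y = DFT{n * x[n]}.
    """
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)
    Y = np.fft.rfft(n * frame, n_fft)
    power = np.abs(X) ** 2
    # Floor the denominator to avoid division by zero at spectral nulls.
    return (X.real * Y.real + X.imag * Y.imag) / np.maximum(power, 1e-12)

def frame_features(frame, fs, cutoff_hz=800.0, n_fft=512):
    """Return [f1, f2, ratio] for one frame, or None if fewer than two peaks.

    cutoff_hz is an assumed value; the abstract does not state the cutoff.
    Keeping only bins below the cutoff stands in for the low-pass filter.
    """
    tau = group_delay(frame, n_fft)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    keep = freqs <= cutoff_hz
    tau, freqs = tau[keep], freqs[keep]

    # Interior local maxima of the group delay spectrum.
    peaks = np.where((tau[1:-1] > tau[:-2]) & (tau[1:-1] >= tau[2:]))[0] + 1
    if len(peaks) < 2:
        return None

    # Two largest peaks, then ordered by increasing frequency.
    top2 = np.sort(peaks[np.argsort(tau[peaks])[-2:]])
    ratio = tau[top2[0]] / tau[top2[1]]
    return np.array([freqs[top2[0]], freqs[top2[1]], ratio])
```

Calling `frame_features(frame, fs)` on each frame of a phonation would yield one three-dimensional vector per frame, which could then be passed to whichever classifier is being trained. On real speech the raw group delay function tends to be spiky, so some form of windowing or smoothing would likely be needed in practice.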
Bibliographic reference. Vijayalakshmi, P. / RamasubbaReddy, M. (2005): "Detection of hypernasality using statistical pattern classifiers", In INTERSPEECH-2005, 701-704.