9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Landmark Based Recognition of Stops: Acoustic Attributes versus Smoothed Spectra

Veena Karjigi, Preeti Rao

IIT Bombay, India

Landmark-based recognition of unvoiced word-initial stops is investigated. The relative effectiveness of acoustic-phonetic attributes versus more global spectral shape features is evaluated experimentally for four-way place classification of unvoiced, unaspirated stops. Various feature sets derived from the burst and vocalic transition regions of word-initial consonants are compared via GMM-based classification under speaker, gender, and vowel-context variability. While a set of acoustic attributes derived from the burst shows the best invariance to vowel context, global spectral shape features provide the most robust representation of the vocalic transition region because they avoid the errors of explicit formant tracking. Combining features from the burst and vocalic regions was superior to using burst-only cues, but performance remained far from the near-perfect identification achieved in human perception.
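
As an illustration of the GMM-based classification mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes one diagonal-covariance Gaussian mixture per place class, trained with a few EM iterations, and assigns a test vector to the class whose mixture gives the highest log-likelihood (equal class priors assumed). The class labels (`place_A` to `place_D`), the two-dimensional synthetic features, and all names such as `DiagGMM`, `train_classifiers`, and `classify` are hypothetical; the paper's actual features, mixture sizes, and training setup are not given in this abstract.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


class DiagGMM:
    """Small diagonal-covariance GMM trained with EM (illustrative only)."""

    def __init__(self, n_components=2, n_iter=15, var_floor=1e-3, seed=0):
        self.k = n_components
        self.n_iter = n_iter
        self.var_floor = var_floor  # keeps variances away from zero
        self.seed = seed

    def _component_logps(self, x):
        return [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(self.weights, self.means, self.vars)
        ]

    def fit(self, X):
        n, d = len(X), len(X[0])
        rng = random.Random(self.seed)
        # Initialise means from random training points, variances globally.
        self.means = [list(x) for x in rng.sample(X, self.k)]
        g_mean = [sum(x[j] for x in X) / n for j in range(d)]
        g_var = [
            max(sum((x[j] - g_mean[j]) ** 2 for x in X) / n, self.var_floor)
            for j in range(d)
        ]
        self.vars = [list(g_var) for _ in range(self.k)]
        self.weights = [1.0 / self.k] * self.k
        for _ in range(self.n_iter):
            # E-step: posterior responsibility of each component per point.
            resp = []
            for x in X:
                logps = self._component_logps(x)
                norm = logsumexp(logps)
                resp.append([math.exp(lp - norm) for lp in logps])
            # M-step: re-estimate weights, means, and variances.
            for c in range(self.k):
                n_c = sum(r[c] for r in resp) + 1e-10
                mean_c = [
                    sum(r[c] * x[j] for r, x in zip(resp, X)) / n_c
                    for j in range(d)
                ]
                self.weights[c] = n_c / n
                self.means[c] = mean_c
                self.vars[c] = [
                    max(
                        sum(r[c] * (x[j] - mean_c[j]) ** 2
                            for r, x in zip(resp, X)) / n_c,
                        self.var_floor,
                    )
                    for j in range(d)
                ]
        return self

    def log_likelihood(self, x):
        return logsumexp(self._component_logps(x))


def train_classifiers(data_by_class):
    """Fit one GMM per class label."""
    return {label: DiagGMM().fit(X) for label, X in data_by_class.items()}


def classify(models, x):
    """Return the label whose GMM scores x highest (equal priors assumed)."""
    return max(models, key=lambda label: models[label].log_likelihood(x))


def make_toy_data(seed=1, n_per_class=40):
    """Synthetic 2-D stand-ins for per-class feature vectors (hypothetical)."""
    rng = random.Random(seed)
    centers = {
        "place_A": (0.0, 0.0),
        "place_B": (5.0, 0.0),
        "place_C": (0.0, 5.0),
        "place_D": (5.0, 5.0),
    }
    return {
        label: [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
                for _ in range(n_per_class)]
        for label, (cx, cy) in centers.items()
    }


models = train_classifiers(make_toy_data())
print(classify(models, [4.8, 0.3]))
```

In a real system the toy vectors would be replaced by features measured around the detected landmark, for example burst attributes or smoothed spectral shape coefficients from the vocalic transition, and the number of mixture components would be tuned on held-out data.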

Bibliographic reference. Karjigi, Veena / Rao, Preeti (2008): "Landmark based recognition of stops: acoustic attributes versus smoothed spectra", in INTERSPEECH-2008, 1550-1553.