7th International Conference on Spoken Language Processing
September 16-20, 2002
This paper discusses the relevance of the non-uniform frequency resolution used by current speech analysis methods such as Mel frequency analysis and perceptual linear predictive (PLP) analysis. It is shown that linear discriminant analysis of the short-time Fourier spectrum of speech yields spectral basis functions that provide comparatively lower resolution in the high-frequency region of the spectrum. This is consistent with critical-band resolution and is shown to be caused by the spectral properties of vowel sounds. Further, we show that this non-uniform resolution can be traced to the physiology of the speech production mechanism. In ASR experiments, features extracted by the discriminant functions are shown to outperform the conventional features derived with cosine basis functions.
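As an illustration of what critical-band (Bark) resolution means in practice, the following minimal sketch prints the edges of equal-width Bark bands in Hz. It assumes the Bark warping commonly quoted for PLP analysis, z = 6 asinh(f / 600); that formula, the 4 kHz bandwidth, the 15 bands, and all function names are assumptions made for this illustration and are not taken from the abstract. It does not reproduce the discriminant analysis described above; it only shows the non-uniform resolution that the derived basis functions are reported to resemble.

```python
import math


def hz_to_bark(f_hz):
    """Warp a frequency in Hz to Bark using z = 6 * asinh(f / 600) (assumed formula)."""
    return 6.0 * math.asinh(f_hz / 600.0)


def bark_to_hz(z):
    """Inverse of hz_to_bark."""
    return 600.0 * math.sinh(z / 6.0)


def band_edges_hz(f_max_hz, n_bands):
    """Edges (in Hz) of n_bands bands that are equally wide on the Bark axis."""
    z_max = hz_to_bark(f_max_hz)
    return [bark_to_hz(z_max * i / n_bands) for i in range(n_bands + 1)]


# Hypothetical settings: 4 kHz analysis bandwidth split into 15 bands.
edges = band_edges_hz(4000.0, 15)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]

for lo, hi in zip(edges, edges[1:]):
    print(f"{lo:7.1f} - {hi:7.1f} Hz  (width {hi - lo:6.1f} Hz)")
```

Each band spans the same distance on the Bark axis, so the widths in Hz grow steadily with frequency: fine resolution at low frequencies and coarser resolution at high frequencies.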
Bibliographic reference. Malayath, Naren / Hermansky, Hynek (2002): "Bark resolution from speech data", In ICSLP-2002, 2169-2172.