7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Bark Resolution from Speech Data

Naren Malayath (1), Hynek Hermansky (2)

(1) Qualcomm Inc., USA; (2) Oregon Health & Science University, USA

This paper discusses the relevance of the non-uniform frequency resolution used by current speech analysis methods such as Mel-frequency analysis and perceptual linear predictive (PLP) analysis. Linear discriminant analysis of the short-time Fourier spectrum of speech is shown to yield spectral basis functions that provide comparatively lower resolution in the high-frequency region of the spectrum. This is consistent with critical-band resolution and is shown to be caused by the spectral properties of vowel sounds. Further, we show that this non-uniform resolution can be traced to the physiology of the speech production mechanism. In ASR experiments, features extracted by the discriminant functions are shown to outperform conventional features derived with cosine basis functions.
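
As an illustration of the kind of analysis the abstract describes, the following is a minimal sketch of linear discriminant analysis applied to spectral vectors. It is not the authors' implementation: the function name `lda_basis`, the synthetic random "spectra", the class count, and the number of frequency bins are all assumptions made for this example. In practice the rows of `X` would be short-time Fourier spectra of speech frames and `y` would hold phonetic class labels.

```python
import numpy as np

def lda_basis(X, y, n_components):
    """Hypothetical helper: LDA discriminant vectors for spectral frames.

    X: (n_frames, n_bins) array of spectral vectors.
    y: (n_frames,) array of integer class labels.
    Returns (basis, ratios): basis has shape (n_bins, n_components);
    ratios are the between/within variance ratios, largest first.
    """
    n_bins = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((n_bins, n_bins))  # within-class scatter
    Sb = np.zeros((n_bins, n_bins))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        d = Xc - mc
        Sw += d.T @ d
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)

    # Whiten the within-class scatter, then diagonalize the
    # between-class scatter in the whitened space.
    w_vals, w_vecs = np.linalg.eigh(Sw)
    w_vals = np.maximum(w_vals, 1e-10)   # guard against tiny eigenvalues
    W = w_vecs / np.sqrt(w_vals)         # scales each eigenvector column
    b_vals, b_vecs = np.linalg.eigh(W.T @ Sb @ W)
    order = np.argsort(b_vals)[::-1][:n_components]
    return W @ b_vecs[:, order], b_vals[order]

# Synthetic stand-in data (assumption): 5 classes, 32 frequency bins.
rng = np.random.default_rng(0)
n_classes, n_per_class, n_bins = 5, 200, 32
class_means = rng.normal(size=(n_classes, n_bins))
X = np.vstack([m + rng.normal(size=(n_per_class, n_bins)) for m in class_means])
y = np.repeat(np.arange(n_classes), n_per_class)

basis, ratios = lda_basis(X, y, n_components=4)
print(basis.shape)
```

Each column of `basis` plays the role of a spectral basis function: projecting a spectrum onto it gives one discriminant feature. With real speech data, plotting the columns against frequency would be one way to inspect how finely each frequency region is resolved.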



Bibliographic reference.  Malayath, Naren / Hermansky, Hynek (2002): "Bark resolution from speech data", In ICSLP-2002, 2169-2172.