INTERSPEECH 2012
13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Analysis of Temporal Resolution in Frequency Domain Linear Prediction

Sriram Ganapathy (1), Hynek Hermansky (1,2)

(1) Center for Language and Speech Processing; (2) Human Language Technology Center of Excellence;
Johns Hopkins University, Baltimore, MD, USA

Frequency domain linear prediction (FDLP) is a technique for autoregressive (AR) modeling of the Hilbert envelope of a signal. The model is derived by applying linear prediction to the discrete cosine transform (DCT) of the signal. We analyze the resolution properties of the FDLP model using synthetic signals with peaks that are closely spaced in time. The temporal resolution of the FDLP model is defined as the inverse of a critical time span: the duration between two temporal peaks in the signal below which the resulting peaks of the AR model cannot be resolved. We study several factors that affect this resolution, namely the location of the input peaks within the analysis segment, the type of window applied in the DCT of the signal, and the order of the FDLP model. The results of this analysis suggest ways to improve the performance of FDLP analysis in phoneme recognition of speech. The improved FDLP features outperform MFCC features in both clean and noisy conditions.
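
The derivation described above (linear prediction applied to the DCT of the signal, yielding an AR model of the temporal envelope) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `dct2` and `fdlp_envelope`, the autocorrelation (Yule-Walker) solution, the small diagonal `loading` term, and the Gaussian test pulses are all assumptions made for illustration only.

```python
import numpy as np

def dct2(x):
    """Unnormalized DCT-II from its definition (O(N^2); fine for short segments)."""
    n = len(x)
    k = np.arange(n)[:, None]   # DCT coefficient index
    t = np.arange(n)[None, :]   # time index
    return np.cos(np.pi * (t + 0.5) * k / n) @ x

def fdlp_envelope(x, order, loading=1e-6):
    """Hypothetical helper: AR estimate of the temporal envelope of x.

    Linear prediction is run on the DCT sequence, so the AR 'spectrum'
    is a function of time rather than frequency.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    c = dct2(x)
    # Biased autocorrelation of the DCT sequence, lags 0..order.
    r = np.array([c[: n - lag] @ c[lag:] for lag in range(order + 1)])
    # Assumption: slight diagonal loading to keep the system well conditioned.
    r[0] *= 1.0 + loading
    # Yule-Walker normal equations with a Toeplitz autocorrelation matrix.
    idx = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    a = np.linalg.solve(r[idx], r[1:])
    gain = r[0] - a @ r[1:]     # prediction error power
    # Time sample t corresponds to the angle pi * (t + 0.5) / n in the AR model.
    omega = np.pi * (np.arange(n) + 0.5) / n
    lags = np.arange(1, order + 1)
    A = 1.0 - np.exp(-1j * np.outer(omega, lags)) @ a
    return gain / np.abs(A) ** 2

# Synthetic test signal (an assumption): two narrow pulses, well separated in time.
n = 400
t = np.arange(n)
x = np.exp(-0.5 * ((t - 100) / 2.0) ** 2) + np.exp(-0.5 * ((t - 300) / 2.0) ** 2)
env = fdlp_envelope(x, order=20)
first = int(np.argmax(env[:200]))
second = 200 + int(np.argmax(env[200:]))
print("envelope peaks near samples:", first, second)
```

Moving the two pulses closer together, or lowering `order`, gives a simple way to probe when the two envelope peaks merge, which is the kind of critical time span the abstract refers to. The sketch makes no claim about the specific values reported in the paper.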

Index Terms: Frequency Domain Linear Prediction, Resolution Analysis, Feature Extraction, Phoneme Recognition

Bibliographic reference. Ganapathy, Sriram / Hermansky, Hynek (2012): "Analysis of temporal resolution in frequency domain linear prediction", in INTERSPEECH-2012, 1828-1831.