INTERSPEECH 2009
10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Discriminant Spectrotemporal Features for Phoneme Recognition

Nima Mesgarani, G. S. V. S. Sivaram, Sridhar Krishna Nemala, Mounya Elhilali, Hynek Hermansky

Johns Hopkins University, USA

We propose discriminant methods for deriving two-dimensional spectrotemporal features for phoneme recognition; the features are estimated to maximize the separation between the representations of phoneme classes. Because the filters are linear, they can be interpreted intuitively, which enables us to investigate the working principles of the system and to improve its performance by locating the sources of error. Two methods for estimating the filters are proposed: Regularized Least Square (RLS) and Modified Linear Discriminant Analysis (MLDA). Both methods yield comparable improvements over the baseline condition, demonstrating the advantage of the discriminant spectrotemporal filters.
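
The abstract does not give the exact formulation, so the following is only a minimal sketch of how linear discriminant filters could be estimated with a generic regularized least-squares (ridge) fit. It assumes each training example is a flattened spectrotemporal patch and that the targets are one-hot phoneme labels. The function name `rls_filters`, the regularization weight `lam`, the patch size, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def rls_filters(X, labels, n_classes, lam=1.0):
    """Generic ridge regression onto one-hot targets (an assumed formulation).

    X      : (n_examples, n_freq * n_time) flattened spectrotemporal patches
    labels : (n_examples,) integer class indices
    Returns W with shape (n_freq * n_time, n_classes); each column is one
    linear filter that can be reshaped back to a 2-D patch for inspection.
    """
    d = X.shape[1]
    Y = np.eye(n_classes)[labels]            # one-hot target matrix
    A = X.T @ X + lam * np.eye(d)            # regularized Gram matrix
    return np.linalg.solve(A, X.T @ Y)       # W = (X^T X + lam I)^{-1} X^T Y

# Synthetic stand-in for real data: one random mean patch per class plus noise.
rng = np.random.default_rng(0)
n_freq, n_time, n_classes, n_examples = 8, 5, 3, 600
d = n_freq * n_time
class_means = rng.normal(size=(n_classes, d))
labels = rng.integers(0, n_classes, size=n_examples)
X = class_means[labels] + 0.5 * rng.normal(size=(n_examples, d))

W = rls_filters(X, labels, n_classes, lam=1.0)

# Because the filters are linear, each one can be viewed as a 2-D
# frequency-by-time pattern.
filters = W.T.reshape(n_classes, n_freq, n_time)

# Classify by picking the filter with the largest response.
pred = np.argmax(X @ W, axis=1)
accuracy = float(np.mean(pred == labels))
```

Reshaping each column of `W` back to a frequency-by-time patch is what makes linear filters of this kind easy to inspect. The training accuracy on this toy data says nothing about the results reported in the paper.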

Bibliographic reference.  Mesgarani, Nima / Sivaram, G. S. V. S. / Nemala, Sridhar Krishna / Elhilali, Mounya / Hermansky, Hynek (2009): "Discriminant spectrotemporal features for phoneme recognition", In INTERSPEECH-2009, 2983-2986.