Discriminant spectrotemporal features for phoneme recognition

Nima Mesgarani, G. S. V. S. Sivaram, Sridhar Krishna Nemala, Mounya Elhilali, Hynek Hermansky

We propose discriminant methods for deriving two-dimensional spectrotemporal features for phoneme recognition; the features are estimated to maximize the separation between the representations of phoneme classes. Because the filters are linear, they can be interpreted intuitively, which lets us investigate the working principles of the system and improve its performance by locating the sources of error. Two methods for estimating the filters are proposed: Regularized Least Square (RLS) and Modified Linear Discriminant Analysis (MLDA). Both methods achieve comparable improvements over the baseline condition, demonstrating the advantage of the discriminant spectrotemporal filters.
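To make the idea of a linear discriminant filter concrete, the following is a minimal sketch of a generic regularized least-squares fit, assuming flattened spectrotemporal patches as inputs and one-hot phoneme labels as targets. The abstract does not give the paper's exact formulation, so the function names (`fit_rls_filters`, `make_toy_patches`), the patch dimensions, the regularization weight `lam`, and the synthetic data are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def fit_rls_filters(X, labels, n_classes, lam=1.0):
    """Fit one linear filter per class by ridge-regularized least squares.

    X      : (n_samples, n_features) flattened spectrotemporal patches
    labels : (n_samples,) integer class indices
    Returns W of shape (n_features, n_classes); column k is the filter
    for class k. This is a generic formulation (an assumption), not
    necessarily the one used in the paper.
    """
    n_features = X.shape[1]
    T = np.eye(n_classes)[labels]  # one-hot target matrix
    A = X.T @ X + lam * np.eye(n_features)
    # Solve (X^T X + lam * I) W = X^T T for W.
    return np.linalg.solve(A, X.T @ T)


def make_toy_patches(n_per_class=200, n_freq=8, n_frames=5, n_classes=3, seed=0):
    """Generate synthetic 'patches' (hypothetical data for illustration only)."""
    rng = np.random.default_rng(seed)
    dim = n_freq * n_frames
    means = rng.normal(size=(n_classes, dim))
    X = np.concatenate(
        [m + rng.normal(size=(n_per_class, dim)) for m in means]
    )
    labels = np.repeat(np.arange(n_classes), n_per_class)
    return X, labels


if __name__ == "__main__":
    X, labels = make_toy_patches()
    W = fit_rls_filters(X, labels, n_classes=3, lam=1.0)
    # Each column can be reshaped into a 2-D (frequency x time) filter
    # and inspected directly, which is what linearity makes possible.
    filter_0 = W[:, 0].reshape(8, 5)
    predictions = np.argmax(X @ W, axis=1)
    print("filter shape:", filter_0.shape)
    print("training accuracy:", np.mean(predictions == labels))
```

Reshaping a column of `W` back into a frequency-by-time grid gives a filter that can be plotted and examined, which illustrates the interpretability argument made above.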

Cite as: Mesgarani, N., Sivaram, G.S.V.S., Nemala, S.K., Elhilali, M., Hermansky, H. (2009) Discriminant spectrotemporal features for phoneme recognition. Proc. Interspeech 2009, 2983-2986, doi: 10.21437/Interspeech.2009-755

