Recognizing aspects of articulation from audio recordings of speech is an important problem, either as an end in itself or as part of an articulatory approach to automatic speech recognition. In this paper we study the frame-level classification of a set of articulatory features (AFs) inspired by the vocal tract variables of articulatory phonology. We compare k-nearest-neighbor (k-NN) classifiers and multilayer perceptrons (MLPs), using different acoustic feature vectors, and classify the AFs either independently or jointly. We also consider using the MLP outputs for all of the AFs as inputs to k-NN classifiers for the individual AFs, effectively using the MLPs as a form of nonlinear dimensionality reduction and allowing the decision for each AF to draw on the MLP outputs for the other AFs. We find that MLPs outperform k-NN classifiers, while k-NN classifiers using MLP outputs outperform both.
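The combined approach described above (k-NN classifiers run on the MLP outputs of all AFs) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the per-frame MLP posteriors have already been computed, and the function names `stack_posteriors` and `knn_predict`, the two placeholder features `af_a` and `af_b`, the toy posterior values, the Euclidean distance, and the choice k = 3 are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import dist


def stack_posteriors(per_af_posteriors):
    """Concatenate, frame by frame, the posterior vectors from every AF's MLP.

    per_af_posteriors: list over AFs; each entry is a list over frames of
    that AF's posterior vector for the frame.
    """
    n_frames = len(per_af_posteriors[0])
    return [
        [p for af in per_af_posteriors for p in af[t]]
        for t in range(n_frames)
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training frames nearest to the query."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: dist(train_x[i], query)
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy MLP posteriors for four training frames (assumed values, not real data).
af_a = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.7, 0.1]]
af_b = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]

# Each frame is now a 5-dimensional vector: 3 posteriors from af_a, 2 from af_b.
train_x = stack_posteriors([af_a, af_b])

# Hypothetical frame-level labels for af_a only; each AF would get its own
# k-NN classifier over the same stacked vectors.
labels_af_a = [0, 0, 1, 1]

query = [0.75, 0.15, 0.10, 0.85, 0.15]
predicted = knn_predict(train_x, labels_af_a, query, k=3)
```

Because the stacked vector contains the posteriors for every AF, the neighbor search for one AF is influenced by what the MLPs report for the others, which is the cross-feature dependence the abstract refers to.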
Bibliographic reference. Næss, Arild Brandrud / Livescu, Karen / Prabhavalkar, Rohit (2011): "Articulatory feature classification using nearest neighbors", In INTERSPEECH-2011, 2301-2304.