12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Articulatory Feature Classification Using Nearest Neighbors

Arild Brandrud Næss (1), Karen Livescu (2), Rohit Prabhavalkar (3)

(1) NTNU, Norway
(2) Toyota Technological Institute at Chicago, USA
(3) Ohio State University, USA

Recognizing aspects of articulation from audio recordings of speech is an important problem, either as an end in itself or as part of an articulatory approach to automatic speech recognition. In this paper we study the frame-level classification of a set of articulatory features (AFs) inspired by the vocal tract variables of articulatory phonology. We compare k nearest neighbor (k-NN) classifiers and multilayer perceptrons (MLPs), using different acoustic feature vectors, and classify the AFs either independently or jointly. We also consider using the MLP outputs for all of the AFs as inputs to k-NN classifiers for the individual AFs, effectively using the MLPs as a form of nonlinear dimensionality reduction and allowing the decision for each AF to be based on the MLPs for the other AFs. We find that MLPs outperform k-NN classifiers, while k-NN classifiers using MLP outputs outperform both.
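The last configuration described above, k-NN classifiers for individual AFs that take the MLP outputs for all AFs as input, can be illustrated with a minimal sketch. It is not the authors' implementation. The posterior values, the two feature names ("lip" and "tongue"), the class labels, the Euclidean distance, and the choice k=3 are all assumptions made for illustration, and the MLPs themselves are replaced by hand-written output vectors.

```python
from collections import Counter
import math


def concat_posteriors(per_af_posteriors):
    """Concatenate one frame's MLP posterior vectors (one per AF) into a single k-NN input."""
    return [p for posterior in per_af_posteriors for p in posterior]


def knn_classify(train_vectors, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors (Euclidean)."""
    order = sorted(range(len(train_vectors)),
                   key=lambda i: math.dist(train_vectors[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy frames (made-up numbers): each frame holds the outputs of two hypothetical
# per-AF MLPs, a 2-class "lip" feature and a 3-class "tongue" feature.
train_frames = [
    [[0.9, 0.1], [0.8, 0.1, 0.1]],
    [[0.8, 0.2], [0.7, 0.2, 0.1]],
    [[0.2, 0.8], [0.1, 0.1, 0.8]],
    [[0.1, 0.9], [0.2, 0.1, 0.7]],
]
train_lip_labels = ["closed", "closed", "open", "open"]

train_vectors = [concat_posteriors(frame) for frame in train_frames]

# The "lip" MLP is undecided for this frame (0.5 / 0.5); the "tongue" posteriors
# still place it near the "open" training frames.
query = concat_posteriors([[0.5, 0.5], [0.1, 0.2, 0.7]])
predicted = knn_classify(train_vectors, train_lip_labels, query, k=3)
print(predicted)
```

In this toy case the k-NN decision for one feature draws on the MLP output for the other, which is the dependence across AFs that the abstract refers to.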

Bibliographic reference.  Næss, Arild Brandrud / Livescu, Karen / Prabhavalkar, Rohit (2011): "Articulatory feature classification using nearest neighbors", In INTERSPEECH-2011, 2301-2304.