11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Data-Driven Analysis of Realtime Vocal Tract MRI Using Correlated Image Regions

Adam C. Lammert, Michael I. Proctor, Shrikanth S. Narayanan

University of Southern California, USA

Realtime MRI provides useful data about the human vocal tract, but also introduces many of the challenges of processing high-dimensional image data. Intuitively, data reduction would proceed by finding the air-tissue boundaries in the images and tracing an outline of the vocal tract. This approach is anatomically well-founded. We explore an alternative approach that is data-driven and has a complementary set of advantages. Our method directly examines pixel intensities. By analyzing how the pixels co-vary over time, we segment the image into spatially localized regions in which the pixels are highly correlated with each other. Intensity variations in these correlated regions correspond to vocal tract constrictions, which are meaningful units of speech production. We show how these regions can be extracted entirely automatically, or with manual guidance. We present two examples and discuss the method's merits, including the opportunity to do direct data-driven time series modeling.
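
To make the idea of correlated image regions concrete, the following is a minimal sketch and not the authors' implementation. It assumes a manually chosen seed pixel and a fixed correlation threshold, both of which are illustrative choices that the abstract does not specify. The function names (`correlated_region`, `region_time_series`) and the synthetic data are hypothetical. The sketch also does not enforce spatial contiguity, which a fuller treatment would need in order to obtain spatially localized regions.

```python
import numpy as np

def correlated_region(frames, seed, threshold=0.7):
    """Mask the pixels whose intensity time series correlates with the seed pixel.

    frames: array of shape (T, H, W), one image per time step.
    seed: (row, col) of a manually chosen pixel (an assumption of this sketch).
    threshold: minimum Pearson correlation for membership (illustrative value).
    Returns (mask, corr), each of shape (H, W).
    """
    T, H, W = frames.shape
    X = frames.reshape(T, H * W).astype(float)
    X = X - X.mean(axis=0)            # center each pixel's time series
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = np.inf        # constant pixels end up with correlation 0
    X = X / norms                     # unit-norm columns, so dot products are correlations
    seed_series = X[:, seed[0] * W + seed[1]]
    corr = seed_series @ X            # correlation of every pixel with the seed
    return (corr >= threshold).reshape(H, W), corr.reshape(H, W)

def region_time_series(frames, mask):
    """Average the intensity inside the region for each frame."""
    return frames[:, mask].mean(axis=1)

# Synthetic stand-in for an image sequence: a 3x3 block of pixels shares one
# slowly varying signal, and every pixel carries independent noise.
rng = np.random.default_rng(0)
T = 200
signal = np.sin(np.linspace(0, 8 * np.pi, T))
frames = 0.1 * rng.standard_normal((T, 8, 8))
frames[:, 2:5, 2:5] += signal[:, None, None]

mask, corr = correlated_region(frames, seed=(3, 3))
series = region_time_series(frames, mask)
```

Under these assumptions, `series` is a one-dimensional signal per region, which is the kind of low-dimensional representation that direct time series modeling could operate on.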

Bibliographic reference. Lammert, Adam C. / Proctor, Michael I. / Narayanan, Shrikanth S. (2010): "Data-driven analysis of realtime vocal tract MRI using correlated image regions", In INTERSPEECH-2010, 1572-1575.