10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Automatic Formant Extraction for Sociolinguistic Analysis of Large Corpora

Keelan Evanini, Stephen Isard, Mark Liberman

University of Pennsylvania, USA

In this paper, we propose a method of formant prediction from pole and bandwidth data, and apply this method to automatically extract F1 and F2 values from a corpus of regional dialect variation in North America that contains 134,000 manual formant measurements. The predicted formants are shown to outperform the default formant values from a popular speech analysis package. Finally, we demonstrate that sociolinguistic analysis based on vowel formant data can be conducted reliably using the automatically predicted values, and we argue that sociolinguists should begin to use this methodology so that larger amounts of data can be analyzed efficiently.


Bibliographic reference. Evanini, Keelan / Isard, Stephen / Liberman, Mark (2009): "Automatic formant extraction for sociolinguistic analysis of large corpora", in INTERSPEECH-2009, 1655-1658.