ISCA Archive Interspeech 2009

Large-scale analysis of formant frequency estimation variability in conversational telephone speech

Nancy F. Chen, Wade Shen, Joseph Campbell, Reva Schwartz

We quantify how the telephone channel and regional dialect influence formant estimates extracted with Wavesurfer [1, 2] from spontaneous conversational speech by over 3,600 native American English speakers. To the best of our knowledge, this is the largest-scale study on this topic. We found that F1 estimates are higher in cellular channels than in landline channels, while F2 generally shows the opposite trend. We also characterized vowel shift trends in the northern states of the U.S.A. and compared them with the Northern city chain shift (NCCS) [3]. Our analysis is useful in forensic applications where it is important to distinguish between speaker, dialect, and channel characteristics.

[1] Snack Sound Toolkit: http://www.speech.kth.se/snack/
[2] Talkin, D., “Speech Formant Trajectory Estimation using Dynamic Programming with Modulated Transition Costs”, J. Acoust. Soc. Am., S1, 1987, pp. S55.
[3] Labov, W., Ash, S., and Boberg, C., “The Atlas of North American English: Phonetics, Phonology, and Sound Change”, Mouton de Gruyter, Berlin, 2006.


doi: 10.21437/Interspeech.2009-627

Cite as: Chen, N.F., Shen, W., Campbell, J., Schwartz, R. (2009) Large-scale analysis of formant frequency estimation variability in conversational telephone speech. Proc. Interspeech 2009, 2203-2206, doi: 10.21437/Interspeech.2009-627

@inproceedings{chen09d_interspeech,
  author={Nancy F. Chen and Wade Shen and Joseph Campbell and Reva Schwartz},
  title={{Large-scale analysis of formant frequency estimation variability in conversational telephone speech}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={2203--2206},
  doi={10.21437/Interspeech.2009-627}
}