ISCA Archive Interspeech 2013

Formant frequency tracking using Gaussian mixtures with maximum a posteriori adaptation

Jonathan C. Kim, Hrishikesh Rao, Mark A. Clements

We present a novel method for estimating formant frequencies by fitting Gaussian mixtures to discrete Fourier transform (DFT) magnitude spectra. The method first estimates the Gaussian parameters for a sequence of wideband spectra using the Expectation-Maximization (EM) algorithm. It then refines the parameters using maximum a posteriori (MAP) adaptation. We evaluated the method on 516 utterances with manually labeled ground truth, comparing the results with both PRAAT's formant tracking algorithm in various noisy environments and one other state-of-the-art method. We obtained statistically significant improvements in the relative errors for the first three formants over all phonetic classes.
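
The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the update equations, so the sketch assumes a one-dimensional mixture fitted to a spectrum treated as a histogram over frequency, and a mean-only MAP update with a relevance factor in the style commonly used for GMM adaptation. The function names (weighted_em_gmm, map_adapt_means), the relevance value, the prior means, and the synthetic spectrum are all hypothetical and chosen only for illustration.

```python
import numpy as np

def weighted_em_gmm(freqs, mags, means, variances, weights, n_iter=50):
    """Fit a 1-D Gaussian mixture to a magnitude spectrum with EM.

    Assumption: the spectrum is normalized and treated as a probability
    mass over frequency bins, so each bin weights the EM statistics.
    """
    p = mags / mags.sum()
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frequency bin.
        diff = freqs[:, None] - means[None, :]
        dens = weights * np.exp(-0.5 * diff**2 / variances) / np.sqrt(2 * np.pi * variances)
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: bin masses scale the sufficient statistics.
        nk = (p[:, None] * resp).sum(axis=0) + 1e-12
        means = (p[:, None] * resp * freqs[:, None]).sum(axis=0) / nk
        diff = freqs[:, None] - means[None, :]
        variances = (p[:, None] * resp * diff**2).sum(axis=0) / nk + 1e-6
        weights = nk / nk.sum()
    return means, variances, weights, nk

def map_adapt_means(prior_means, ml_means, nk, relevance=1.0 / 3.0):
    """Mean-only MAP adaptation (assumed form, not taken from the paper).

    Each adapted mean interpolates between the prior mean and the EM
    estimate; components with more spectral mass trust the data more.
    """
    alpha = nk / (nk + relevance)
    return alpha * ml_means + (1.0 - alpha) * prior_means

# Synthetic "spectrum": three Gaussian bumps standing in for F1 to F3.
freqs = np.linspace(0.0, 4000.0, 401)
true_centers = np.array([500.0, 1500.0, 2500.0])
mags = sum(np.exp(-0.5 * ((freqs - c) / 100.0) ** 2) for c in true_centers)

init_means = np.array([400.0, 1600.0, 2400.0])
init_vars = np.full(3, 200.0**2)
init_weights = np.full(3, 1.0 / 3.0)

means, variances, weights, nk = weighted_em_gmm(
    freqs, mags, init_means, init_vars, init_weights
)

# Hypothetical prior means, e.g. estimates carried over from a previous frame.
prior_means = np.array([450.0, 1450.0, 2450.0])
adapted = map_adapt_means(prior_means, means, nk)
```

In this sketch the component means play the role of formant candidates, and the MAP step pulls each per-frame estimate toward a prior, which is one plausible way to smooth a track across frames. The relevance factor is expressed in units of normalized spectral mass because the spectrum sums to one here; a different normalization would need a different scale.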


doi: 10.21437/Interspeech.2013-714

Cite as: Kim, J.C., Rao, H., Clements, M.A. (2013) Formant frequency tracking using Gaussian mixtures with maximum a posteriori adaptation. Proc. Interspeech 2013, 3221-3225, doi: 10.21437/Interspeech.2013-714

@inproceedings{kim13f_interspeech,
  author={Jonathan C. Kim and Hrishikesh Rao and Mark A. Clements},
  title={{Formant frequency tracking using Gaussian mixtures with maximum a posteriori adaptation}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={3221--3225},
  doi={10.21437/Interspeech.2013-714}
}