ISCA Archive Interspeech 2009

Approximate intrinsic Fourier analysis of speech

Frank Tompkins, Patrick J. Wolfe

Popular parametric models of speech sounds such as the source-filter model provide a fixed means of describing the variability inherent in speech waveform data. However, nonlinear dimensionality reduction techniques such as the intrinsic Fourier analysis method of Jansen and Niyogi provide a more flexible means of adaptively estimating such structure directly from data. Here we employ this approach to learn a low-dimensional manifold whose geometry is meant to reflect the structure implied by the human speech production system. We derive a novel algorithm to efficiently learn this manifold for the case of many training examples, which is the setting of both greatest practical interest and greatest computational difficulty. We then demonstrate the utility of our method by way of a proof-of-concept phoneme identification system that operates effectively in the intrinsic Fourier domain.

doi: 10.21437/Interspeech.2009-28

Cite as: Tompkins, F., Wolfe, P.J. (2009) Approximate intrinsic Fourier analysis of speech. Proc. Interspeech 2009, 120-123, doi: 10.21437/Interspeech.2009-28

  author={Frank Tompkins and Patrick J. Wolfe},
  title={{Approximate intrinsic Fourier analysis of speech}},
  booktitle={Proc. Interspeech 2009},