ISCA Archive Eurospeech 1999

Formant analysis and synthesis using hidden Markov models

Alex Acero

This paper describes a unifying framework for both formant tracking and speech synthesis using hidden Markov models (HMMs). The feature vector in the HMM is composed of the first three formant frequencies, their bandwidths, and their time derivatives (deltas). Speech is synthesized by generating the most likely sequence of feature vectors from an HMM trained on a set of sentences from a given speaker. Higher formant tracking accuracy can be achieved by finding the most likely formant track given a distribution of the formants of every sound. This data-driven formant synthesizer bridges the gap between rule-based formant synthesizers and concatenative synthesizers by synthesizing speech that is both smooth and resembles the speaker in the training data.
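
As an illustration of the feature vector described above, the following is a minimal sketch, not the paper's implementation. It assumes that each frame supplies three formant frequencies and three bandwidths in Hz, and that the deltas are simple first differences between consecutive frames; the abstract does not specify how the deltas are computed. The function name build_features and the sample values are hypothetical.

```python
def build_features(formants, bandwidths):
    """Stack per-frame formants, bandwidths, and their deltas.

    formants, bandwidths: sequences of frames, each with 3 values (Hz).
    Returns one 12-dimensional list per frame: 6 static values + 6 deltas.
    Assumption: delta = first difference with the previous frame
    (zero for the first frame); the paper may use a different estimator.
    """
    static = [list(f) + list(b) for f, b in zip(formants, bandwidths)]
    features = []
    prev = static[0] if static else None
    for frame in static:
        delta = [cur - old for cur, old in zip(frame, prev)]
        features.append(frame + delta)
        prev = frame
    return features


# Hypothetical two-frame example (values are illustrative, not measured data).
formants = [[500.0, 1500.0, 2500.0], [510.0, 1490.0, 2500.0]]
bandwidths = [[60.0, 90.0, 120.0], [62.0, 90.0, 118.0]]
features = build_features(formants, bandwidths)
```

Each row of features could then serve as one observation for HMM training, under the assumptions stated above.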


doi: 10.21437/Eurospeech.1999-251

Cite as: Acero, A. (1999) Formant analysis and synthesis using hidden Markov models. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 1047-1050, doi: 10.21437/Eurospeech.1999-251

@inproceedings{acero99_eurospeech,
  author={Alex Acero},
  title={{Formant analysis and synthesis using hidden Markov models}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={1047--1050},
  doi={10.21437/Eurospeech.1999-251}
}