ISCA Archive Interspeech 2005

Speech inversion and re-synthesis

Viktor N. Sorokin, A. S. Leonov, I. S. Makarov, A. I. Tsyplikhin

Inverse problems with respect to the parameters of the articulatory model are solved for all types of sounds: vowels, semi-vowels, nasals, stops and fricatives in various contexts. Acoustic parameters of the speech signal and trajectories of some reference points inside the vocal tract serve as input data. When both acoustic and articulatory data are used, the average approximation errors for all kinds of phonemes are 3.7%, 3.8% and 2.6% for the first three formants, 8.5% for the specific frequencies of fricative spectra, and 2.8% for the coordinates of the reference points. When only acoustic data are used, the errors are 1.8%, 1.6% and 1.1% for the first three formant frequencies and 6% for the coordinates of the reference points. Original and re-synthesized utterances are found to be very similar according to subjective assessment.

doi: 10.21437/Interspeech.2005-847

Cite as: Sorokin, V.N., Leonov, A.S., Makarov, I.S., Tsyplikhin, A.I. (2005) Speech inversion and re-synthesis. Proc. Interspeech 2005, 3209-3212, doi: 10.21437/Interspeech.2005-847

  author={Viktor N. Sorokin and A. S. Leonov and I. S. Makarov and A. I. Tsyplikhin},
  title={{Speech inversion and re-synthesis}},
  booktitle={Proc. Interspeech 2005},