ISCA Archive Interspeech 2005

Teaching a vocal tract simulation to imitate stop consonants

Mark Huckvale, Ian Howard

The imitation of spoken stop consonants by an articulatory synthesizer using only general learning principles addresses significant issues in speech inversion and speech acquisition. Stop consonants are relatively large, complex acoustic events resulting from discrete articulations, so inversion based on small time windows, or on the minimisation of average articulatory error across multiple places of articulation, will not provide a satisfactory solution. This paper explores the effect of variation in inversion window size and the use of smoothing constraints on the quality of imitation of the stops [b], [d] and [g]. However, good results are obtained only when inversion is supplemented by phonetic labelling performed over a large time window. This source of additional phonetic information allows inversion to exploit different discrete gestures for the different places of articulation. The results demonstrate the importance of a phonological layer of perceptual analysis prior to imitation and speech acquisition.

doi: 10.21437/Interspeech.2005-848

Cite as: Huckvale, M., Howard, I. (2005) Teaching a vocal tract simulation to imitate stop consonants. Proc. Interspeech 2005, 3213-3216, doi: 10.21437/Interspeech.2005-848

  author={Mark Huckvale and Ian Howard},
  title={{Teaching a vocal tract simulation to imitate stop consonants}},
  booktitle={Proc. Interspeech 2005},