Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Learning Effects for Phonetic Properties of Synthetic Speech

Martine van Zundert, Jacques Terken

IPO, Center for User-System Interaction, Eindhoven, The Netherlands

We address the question of what listeners learn while listening to synthetic speech produced by diphone-based synthesis. In standard diphone-based speech synthesis, the diphone database contains a single token for each phoneme transition. Learning may occur at different levels: listeners may learn the mapping between the acoustic properties of particular diphones and their phonemic labelling, they may learn phoneme models, or they may learn realisations of phonological features. We test the predictions of these hypotheses in an experiment that measures how much training improves intelligibility for a specially constructed set of stimuli. The results force us to reject the hypothesis that listeners learn realisations of phonological features. They do not exclude the possibility that listeners learn phoneme models, although the results supporting this hypothesis do not reach statistical significance.

