On the automatic comparison and cloning of native and non-native speech prosody

Daniel Hirst


It is notoriously difficult to evaluate prosody objectively, since there is little consensus as to what constitutes correct prosody for a given utterance. This presentation describes an automatic procedure that compares a non-native speaker’s production with 10 instances of the same utterance, read by native speakers and taken from the OMProDat database. The pitch and relative syllable durations of the native and non-native versions are normalised and compared, and the native speaker’s version that correlates most closely with that of the non-native speaker is chosen as a model. The normalised pitch and syllable durations of that native speaker’s recording can then be cloned and transferred to the L2 utterance. The original and re-synthesised versions of the learner’s utterance can then be used to provide both visual and auditory feedback to the language learner.
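The selection and cloning steps described above can be illustrated with a minimal Python sketch. It is not the author's implementation. The abstract does not say how the values are normalised or how the pitch and duration comparisons are combined, so the sketch assumes z-score normalisation, Pearson correlation, and a simple average of the two correlations. The function names and the per-syllable dictionary layout are hypothetical, and the actual re-synthesis of the audio is not covered.

```python
from math import sqrt
from statistics import mean, pstdev


def zscore(values):
    """Normalise a sequence to zero mean and unit variance (assumed scheme)."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("cannot normalise a constant sequence")
    return [(v - m) / s for v in values]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def best_native_model(learner, natives):
    """Return (index, score) of the native reading closest to the learner.

    Each speaker is a dict with per-syllable "pitch" and "duration" lists.
    The score is the mean of the pitch and duration correlations, which is
    an assumption of this sketch, not a detail taken from the abstract.
    """
    scores = []
    for native in natives:
        r_pitch = pearson(zscore(learner["pitch"]), zscore(native["pitch"]))
        r_dur = pearson(zscore(learner["duration"]), zscore(native["duration"]))
        scores.append((r_pitch + r_dur) / 2)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def clone_contour(native_values, learner_values):
    """Map the native speaker's normalised contour onto the learner's
    own mean and spread (one possible reading of "cloning")."""
    m, s = mean(learner_values), pstdev(learner_values)
    return [m + z * s for z in zscore(native_values)]
```

Calling `best_native_model(learner, natives)` with one learner reading and a list of native readings gives the index of the chosen model. `clone_contour` can then be applied separately to that model's pitch and duration values to obtain targets in the learner's own range. Pearson correlation is unaffected by z-scoring, so the normalisation matters mainly for the transfer step.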


DOI: 10.21437/SpeechProsody.2016-213

Cite as

Hirst, D. (2016) On the automatic comparison and cloning of native and non-native speech prosody. Proc. Speech Prosody 2016, 1038-1042.

Bibtex
@inproceedings{Hirst2016,
author={Daniel Hirst},
title={On the automatic comparison and cloning of native and non-native speech prosody},
year=2016,
booktitle={Speech Prosody 2016},
doi={10.21437/SpeechProsody.2016-213},
url={http://dx.doi.org/10.21437/SpeechProsody.2016-213},
pages={1038--1042}
}