ISCA Archive SpeechProsody 2016

On the automatic comparison and cloning of native and non-native speech prosody.

Daniel Hirst

It is notoriously difficult to evaluate prosody objectively, since there is little consensus as to what constitutes correct prosody for a given utterance. This presentation describes an automatic procedure that compares a non-native speaker’s production with 10 instances of the same utterance, taken from the OMProDat database and read by native speakers. The pitch and relative syllable durations of the native and non-native versions are normalised and compared, and the native speaker’s version that correlates most closely with that of the non-native speaker is chosen as a model. The normalised pitch and syllable durations of that native speaker’s recording can then be cloned and transferred to the L2 utterance. The original and re-synthesised versions of the learner’s utterance can then be used to provide both visual and auditory feedback to the language learner.
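The selection and cloning steps described above can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not specify how pitch and duration are normalised, how the two correlations are combined, or how values are transferred to the learner's utterance. The sketch assumes one pitch value and one duration value per syllable, z-score normalisation, Pearson correlation, an unweighted mean of the pitch and duration correlations as the selection score, and rescaling to the learner's own mean and spread for cloning. All function names (`z_normalise`, `pearson`, `closest_native`, `clone_to_learner`) are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def z_normalise(values):
    """Return z-scores of values (assumed normalisation, not from the paper)."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def closest_native(learner, natives):
    """Pick the native speaker whose prosody correlates best with the learner's.

    learner: dict with per-syllable 'pitch' and 'duration' lists.
    natives: dict mapping a speaker label to a dict of the same shape.
    The score is the unweighted mean of the pitch and duration correlations
    (an assumption made for this sketch).
    """
    best_name, best_score = None, float("-inf")
    for name, native in natives.items():
        r_pitch = pearson(z_normalise(learner["pitch"]),
                          z_normalise(native["pitch"]))
        r_dur = pearson(z_normalise(learner["duration"]),
                        z_normalise(native["duration"]))
        score = (r_pitch + r_dur) / 2
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


def clone_to_learner(native_values, learner_values):
    """Map the native speaker's normalised contour onto the learner's scale.

    The native z-scores are rescaled with the learner's own mean and
    standard deviation, so the learner's overall range is preserved.
    """
    m = mean(learner_values)
    s = pstdev(learner_values)
    return [m + z * s for z in z_normalise(native_values)]


if __name__ == "__main__":
    # Invented per-syllable values, for illustration only.
    learner = {"pitch": [100, 120, 110, 90],
               "duration": [0.20, 0.30, 0.25, 0.40]}
    natives = {
        "A": {"pitch": [200, 240, 220, 180],
              "duration": [0.10, 0.15, 0.125, 0.20]},
        "B": {"pitch": [180, 200, 220, 240],
              "duration": [0.20, 0.10, 0.30, 0.10]},
    }
    model, score = closest_native(learner, natives)
    print(model, round(score, 3))
    print(clone_to_learner(natives[model]["pitch"], learner["pitch"]))
```

Pearson correlation is unaffected by z-score normalisation, so in this sketch the normalisation matters only for the cloning step, where it allows the native contour to be re-expressed in the learner's own pitch range and speech rate.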


doi: 10.21437/SpeechProsody.2016-213

Cite as: Hirst, D. (2016) On the automatic comparison and cloning of native and non-native speech prosody. Proc. Speech Prosody 2016, 1038-1042, doi: 10.21437/SpeechProsody.2016-213

@inproceedings{hirst16_speechprosody,
  author={Daniel Hirst},
  title={{On the automatic comparison and cloning of native and non-native speech prosody.}},
  year=2016,
  booktitle={Proc. Speech Prosody 2016},
  pages={1038--1042},
  doi={10.21437/SpeechProsody.2016-213}
}