INTERSPEECH 2015
16th Annual Conference of the International Speech Communication Association

Dresden, Germany
September 6-10, 2015

Speaker Adaptation Using Only Vocalic Segments via Frequency Warping

Agustin Alonso, D. Erro, Eva Navas, Inma Hernaez

Universidad del País Vasco, Spain

Speaker adaptation techniques allow hidden Markov model (HMM) based speech synthesis systems to mimic a target voice for which only a few samples are available. However, standard adaptation approaches are not applicable when the target voice is dysarthric, i.e., when the target speaker has an impairment that prevents the correct pronunciation of some phonemes. As a first step towards giving personalized synthetic voices to these speakers, this paper explores the possibility of adapting the whole statistical voice model using frequency warping (FW) based transformations trained exclusively on vowels. Perceptual evaluations performed on healthy voices show that the proposed method achieves reasonable results even when the adaptation data exhibit medium or low recording quality.

Bibliographic reference. Alonso, Agustin / Erro, D. / Navas, Eva / Hernaez, Inma (2015): "Speaker adaptation using only vocalic segments via frequency warping", in INTERSPEECH-2015, 2764-2768.