Speaker adaptation techniques allow hidden Markov model (HMM) based speech synthesis systems to mimic a target voice for which only a few samples are available. However, the usual adaptation approaches are not applicable when the target voice is dysarthric, i.e., when the target speaker has an impairment that prevents the correct pronunciation of some phonemes. As a first step towards giving personalized synthetic voices to these speakers, this paper explores the possibility of adapting the whole statistical voice model using frequency warping (FW) based transformations trained exclusively with vowels. Perceptual evaluations performed for healthy voices show that the proposed method achieves reasonable results even when the adaptation data are of medium or low recording quality.
Bibliographic reference. Alonso, Agustin / Erro, D. / Navas, Eva / Hernaez, Inma (2015): "Speaker adaptation using only vocalic segments via frequency warping", In INTERSPEECH-2015, 2764-2768.