Sixth European Conference on Speech Communication and Technology
The problem of stable vowel sound synthesis using a nonlinear free-running recurrent radial basis function (RBF) neural network is addressed. Voiced speech production is modelled as the output of a nonlinear dynamical system rather than with the conventional linear source-filter approach; given the nonlinear nature of speech, the nonlinear model is expected to produce more natural-sounding synthetic speech. Our RBF network has its centre positions fixed on a hyperlattice, so only the weights, in which the model is linear, need to be learnt for each vowel realisation. This greatly reduces complexity without degrading performance. The proposed structure, in which regularisation theory is used to learn the weights, is demonstrated to be stable when running in recurrent mode with no external input, and it correctly produces the desired vowel sound.
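The structure described in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. Several choices here are assumptions the abstract does not state: Gaussian basis functions, a delay-vector input built from past output samples, a simple ridge penalty standing in for the regularisation scheme, a two-sinusoid toy signal in place of a recorded vowel, and all function names and parameter values.

```python
import itertools
import numpy as np

def lattice_centres(low, high, points_per_dim, dim):
    """Fixed centres on a regular grid (hyperlattice) over [low, high]^dim."""
    axis = np.linspace(low, high, points_per_dim)
    return np.array(list(itertools.product(axis, repeat=dim)))

def rbf_features(X, centres, width):
    """Gaussian basis activations (assumed basis), shape (n_inputs, n_centres)."""
    sq_dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

def delay_embed(signal, dim):
    """Pair each window of `dim` past samples with the sample that follows it."""
    X = np.array([signal[i:i + dim] for i in range(len(signal) - dim)])
    return X, signal[dim:]

def fit_weights(Phi, y, lam):
    """Regularised least squares: only the linear output weights are learnt.
    The ridge penalty lam * ||w||^2 is an assumed stand-in for the paper's
    regularisation."""
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

def free_run(seed, centres, width, weights, n_samples):
    """Recurrent mode with no external input: each output is fed back."""
    state = np.array(seed, dtype=float)
    out = np.empty(n_samples)
    for k in range(n_samples):
        phi = rbf_features(state[None, :], centres, width)
        out[k] = (phi @ weights)[0]
        state = np.append(state[1:], out[k])  # slide the delay window
    return out

# Toy periodic signal standing in for a voiced vowel (hypothetical data).
n = np.arange(400)
signal = 0.6 * np.sin(2 * np.pi * n / 50) + 0.3 * np.sin(4 * np.pi * n / 50)

dim = 3                                        # assumed embedding dimension
centres = lattice_centres(-1.0, 1.0, 7, dim)   # 7^3 = 343 fixed centres
width = 2.0 / 6                                # one lattice spacing (assumed)

X, y = delay_embed(signal, dim)
Phi = rbf_features(X, centres, width)
weights = fit_weights(Phi, y, lam=1e-3)
train_mse = np.mean((Phi @ weights - y) ** 2)

# Seed with the last training window, then let the network run on its own.
out = free_run(signal[-dim:], centres, width, weights, 200)
```

Because the centres and width are fixed, training reduces to a single linear solve per vowel, which is the source of the complexity reduction the abstract mentions. In this sketch the free-running output cannot diverge to infinity, since every Gaussian activation is at most one, but whether it settles onto the intended waveform depends on the regularisation weight and lattice density, which would need tuning on real speech.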
Bibliographic reference. Mann, Iain / McLaughlin, Steve (1999): "Stable speech synthesis using recurrent radial basis functions", In EUROSPEECH'99, 2315-2318.