Second ESCA/IEEE Workshop on Speech Synthesis
September 12-15, 1994
Abstract. I have investigated the ability of neural networks to learn the mapping from CVC triphones to vowel formant tracks, and the effect of a number of factors on that learning. The resulting formant tracks produce intelligible speech. The input representation has little effect, but the output representation is very important: a simple triple of frequency values for the start, centre and end of each formant is the most effective.
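As an illustration of the output representation described above, the following minimal sketch expands one (start, centre, end) frequency triple into a frame-by-frame formant track. The abstract does not say how the three values are turned into a track, so the piecewise-linear interpolation used here is an assumption, and the function name `formant_track`, the parameter `n_frames` and the example frequencies are hypothetical.

```python
def formant_track(start_hz, centre_hz, end_hz, n_frames):
    """Expand a (start, centre, end) triple into a list of n_frames values.

    Assumption: straight-line interpolation from start to centre over the
    first half of the vowel and from centre to end over the second half.
    The source abstract does not specify the interpolation scheme.
    """
    if n_frames < 3:
        raise ValueError("need at least 3 frames to place start, centre and end")
    mid = (n_frames - 1) / 2.0  # position of the vowel centre, in frames
    track = []
    for i in range(n_frames):
        if i <= mid:
            t = i / mid  # 0 at the start, 1 at the centre
            track.append(start_hz + t * (centre_hz - start_hz))
        else:
            t = (i - mid) / mid  # 0 at the centre, 1 at the end
            track.append(centre_hz + t * (end_hz - centre_hz))
    return track


# Hypothetical first-formant triple, in Hz, expanded over five frames.
f1_track = formant_track(300, 700, 500, 5)
```

Under this assumption a network only needs three output units per formant, and the track length can be chosen independently of the network when the vowel is synthesised.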
Bibliographic reference. Conway, Stephen M. (1994): "Using feed-forward neural networks to produce vowel formant tracks in CVC triphones", In SSW2-1994, 25-28.