European Conference on Speech Technology
Edinburgh, Scotland, UK
A method for learning phonetic features from speech data using a temporal flow model is described, in which sampled speech data flows through a connectionist network from input to output units. The network uses hidden units with recurrent links to capture spectral/temporal characteristics of phonetic features. A simple experiment to discriminate the consonants [b,d,g] in the context of [i,a,u] using CV words is described. A supervised learning algorithm is used which performs gradient descent using a coarse approximation of the desired output as a target function. Context-dependent internal representations (features) were formed in the process of learning the discrimination task. A second experiment demonstrating learned vowel discrimination in various consonant environments is also presented. Both discrimination tasks were performed successfully without segmentation of the input, and without a direct comparison of the training items.
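The idea summarized in the abstract (frames flowing through recurrent hidden units, trained by gradient descent toward a coarse target function) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the single self-recurrent hidden unit, the parameter layout, the ramp-shaped target, the made-up input frames, and the use of finite-difference gradients in place of whatever gradient computation the authors used.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def run_network(frames, params):
    """Flow frames through one self-recurrent hidden unit to one output unit.

    params = [w_in, w_rec, w_out, b_hidden, b_out] is a hypothetical layout.
    Returns the output activation at every time step (no segmentation).
    """
    w_in, w_rec, w_out, b_h, b_o = params
    h = 0.0  # hidden state carried across time by the recurrent link
    outputs = []
    for x in frames:
        h = sigmoid(w_in * x + w_rec * h + b_h)
        outputs.append(sigmoid(w_out * h + b_o))
    return outputs


def coarse_target(n_steps, is_positive):
    """Assumed coarse target: a ramp toward 1 for the positive class, else 0."""
    if not is_positive:
        return [0.0] * n_steps
    return [(t + 1) / n_steps for t in range(n_steps)]


def loss(params, examples):
    """Summed squared error between outputs and the coarse target."""
    total = 0.0
    for frames, is_positive in examples:
        outs = run_network(frames, params)
        target = coarse_target(len(frames), is_positive)
        total += sum((o - t) ** 2 for o, t in zip(outs, target))
    return total


def gradient_step(params, examples, lr=0.1, eps=1e-5):
    """One gradient-descent step using central finite differences."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        grads.append((loss(up, examples) - loss(down, examples)) / (2 * eps))
    return [p - lr * g for p, g in zip(params, grads)]


# Made-up one-dimensional "frames" standing in for sampled speech data.
examples = [
    ([0.1, 0.4, 0.8, 0.9], True),
    ([0.9, 0.6, 0.2, 0.1], False),
]
params = [0.5, 0.5, 0.5, 0.0, 0.0]
initial_loss = loss(params, examples)
for _ in range(200):
    params = gradient_step(params, examples)
final_loss = loss(params, examples)
```

Comparing `initial_loss` with `final_loss` shows whether the toy network has moved toward the coarse target. The target is specified only roughly over time, which is the sense in which no segmentation of the input is required in this sketch.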
Bibliographic reference. Watrous, R. L. / Shastri, L. / Waibel, Alex H. (1987): "Learned phonetic discrimination using connectionist networks", In ECST-1987, 1377-1380.