Third European Conference on Speech Communication and Technology

Berlin, Germany
September 22-25, 1993


Neural Time Warping

Bruno Apolloni, Dario Crivelli, Marco Amato

Laboratorio Laren, Dipartimento di Scienze dell'Informazione, Milano, Italy

We use a recurrent neural network to capture, subsymbolically, the structure of a word utterance in terms of both the dynamic properties of the process generating it [positive knowledge] and a companion hidden process whose final state declares the inconsistency of that utterance with some deceptive candidate generating processes [negative knowledge]. Specifically, we work with a bank of neural networks. Each one is trained on a set of template utterances of the same word to generate a trajectory of Mel-cepstral parameters that stays close to that of the training word, together with an adviser signal whose growth is inhibited when the net runs on misleading words. We use the generalization capability of each network to adapt its output trajectory to the utterance under recognition. This gives rise to a neural time warping that stretches or compresses the template signal as a function of the actual utterance, unlike conventional time warpers, which align the current utterance against the template utterance. A suitable two-phase training strategy is developed. Classifying the word with the label of the best-warping network not rejected by the adviser signal yields a success rate of about 96% on a speaker-independent vocabulary of the ten digits.
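The decision rule in the last sentence of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network interface, the distance measure, or how the adviser signal triggers rejection. The sketch assumes that each word network is a callable returning a generated Mel-cepstral trajectory (one frame per observed frame) plus the final adviser value, that warping quality is the mean Euclidean distance between generated and observed frames, and that a network is rejected when its adviser value falls below a fixed threshold. The names `WordModel`, `trajectory_distance`, `classify`, `adviser_threshold`, and `make_toy_model` are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Frame = Sequence[float]
# Hypothetical interface: a word network maps the observed frames to
# (generated trajectory, final value of the adviser signal).
WordModel = Callable[[Sequence[Frame]], Tuple[List[Frame], float]]


def trajectory_distance(generated: Sequence[Frame], observed: Sequence[Frame]) -> float:
    """Mean Euclidean distance between frames (an assumed measure)."""
    if not observed or len(generated) != len(observed):
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(g, o) for g, o in zip(generated, observed)) / len(observed)


def classify(
    observed: Sequence[Frame],
    models: Dict[str, WordModel],
    adviser_threshold: float = 0.5,  # assumed rejection criterion
) -> Optional[str]:
    """Return the label of the closest network that the adviser does not reject."""
    best_label: Optional[str] = None
    best_distance = math.inf
    for label, model in models.items():
        generated, adviser = model(observed)
        if adviser < adviser_threshold:
            continue  # adviser signal failed to grow: treat the word as misleading
        distance = trajectory_distance(generated, observed)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


def make_toy_model(offset: float, adviser: float) -> WordModel:
    """Stand-in for a trained network: shifts every coefficient by `offset`."""
    def model(observed: Sequence[Frame]) -> Tuple[List[Frame], float]:
        return [[x + offset for x in frame] for frame in observed], adviser
    return model


observed = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
models = {
    "one": make_toy_model(0.10, 0.9),
    "two": make_toy_model(0.05, 0.2),   # closest trajectory, but low adviser value
    "three": make_toy_model(0.40, 0.8),
}
print(classify(observed, models))
```

In this toy run the network labelled "two" produces the closest trajectory but is discarded because its adviser value is below the threshold, so the label "one" is returned. When every network is rejected, the sketch returns `None`; how the original system handles that case is not stated in the abstract.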

Bibliographic reference. Apolloni, Bruno / Crivelli, Dario / Amato, Marco (1993): "Neural time warping", in EUROSPEECH'93, 139-142.