Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Time-Delay Neural Networks Embedding Time Alignment: A Performance Analysis

Patrick Haffner (1), Alex H. Waibel (2)

(1) Centre National d'Etudes des Telecommunications, Lannion, France
(2) School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA

Multi-State Time Delay Neural Networks (MS-TDNNs), which use a new connectionist architecture with embedded time alignment, have been successfully applied to speaker-dependent continuous spoken letter recognition [1]. This shows the value of extending the classification capabilities of connectionist networks up to the word level when recognizing confusable vocabularies. This paper describes the application of MS-TDNNs to a very different task: speaker-independent, telephone-quality isolated digit recognition. The resulting 1.6% error rate demonstrates the value of embedded time alignment, since multi-feature TDNNs, which do not implement time alignment, have a 6.5% error rate on the same task. Comparisons with HMMs are also provided.
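
The abstract does not spell out the alignment procedure, so the following is only a minimal sketch of what time alignment over frame-level state scores can look like. It assumes a strictly left-to-right state sequence per word, a path that starts in the first state and ends in the last, and additive scores where higher is better. The function names `align_score` and `recognize` and the toy scores are hypothetical and are not taken from the paper.

```python
NEG_INF = float("-inf")


def align_score(frame_scores):
    """Best cumulative score of a left-to-right alignment (assumed scheme).

    frame_scores[t][s] is the score of state s at frame t; higher is better.
    The path must start in state 0, end in the last state, and at each frame
    either stay in the same state or advance by exactly one state.
    """
    n_frames = len(frame_scores)
    n_states = len(frame_scores[0])
    best = [[NEG_INF] * n_states for _ in range(n_frames)]
    best[0][0] = frame_scores[0][0]
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = best[t - 1][s]
            advance = best[t - 1][s - 1] if s > 0 else NEG_INF
            prev = max(stay, advance)
            if prev > NEG_INF:
                best[t][s] = prev + frame_scores[t][s]
    return best[-1][-1]


def recognize(scores_by_word):
    """Return the word whose aligned score is highest (hypothetical helper)."""
    return max(scores_by_word, key=lambda w: align_score(scores_by_word[w]))


# Toy example: two word models with two states each, over three frames.
toy = {
    "one": [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    "two": [[0.2, 0.0], [0.1, 0.3], [0.0, 0.4]],
}
print(recognize(toy))
```

In this sketch the word-level score is obtained only after alignment, which is the general idea behind extending classification from the frame level up to the word level; a network without alignment would have to score fixed time positions instead.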


Bibliographic reference.  Haffner, Patrick / Waibel, Alex H. (1991): "Time-delay neural networks embedding time alignment: a performance analysis", In EUROSPEECH-1991, 1415-1418.