Second International Conference on Spoken Language Processing (ICSLP'92)

Banff, Alberta, Canada
October 13-16, 1992

Diphone-Based Speech Recognition Using Time-Event Neural Networks

Toomas Altosaar, Matti Karjalainen

Helsinki University of Technology, Espoo, Finland

In this paper we present a new speech recognition strategy based on diphones as the primary recognition unit in a time-event neural network (TENN) framework. TENN uses a two-phase approach to identifying a speech unit: event detection followed by classification. We investigate two implementation configurations, an integrated and a cascaded system, and report on their performance. Preliminary results show that, for some of the most frequent diphone classes in Finnish, recognition rates of 93-97% at the diphone level are possible.

Bibliographic reference. Altosaar, Toomas / Karjalainen, Matti (1992): "Diphone-based speech recognition using time-event neural networks", in ICSLP-1992, 979-982.