Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
In this paper we present a new speech recognition strategy that uses diphones as the primary recognition unit within a time-event neural network (TENN) framework. TENN identifies a speech unit in two phases: event detection followed by classification. We investigate two implementation configurations, an integrated and a cascaded system, and report on their performance. Preliminary results show that, for some of the most frequent diphone classes in Finnish, recognition rates of 93-97% at the diphone level are possible.
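The two-phase idea (detect an event, then classify the signal around it) can be illustrated with a minimal sketch. This is not the authors' TENN implementation: the function names, the peak-picking detector, the window size, and the nearest-centroid classifier are all assumptions chosen for brevity, standing in for the neural networks the paper describes. The sketch chains the two phases one after the other, which is only one plausible reading of a cascaded configuration.

```python
from typing import Dict, List, Sequence, Tuple

Frame = Sequence[float]  # one feature vector per analysis frame (hypothetical format)


def detect_events(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Phase 1 (stand-in): frame indices where the detector score peaks above a threshold."""
    events = []
    for t in range(1, len(scores) - 1):
        is_peak = scores[t] > scores[t - 1] and scores[t] >= scores[t + 1]
        if scores[t] >= threshold and is_peak:
            events.append(t)
    return events


def classify_window(window: Sequence[Frame], centroids: Dict[str, Sequence[float]]) -> str:
    """Phase 2 (stand-in): label a window by its nearest centroid (squared Euclidean distance)."""
    flat = [x for frame in window for x in frame]

    def distance(centroid: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(flat, centroid))

    return min(centroids, key=lambda label: distance(centroids[label]))


def recognize(
    frames: Sequence[Frame],
    scores: Sequence[float],
    centroids: Dict[str, Sequence[float]],
    half_width: int = 1,
    threshold: float = 0.5,
) -> List[Tuple[int, str]]:
    """Run detection, then classify a fixed window centred on each detected event."""
    results = []
    for t in detect_events(scores, threshold):
        lo, hi = t - half_width, t + half_width + 1
        if lo < 0 or hi > len(frames):
            continue  # skip events too close to the edges for a full window
        results.append((t, classify_window(frames[lo:hi], centroids)))
    return results


# Toy data: 1-dimensional features and made-up diphone labels, for illustration only.
frames = [[0.0], [0.0], [1.0], [0.0], [1.0], [0.0], [1.0], [0.0]]
scores = [0.0, 0.2, 0.9, 0.1, 0.0, 0.8, 0.3, 0.0]
centroids = {"ka": [0.0, 1.0, 0.0], "at": [1.0, 0.0, 1.0]}
print(recognize(frames, scores, centroids))
```

In this toy run the detector fires at frames 2 and 5, and each surrounding three-frame window is matched to the closer of the two made-up centroids. An integrated configuration would presumably fold both decisions into a single model rather than chaining them, but the abstract does not specify the details.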
Bibliographic reference. Altosaar, Toomas / Karjalainen, Matti (1992): "Diphone-based speech recognition using time-event neural networks", In ICSLP-1992, 979-982.