10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Evaluation of Phone Lattice Based Speech Decoding

Jacques Duchateau, Kris Demuynck, Hugo Van hamme

Katholieke Universiteit Leuven, Belgium

Previously, we proposed a flexible two-layered speech recogniser architecture called FLaVoR. In the first layer, an unconstrained, task-independent phone recogniser generates a phone lattice. Only in the second layer are the task-specific lexicon and language model applied to decode the phone lattice and produce a word-level recognition result. In this paper, we present a further evaluation of the FLaVoR architecture. The performance of a classical single-layered architecture and that of the FLaVoR architecture are compared on two recognition tasks, using the same acoustic, lexical and language models. On the large-vocabulary Wall Street Journal 5k and 20k benchmark tasks, the two-layered architecture yielded slightly, but not significantly, better word error rates. On a reading error detection task for a children's reading tutor, the FLaVoR architecture clearly outperformed the single-layered architecture.
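
To make the two-layer idea concrete, the following is a minimal toy sketch of second-layer decoding. It is not the FLaVoR implementation. It assumes the first layer has already produced a small phone lattice with log-domain acoustic scores, that lattice nodes are numbered in topological order, and that the language model is a simple unigram table. All names and values (`LATTICE`, `LEXICON`, `LM`, `match_word`, `decode`) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical first-layer output: arcs (start_node, end_node, phone, acoustic_log_score).
# Assumption: every arc goes from a lower node id to a higher one.
LATTICE = [
    (0, 1, "k", -0.1), (0, 1, "g", -1.2),
    (1, 2, "ae", -0.2),
    (2, 3, "t", -0.3), (2, 3, "d", -0.8),
    (3, 4, "s", -0.2),
    (4, 5, "ae", -0.2),
    (5, 6, "t", -0.3), (5, 6, "d", -1.0),
]

# Hypothetical task-specific knowledge, applied only in the second layer.
LEXICON = {
    "cat": ["k", "ae", "t"],
    "cats": ["k", "ae", "t", "s"],
    "sat": ["s", "ae", "t"],
    "sad": ["s", "ae", "d"],
    "at": ["ae", "t"],
}
LM = {"cat": -1.0, "cats": -2.0, "sat": -1.5, "sad": -2.0, "at": -1.2}  # unigram log-probs


def match_word(arcs_from, node, phones):
    """Yield (end_node, acoustic_score) for each lattice path from `node` that spells `phones`."""
    if not phones:
        yield node, 0.0
        return
    for end, phone, score in arcs_from[node]:
        if phone == phones[0]:
            for final, rest in match_word(arcs_from, end, phones[1:]):
                yield final, score + rest


def decode(lattice, lexicon, lm, start, final):
    """Return (score, words) of the best word sequence covering the lattice from start to final."""
    arcs_from = defaultdict(list)
    for s, e, phone, score in lattice:
        arcs_from[s].append((e, phone, score))

    best = {start: (0.0, [])}  # best partial hypothesis ending at each node
    # Visiting nodes in increasing order is valid because arcs only point forward.
    for node in sorted({s for s, _, _, _ in lattice} | {start}):
        if node not in best:
            continue
        score, words = best[node]
        for word, phones in lexicon.items():
            for end, acoustic in match_word(arcs_from, node, phones):
                candidate = score + acoustic + lm[word]
                if end not in best or candidate > best[end][0]:
                    best[end] = (candidate, words + [word])
    return best.get(final)


score, words = decode(LATTICE, LEXICON, LM, start=0, final=6)
print(words, round(score, 2))
```

In this sketch, changing the task only means swapping `LEXICON` and `LM`; the lattice itself is left untouched, which mirrors the task independence of the first layer described above.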


Bibliographic reference.  Duchateau, Jacques / Demuynck, Kris / Van hamme, Hugo (2009): "Evaluation of phone lattice based speech decoding", In INTERSPEECH-2009, 1179-1182.