ISCA Archive Interspeech 2009

Evaluation of phone lattice based speech decoding

Jacques Duchateau, Kris Demuynck, Hugo Van hamme

Previously, we proposed a flexible two-layered speech recogniser architecture called FLaVoR. In the first layer, an unconstrained, task-independent phone recogniser generates a phone lattice. Only in the second layer are the task-specific lexicon and language model applied to decode the phone lattice and produce a word-level recognition result. In this paper, we present a further evaluation of the FLaVoR architecture. We compare the performance of a classical single-layered architecture with that of the FLaVoR architecture on two recognition tasks, using the same acoustic, lexical and language models. On the large-vocabulary Wall Street Journal 5k and 20k benchmark tasks, the two-layered architecture yielded slightly, but not significantly, better word error rates. On a reading error detection task for a reading tutor for children, the FLaVoR architecture clearly outperformed the single-layered architecture.
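The two-layer split described in the abstract can be illustrated with a toy sketch. It is not the FLaVoR implementation: the lattice format (edges as `(src, dst, phone, cost)` tuples), the function `decode_lattice`, the three-word lexicon and the unigram language-model costs are all hypothetical, and node ids are assumed to be topologically ordered. The point is only that the phone lattice from the first layer is fixed, while the lexicon and language model are applied afterwards.

```python
from collections import defaultdict


def decode_lattice(edges, lexicon, lm_cost, start, end):
    """Toy second-layer decoder: cheapest word sequence spanning the lattice.

    Assumes every edge goes from a lower node id to a higher one, so that
    sorting the node ids gives a topological order (an assumption of this
    sketch, not something stated in the paper).
    """
    outgoing = defaultdict(list)
    for src, dst, phone, cost in edges:
        outgoing[src].append((dst, phone, cost))

    def match(node, phones):
        # Yield (end_node, acoustic_cost) for each path from `node`
        # whose phone labels spell out `phones` exactly.
        if not phones:
            yield node, 0.0
            return
        for dst, phone, cost in outgoing[node]:
            if phone == phones[0]:
                for final, rest in match(dst, phones[1:]):
                    yield final, cost + rest

    # best[node] = (total cost, word sequence) of the cheapest way to reach node
    best = {start: (0.0, [])}
    for node in sorted({n for e in edges for n in e[:2]}):
        if node not in best:
            continue
        base_cost, words = best[node]
        for word, phones in lexicon.items():
            for final, acoustic in match(node, phones):
                total = base_cost + acoustic + lm_cost[word]
                if final not in best or total < best[final][0]:
                    best[final] = (total, words + [word])
    return best.get(end)


# Hypothetical first-layer output: a small phone lattice with alternatives.
edges = [
    (0, 1, "k", 1.0),
    (0, 1, "g", 1.5),
    (1, 2, "ae", 0.5),
    (2, 3, "t", 1.0),
    (2, 3, "p", 1.2),
]
# Hypothetical task-specific knowledge, applied only in the second layer.
lexicon = {"cat": ["k", "ae", "t"], "cap": ["k", "ae", "p"], "gap": ["g", "ae", "p"]}
lm_task_a = {"cat": 2.0, "cap": 0.5, "gap": 1.0}
lm_task_b = {"cat": 0.1, "cap": 0.5, "gap": 1.0}

result_a = decode_lattice(edges, lexicon, lm_task_a, start=0, end=3)
result_b = decode_lattice(edges, lexicon, lm_task_b, start=0, end=3)
```

Here the same lattice is decoded twice with different language-model costs, so the second layer can be swapped for a new task without re-running the phone recogniser. A real system would use proper lattice scores, pronunciation variants and an n-gram language model rather than these toy costs.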

doi: 10.21437/Interspeech.2009-342

Cite as: Duchateau, J., Demuynck, K., Van hamme, H. (2009) Evaluation of phone lattice based speech decoding. Proc. Interspeech 2009, 1179-1182, doi: 10.21437/Interspeech.2009-342

  author={Jacques Duchateau and Kris Demuynck and Hugo {Van hamme}},
  title={{Evaluation of phone lattice based speech decoding}},
  booktitle={Proc. Interspeech 2009},