Applications of virtual-evidence based speech recognizer training

Amarnag Subramanya, Jeff A. Bilmes

We present two applications of our previously proposed virtual-evidence (VE) based speech recognizer training algorithm [1, 2]. The first is two-pass training, in which the segmentations obtained during the first pass are used as VE to train the second pass. We use the TIMIT phone recognition and SVitchboard continuous speech recognition tasks to demonstrate the benefits of VE-based training in two-pass systems. The second application uses functions that incorporate prior domain knowledge to generate VE scores. For TIMIT phone recognition, we show that generating VE scores with the proposed function yields a relative error rate reduction of about 6% over the baseline.
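As a rough illustration of the first application, the sketch below turns a first-pass segmentation into per-frame VE scores. It is not the scoring function from the paper: the function name `ve_scores`, the exponential form, and the `decay` and `floor` parameters are all assumptions made for this example. The only idea taken from the abstract is that a segmentation is treated as soft evidence rather than as a hard alignment. Here the hypothesized phone always receives a score of 1, while competing phones receive a score that is larger near segment boundaries, where the first-pass segmentation is presumably least reliable.

```python
import math


def ve_scores(segments, num_frames, phones, decay=0.5, floor=1e-3):
    """Map a segmentation to per-frame virtual-evidence scores.

    Hypothetical sketch: the exponential decay and its parameters are
    assumptions for illustration, not the function used in the paper.

    segments: list of (start, end, phone) with end exclusive.
    Returns a list of dicts, one per frame, mapping phone -> score.
    """
    # Frames not covered by any segment carry no evidence (all scores 1).
    scores = [{p: 1.0 for p in phones} for _ in range(num_frames)]
    for start, end, label in segments:
        for t in range(start, min(end, num_frames)):
            # Distance (in frames) to the nearest edge of this segment.
            d = min(t - start, end - 1 - t)
            # Competing phones are penalized more the deeper we are
            # inside the segment; near the edges they stay plausible.
            other = max(floor, math.exp(-decay * (d + 1)))
            for p in phones:
                scores[t][p] = 1.0 if p == label else other
    return scores


# Example: a two-phone utterance of six frames.
example = ve_scores([(0, 4, "a"), (4, 6, "b")], 6, ["a", "b"])
```

In this sketch the start and end of the utterance are treated like any other segment boundary, which is a simplification. A second-pass trainer would multiply these scores into the frame-level likelihoods instead of forcing each frame to its first-pass label.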

Cite as: Subramanya, A., Bilmes, J.A. (2008) Applications of virtual-evidence based speech recognizer training. Proc. Interspeech 2008, 2562-2565, doi: 10.21437/Interspeech.2008-635

