We describe how complementary search spaces, addressed by two different methods used in Spoken Term Detection (STD), can be merged for German subword STD. We propose fuzzy-search techniques on lattices to narrow the gap between subword and word retrieval. The first technique is based on edit distance and employs no a priori knowledge about confusions. The second is a weighting method that explicitly models pronunciation variation at the subword level and thus improves robustness against false positives. Recall improves by 6% absolute when retrieving on the merged search space rather than using an exact lattice search. By modeling subword pronunciation variation, we increase recall in a high-precision setting by 2% absolute compared to the edit-distance method.
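The difference between the two fuzzy-search techniques can be illustrated with a minimal sketch. It is not the authors' implementation: it matches a query against a single subword sequence rather than a lattice, and the names `weighted_edit_distance`, `uniform_cost`, `confusion_cost`, and the confusion table with its values are hypothetical, chosen only for illustration. With a uniform substitution cost the function behaves as plain edit distance (no prior knowledge about confusions); with a confusion-aware cost, substitutions between subwords assumed to be likely pronunciation variants are penalized less.

```python
from typing import Callable, Dict, Sequence, Tuple


def uniform_cost(a: str, b: str) -> float:
    """Plain edit-distance substitution cost: no knowledge about confusions."""
    return 0.0 if a == b else 1.0


# Hypothetical confusion table (illustrative values, not taken from the paper).
CONFUSIONS: Dict[Tuple[str, str], float] = {("b", "p"): 0.3, ("d", "t"): 0.3}


def confusion_cost(a: str, b: str) -> float:
    """Substitution cost that is cheaper for assumed pronunciation variants."""
    if a == b:
        return 0.0
    return CONFUSIONS.get((a, b), CONFUSIONS.get((b, a), 1.0))


def weighted_edit_distance(
    query: Sequence[str],
    hypothesis: Sequence[str],
    sub_cost: Callable[[str, str], float] = uniform_cost,
    indel_cost: float = 1.0,
) -> float:
    """Dynamic-programming edit distance between two subword sequences."""
    # prev[j] holds the cost of aligning the query prefix seen so far
    # with the first j subwords of the hypothesis.
    prev = [j * indel_cost for j in range(len(hypothesis) + 1)]
    for i, q in enumerate(query, start=1):
        curr = [i * indel_cost]
        for j, h in enumerate(hypothesis, start=1):
            curr.append(
                min(
                    prev[j] + indel_cost,         # delete q from the query
                    curr[j - 1] + indel_cost,     # insert h into the query
                    prev[j - 1] + sub_cost(q, h), # substitute or match
                )
            )
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    query = ["b", "a", "n"]
    hypothesis = ["p", "a", "n"]
    print(weighted_edit_distance(query, hypothesis))
    print(weighted_edit_distance(query, hypothesis, sub_cost=confusion_cost))
```

In a retrieval setting, a hit would be accepted when the distance falls below a threshold; a lower cost for plausible confusions lets such variants pass a tight threshold while unrelated substitutions remain expensive.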
Cite as: Mertens, T., Schneider, D., Köhler, J. (2009) Merging search spaces for subword spoken term detection. Proc. Interspeech 2009, 2127-2130, doi: 10.21437/Interspeech.2009-608
@inproceedings{mertens09b_interspeech,
  author={Timo Mertens and Daniel Schneider and Joachim Köhler},
  title={{Merging search spaces for subword spoken term detection}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={2127--2130},
  doi={10.21437/Interspeech.2009-608}
}