In this paper we present a rescoring approach for keyword search (KWS) based on neural networks (NN). The approach exploits only the lattice context in a detected time interval rather than the corresponding audio. The most informative arcs in the lattice context are selected and represented as a matrix, in which the words on the arcs are embedded according to their pronunciations. Convolutional neural networks (CNNs) are then employed to capture distinctive features from this matrix. The rescoring model is trained to minimize a term-weighted sigmoid cross entropy so as to match the evaluation metric. Experiments on single-word queries show that lattice context brings gains complementary to normalized posterior scores. Performance on both in-vocabulary (IV) and out-of-vocabulary (OOV) queries is improved by combining NN-based scores with standard posterior scores.
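The term-weighted training objective can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the evaluation metric is the term-weighted value (TWV), in which a miss on a keyword costs 1/N_true(kw) and a false alarm costs beta/(T - N_true(kw)), where T is the audio duration in seconds and beta = 999.9, and it simply reuses those costs as per-example weights in a sigmoid cross entropy. The function name, its arguments, and the normalization by the number of distinct keywords are hypothetical choices made for illustration.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def term_weighted_sigmoid_xent(logits, labels, keywords, n_true,
                               total_seconds, beta=999.9):
    """Sigmoid cross entropy with TWV-style per-keyword weights (a sketch).

    logits        : raw rescoring outputs, one per detection
    labels        : 1 for a correct detection, 0 for a false alarm
    keywords      : keyword identifier of each detection
    n_true        : dict mapping keyword -> number of true occurrences
    total_seconds : duration T of the searched audio, in seconds
    beta          : false-alarm cost factor (assumed TWV default)
    """
    loss = 0.0
    for z, y, kw in zip(logits, labels, keywords):
        n = n_true[kw]
        # Assumed weights: rejecting a true hit costs 1/N_true(kw),
        # accepting a false alarm costs beta / (T - N_true(kw)).
        w_pos = 1.0 / n if n > 0 else 0.0
        w_neg = beta / (total_seconds - n)
        log_p = -softplus(-z)      # log sigmoid(z)
        log_not_p = -softplus(z)   # log (1 - sigmoid(z))
        loss -= y * w_pos * log_p + (1 - y) * w_neg * log_not_p
    # Average over distinct keywords, mirroring the per-keyword average in TWV.
    return loss / len(set(keywords))
```

Under these assumed weights, a detection of a rare keyword contributes more to the loss than a detection of a frequent one, which is the same bias that TWV places on the final score.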
Cite as: Chen, Z., Wu, J. (2017) A Rescoring Approach for Keyword Search Using Lattice Context Information. Proc. Interspeech 2017, 3592-3596, doi: 10.21437/Interspeech.2017-1328
@inproceedings{chen17n_interspeech,
  author={Zhipeng Chen and Ji Wu},
  title={{A Rescoring Approach for Keyword Search Using Lattice Context Information}},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={3592--3596},
  doi={10.21437/Interspeech.2017-1328}
}