7th International Conference on Spoken Language Processing
September 16-20, 2002
We present a novel approach to the out-of-vocabulary (OOV) query problem for audio indexing. Our technique first builds a word index for the audio using speech recognition. It then expands query words into in-vocabulary phrases according to intrinsic acoustic confusability and language model scores. The aim is to mimic the mistakes the speech recognizer makes when transcribing the OOV words. We present results of retrieval experiments on a 75-hour broadcast news repository. The results are promising: our technique outperforms plain word queries and is only slightly worse than a more sophisticated scheme that expands queries into overlapping sequences of phonemes. Our technique can also be combined with the phoneme indexing system to further improve performance. Finally, our approach is simple, requires only that a word index be built for the audio, and has little computational overhead.
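The expansion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the confusion model, the language model, or the search procedure, so the sketch substitutes a weighted phoneme edit distance for acoustic confusability and a unigram model for the language model score. The lexicon, probabilities, confusion costs, and all function names are hypothetical.

```python
import math
from functools import lru_cache

# Hypothetical toy lexicon: in-vocabulary words and their phoneme strings.
LEXICON = {
    "tell": ("T", "EH", "L"),
    "tall": ("T", "AO", "L"),
    "a": ("AH",),
    "ban": ("B", "AE", "N"),
    "bun": ("B", "AH", "N"),
}
# Hypothetical unigram probabilities standing in for language model scores.
UNIGRAM = {"tell": 0.2, "tall": 0.1, "a": 0.5, "ban": 0.1, "bun": 0.1}
# Hypothetical substitution costs for easily confused phoneme pairs.
CONFUSABLE = {
    frozenset(("AE", "EH")): 0.2,
    frozenset(("AE", "AH")): 0.4,
    frozenset(("AE", "AO")): 0.5,
}


def sub_cost(p, q):
    """Cost of the recognizer hearing phoneme q in place of p."""
    if p == q:
        return 0.0
    return CONFUSABLE.get(frozenset((p, q)), 1.0)


def confusion_cost(a, b):
    """Weighted edit distance between two phoneme sequences."""
    prev = [float(j) for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [float(i)]
        for j in range(1, len(b) + 1):
            cur.append(min(
                prev[j] + 1.0,                               # deletion
                cur[j - 1] + 1.0,                            # insertion
                prev[j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitution
            ))
        prev = cur
    return prev[-1]


def expand_query(phones, top_n=3, lm_weight=0.1, max_seg_cost=0.6):
    """Return the top_n in-vocabulary phrases that cover the OOV phonemes.

    Each phrase is scored by summed confusion cost plus a weighted
    negative log unigram probability; lower totals rank first.
    """
    phones = tuple(phones)

    @lru_cache(maxsize=None)
    def best(start):
        # Best (cost, words) expansions of phones[start:].
        if start == len(phones):
            return [(0.0, ())]
        results = []
        for end in range(start + 1, len(phones) + 1):
            segment = phones[start:end]
            for word, pron in LEXICON.items():
                acoustic = confusion_cost(segment, pron)
                if acoustic > max_seg_cost:
                    continue  # too dissimilar to be a plausible misrecognition
                cost = acoustic - lm_weight * math.log(UNIGRAM[word])
                for tail_cost, tail in best(end):
                    results.append((cost + tail_cost, (word,) + tail))
        results.sort()
        return results[:top_n]

    return [(" ".join(words), round(cost, 3)) for cost, words in best(0)]


# Phonemes of a made-up OOV query word.
expansions = expand_query(["T", "AE", "L", "AH", "B", "AE", "N"])
```

With this toy data the top-ranked expansion is the phrase "tell a ban". In a retrieval setting, the resulting phrases would be issued against the word index in place of the OOV query word.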
Bibliographic reference. Logan, Beth / Van Thong, J. M. (2002): "Confusion-based query expansion for OOV words in spoken document retrieval", In ICSLP-2002, 1997-2000.