This study presents a novel approach to spoken document retrieval based on semantic context inference for speech indexing. Each recognized term in a spoken document is mapped, through a semantic relation matrix, onto a semantic inference vector containing a bag of semantic terms. The semantic context inference vector for the document is then constructed by summing all of its semantic inference vectors. This semantic term expansion and re-weighting makes the semantic context inference vector a suitable representation for speech indexing. The experiments were conducted on 1550 anchor news stories collected from 198 hours of Mandarin Chinese broadcast news. The experimental results indicate that the proposed speech indexing using semantic context inference contributes to a substantial performance improvement in spoken document retrieval.
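The mapping-and-summing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the semantic relation matrix is estimated, so the `RELATION` weights, the toy vocabulary, and the function name `semantic_context_vector` are all hypothetical, and the matrix is stored as a sparse dictionary of rows for readability.

```python
from collections import Counter

# Hypothetical semantic relation matrix, stored sparsely:
# recognized term -> {semantic term: relation weight}.
# The weights are invented for illustration only.
RELATION = {
    "stock": {"stock": 1.0, "market": 0.5, "finance": 0.25},
    "market": {"market": 1.0, "stock": 0.5, "trade": 0.25},
}


def semantic_context_vector(recognized_terms, relation):
    """Sum the semantic inference vectors of all recognized terms.

    Each recognized term selects one row of the relation matrix (its
    semantic inference vector); the rows are added element-wise.
    Terms without a row in the matrix are skipped (an assumption).
    """
    context = Counter()
    for term in recognized_terms:
        for semantic_term, weight in relation.get(term, {}).items():
            context[semantic_term] += weight
    return dict(context)


# A toy "spoken document" given as its recognized term sequence.
doc_vector = semantic_context_vector(["stock", "market", "stock"], RELATION)
print(doc_vector)
```

In this toy example the resulting vector contains "finance" and "trade" even though neither term was recognized in the document (the expansion), and "stock" and "market" receive weights that differ from their raw counts (the re-weighting).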
Bibliographic reference. Huang, Chien-Lin / Ma, Bin / Li, Haizhou / Wu, Chung-Hsien (2011): "Speech indexing using semantic context inference", In INTERSPEECH-2011, 717-720.