This paper presents a document expansion method that uses a side collection to improve the retrieval of spoken documents with text queries. The method is applied to Chinese spoken document retrieval (SDR), with experiments on both monolingual and cross-language SDR systems. In the monolingual experiments, Cantonese broadcast news documents are retrieved using a multi-scale syllable-based approach, and document expansion improves average inverse rank (AIR) by 56%. In the cross-language spoken document retrieval (CL-SDR) task, where Mandarin broadcast news is retrieved using English text queries, document expansion yields a 14% relative improvement in retrieval performance.
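To make the two central ideas concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a simple term-overlap score for matching a noisy transcript against the side collection, and it assumes AIR is the mean of the reciprocal rank of the known relevant document over all queries. The function names `expand_document` and `average_inverse_rank`, and the parameters `top_docs` and `top_terms`, are hypothetical; in a syllable-based system the "terms" would be syllable units rather than words.

```python
from collections import Counter


def expand_document(doc_terms, side_collection, top_docs=2, top_terms=3):
    """Append frequent terms from the most similar side-collection documents.

    Illustrative only: similarity is plain term overlap, an assumption made
    for this sketch rather than the scoring used in the paper.
    """
    doc_counts = Counter(doc_terms)

    # Score a side document by how many term occurrences it shares
    # with the (possibly error-prone) transcript.
    def overlap(side_terms):
        side_counts = Counter(side_terms)
        return sum(min(c, side_counts[t]) for t, c in doc_counts.items())

    # Keep the best-matching side documents.
    ranked = sorted(side_collection, key=overlap, reverse=True)[:top_docs]

    # Pool their terms, skipping terms the transcript already contains,
    # and add the most frequent ones to the document.
    pooled = Counter(t for side in ranked for t in side if t not in doc_counts)
    added = [term for term, _ in pooled.most_common(top_terms)]
    return list(doc_terms) + added


def average_inverse_rank(ranks):
    """Mean of 1/rank, where each rank is the position (1-based) at which
    the known relevant document was returned for one query."""
    return sum(1.0 / r for r in ranks) / len(ranks)


if __name__ == "__main__":
    side = [
        ["stock", "market", "shares", "index"],
        ["rain", "typhoon", "wind"],
        ["stock", "index", "trading"],
    ]
    transcript = ["stock", "markit", "index"]  # "markit" mimics a recognition error
    print(expand_document(transcript, side))
    print(average_inverse_rank([1, 2, 4]))
```

The expansion step lets a correctly spelled term from the side collection compensate for a misrecognized term in the transcript, which is the intuition behind using clean text to enrich spoken documents.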
Cite as: Li, Y.-C., Meng, H.M. (2003) Document expansion using a side collection for monolingual and cross-language spoken document retrieval. Proc. ISCA Workshop on Multilingual Spoken Document Retrieval (MSDR 2003), 85-90
@inproceedings{li03_msdr,
  author={Yuk-Chi Li and Helen M. Meng},
  title={{Document expansion using a side collection for monolingual and cross-language spoken document retrieval}},
  year=2003,
  booktitle={Proc. ISCA Workshop on Multilingual Spoken Document Retrieval (MSDR 2003)},
  pages={85--90}
}