Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Retrieval of Mandarin Broadcast News Using Spoken Queries

Berlin Chen (1,2), Hsin-min Wang (1), Lin-shan Lee (1,2)

(1) Institute of Information Science, Academia Sinica, (2) Dept. of Computer Science & Information Engineering, Taiwan University, Taipei, Taiwan, ROC

Given the monosyllabic structure of the Chinese language, a whole class of indexing features based on syllable-level statistical characteristics was previously investigated for retrieval of Mandarin broadcast news. This paper presents the improvements achieved over those earlier results. The major differences are: (1) Multi-scale character- and word-level indexing terms have been integrated with the syllable-level information. (2) Information cues from a contemporary newswire text corpus have been used to create more accurate syllable indexing terms. (3) Automatic document expansion, blind relevance feedback, and query expansion via a term association matrix have been applied in retrieval. With all these schemes combined, the average precision improves from 55.46% to 71.29%, an absolute gain of 15.83 percentage points.
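
As a rough illustration of two of the ideas named in the abstract, the following minimal Python sketch builds overlapping syllable n-gram indexing terms and applies a simple form of blind relevance feedback. It is not the authors' implementation: the function names, the toneless pinyin strings standing in for recognized syllables, the Rocchio-style update, and the `top_k` and `weight` parameters are all assumptions made for this example.

```python
from collections import Counter


def syllable_ngrams(syllables, max_n=2):
    """Return overlapping syllable n-grams (n = 1..max_n) as tuples.

    `syllables` is a list of syllable labels, e.g. toneless pinyin
    strings (a hypothetical representation chosen for this sketch).
    """
    terms = []
    for n in range(1, max_n + 1):
        for i in range(len(syllables) - n + 1):
            terms.append(tuple(syllables[i:i + n]))
    return terms


def blind_feedback(query_vec, ranked_doc_vecs, top_k=2, weight=0.5):
    """Rocchio-style blind relevance feedback (an assumed stand-in).

    The top_k documents of an initial ranking are treated as relevant
    without any user judgment, and their mean term vector, scaled by
    `weight`, is added to the query vector.
    """
    expanded = Counter(query_vec)
    top = ranked_doc_vecs[:top_k]
    for doc_vec in top:
        for term, count in doc_vec.items():
            expanded[term] += weight * count / len(top)
    return expanded


# Usage: index a three-syllable query, then expand it with two documents
# assumed to have been ranked highest by a first retrieval pass.
query_terms = Counter(syllable_ngrams(["tai", "bei", "shi"], max_n=2))
top_docs = [
    Counter(syllable_ngrams(["tai", "bei", "xin", "wen"], max_n=2)),
    Counter(syllable_ngrams(["xin", "wen", "bao", "dao"], max_n=2)),
]
expanded_query = blind_feedback(query_terms, top_docs, top_k=2, weight=0.5)
```

Using overlapping n-grams means that a recognition error on one syllable corrupts only the terms that span it, which is one common motivation for subword indexing of spoken documents; the feedback step then adds terms such as ("xin", "wen") that never appeared in the original query.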

Bibliographic reference. Chen, Berlin / Wang, Hsin-min / Lee, Lin-shan (2000): "Retrieval of Mandarin broadcast news using spoken queries", in ICSLP-2000, vol. 1, 520-523.