Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Multi-Scale Audio Indexing for Chinese Spoken Document Retrieval

Helen M. Meng (1), W. K. Lo (2), Yuk Chi Li (2), P. C. Ching (1)

(1) Human-Computer Communications Laboratory, Dept. of Systems Engineering; (2) Digital Signal Processing Laboratory, Department of Electronic Engineering, The Chinese University of Hong Kong

The advent of the information age has brought massive digital libraries of multimedia content. This development creates strong demand for information indexing and retrieval technologies, and the ability to browse audio archives is especially desirable. This paper reports on our initial attempt at using syllable units for Chinese spoken document retrieval. Our experiments are based on 1861 news stories from local television broadcasts in Cantonese, a monosyllabic Chinese dialect with a rich tonal structure. Results show that indexing with overlapping bisyllables (tonal syllables) mapped from text delivers the reference retrieval performance, with an average inverse rank (AIR) of 0.830. Retrieval based on overlapping bisyllables (base syllables) recognized from audio achieved an AIR of 0.460.
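
To make the two technical terms in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that "overlapping bisyllables" means every adjacent pair of syllables taken with a step of one syllable, and that AIR is the mean over queries of the reciprocal of the rank at which the relevant document is retrieved; the abstract defines neither term precisely. The function names and the Jyutping-style example syllables are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def overlapping_bisyllables(syllables: Sequence[str]) -> List[Tuple[str, str]]:
    """Return every adjacent syllable pair, advancing one syllable at a time.

    Assumption: this is what the abstract means by "overlapping bisyllables".
    """
    return [(syllables[i], syllables[i + 1]) for i in range(len(syllables) - 1)]


def average_inverse_rank(ranks: Sequence[Optional[int]]) -> float:
    """Mean of 1/rank over all queries.

    Assumption: each query has one relevant document, and `ranks` holds the
    1-based rank at which it was retrieved. A query whose relevant document
    was not retrieved is given as None and contributes 0.
    """
    if not ranks:
        raise ValueError("at least one query is required")
    return sum(1.0 / r if r else 0.0 for r in ranks) / len(ranks)


# Hypothetical example: tonal syllables carry a trailing tone digit,
# and base syllables are obtained by stripping that digit.
tonal = ["hoeng1", "gong2", "san1", "man4"]
base = [s.rstrip("0123456789") for s in tonal]

print(overlapping_bisyllables(tonal))
print(overlapping_bisyllables(base))

# Three hypothetical queries whose relevant documents ranked 1st, 2nd and 4th.
print(average_inverse_rank([1, 2, 4]))
```

Under these assumptions, an AIR of 1.0 would mean the relevant document is always ranked first, so the reported values of 0.830 and 0.460 can be read against that upper bound.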

Bibliographic reference.  Meng, Helen M. / Lo, W. K. / Li, Yuk Chi / Ching, P. C. (2000): "Multi-scale audio indexing for Chinese spoken document retrieval", In ICSLP-2000, vol.4, 101-104.