Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

An Experimental Study of an Audio Indexing System for the Web

Beth Logan, Pedro Moreno, Jean-Manuel van Thong, Ed Whittaker

Cambridge Research Laboratory, Compaq Computer Corporation, Cambridge, MA, USA

We have developed a speech-recognition-based audio search engine for indexing spoken documents found on the World Wide Web. Our site (http://www.compaq.com/speechbot) indexes around 20 news and talk radio shows drawn from a selection of public Web sites with multimedia archives; the shows cover a wide range of topics, speaking styles, and acoustic conditions. In this paper, we describe our system and its performance, focusing on the speech recognition and retrieval aspects. We describe our training procedure in some detail and report our historical error rate since the site launch. We also investigate the impact of out-of-vocabulary (OOV) words. Finally, we report the results of retrieval experiments, which demonstrate that our system can index this material effectively.



Bibliographic reference. Logan, Beth / Moreno, Pedro / Thong, Jean-Manuel van / Whittaker, Ed (2000): "An experimental study of an audio indexing system for the web", in ICSLP-2000, vol. 2, 676-679.