7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Real-Time Rich-Content Transcription of Chinese Broadcast News

Daben Liu, Jeffrey Ma, Dongxin Xu, Amit Srivastava, Francis Kubala

BBN Technologies, USA

This paper describes the recent development of an Audio Indexing System for Chinese (Mandarin) broadcast news. Key issues in the three major components (automatic speech recognition, speaker identification, and named entity extraction) are addressed. Chinese-language-specific challenges are discussed and our solutions are described. The recognition accuracy of the final system is comparable to that of the best-known state-of-the-art systems, while processing time is below real time. The accuracy of speaker identification and named entity extraction is comparable to that of our English system. The Chinese system currently runs 24/7 on a satellite feed of CCTV-4 broadcast news data.

Bibliographic reference.  Liu, Daben / Ma, Jeffrey / Xu, Dongxin / Srivastava, Amit / Kubala, Francis (2002): "Real-time rich-content transcription of Chinese broadcast news", In ICSLP-2002, 1981-1984.