7th International Conference on Spoken Language Processing
September 16-20, 2002
This paper describes the recent development of an Audio Indexing System for Chinese (Mandarin) broadcast news. We address key issues in its three major components: automatic speech recognition, speaker identification, and named entity extraction. We discuss the challenges specific to the Chinese language and describe our solutions. The recognition accuracy of the final system is comparable to that of the best-known state-of-the-art systems, while the processing time is below real time. The accuracy of speaker identification and named entity extraction is comparable to that of our English system. The Chinese system currently runs 24×7 on a satellite feed of CCTV-4 broadcast news data.
Bibliographic reference. Liu, Daben / Ma, Jeffrey / Xu, Dongxin / Srivastava, Amit / Kubala, Francis (2002): "Real-time rich-content transcription of Chinese broadcast news", In ICSLP-2002, 1981-1984.