This paper presents a system for speech retrieval of Mandarin broadcast news. First, several data-driven and unsupervised approaches are integrated into the broadcast news transcription system to improve speech recognition accuracy and efficiency. Then, a multi-scale indexing paradigm for broadcast news retrieval is proposed, both to exploit the special structural properties of the Chinese language and to alleviate the problems caused by speech recognition errors. Finally, using a PDA as the platform and Mandarin broadcast news stories collected in Taiwan as the document collection, we establish a prototype speech-based multimedia information retrieval system. Very encouraging results are obtained.
Cite as: Chen, B., Chen, Y.-T., Chang, C.-H., Chen, H.-B. (2005) Speech retrieval of Mandarin broadcast news via mobile devices. Proc. Interspeech 2005, 109-112, doi: 10.21437/Interspeech.2005-80
@inproceedings{chen05_interspeech,
  author={Berlin Chen and Yi-Ting Chen and Chih-Hao Chang and Hung-Bin Chen},
  title={{Speech retrieval of Mandarin broadcast news via mobile devices}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={109--112},
  doi={10.21437/Interspeech.2005-80}
}