Sixth International Conference on Spoken Language Processing
In this paper we report on the first phase of collecting the speech corpus IIS_CSS for the CEST (Chinese-English speech translation) project. The corpus is intended to provide training material for speaker-independent spontaneous Chinese speech recognition and automatic dialogue management over the telephone line. We describe the collection procedures, processing methods, annotation, and contents of the corpus, which consists of two parts: human-human dialogues and human-machine dialogues. At present, 10 hours of speech and the associated annotation have been completed. Finally, we present our plan for future collection.
Bibliographic reference. Feng, JunLan / Wang, XianFang / Du, LiMin (2000): "Data collection and processing in a Chinese spontaneous speech corpus IIS_CSS", In ICSLP-2000, vol.3, 394-397.