ASR2000 - Automatic Speech Recognition: Challenges for the new Millenium
September 18-20, 2000
A new national project for raising the technological level of speech recognition and understanding has recently commenced in Japan. The project has three aims: (a) building a large-scale spontaneous speech corpus consisting of roughly 7M words and 800 hours of speech, (b) developing acoustic and linguistic models for spontaneous speech understanding and summarization that use linguistic as well as para-linguistic information in speech, and (c) building a prototype of a spontaneous speech summarization system. The corpus under compilation will contain spontaneously uttered Common Japanese speech together with morphologically annotated transcriptions. Segmental and intonation labeling will also be provided for a subset of the corpus. The primary application domain of the corpus is recognition of spontaneous speech, but it is also intended to serve as a research corpus for both natural language processing and phonetic/linguistic studies.
Bibliographic reference. Furui, Sadaoki / Maekawa, Kikuo / Isahara, Hitoshi (2000): "A Japanese national project on spontaneous speech corpus and processing technology", in ASR-2000, 244-248.