ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition
April 13-16, 2003
This paper describes two methods for detecting word segments and their morphological information in a Japanese spontaneous speech corpus, and a method for accurately tagging a large spontaneous speech corpus. We show that, by using semi-automatic analysis, we can expect a precision of over 99% for detecting and tagging short words and 97% for long words, the two types of words that make up the corpus.
Bibliographic reference. Uchimoto, Kiyotaka / Nobata, Chikashi / Yamada, Atsushi / Sekine, Satoshi / Isahara, Hitoshi (2003): "Morphological analysis of the corpus of spontaneous Japanese", in SSPR-2003, paper TAO2.