ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition

April 13-16, 2003
Tokyo Institute of Technology, Tokyo, Japan

Filler and Disfluency Identification Based on Morphological Analysis and Chunking

Masayuki Asahara, Yuji Matsumoto

Graduate School of Information Science, Nara Institute of Science and Technology, Japan

We propose a novel filler/disfluency identification method for transcriptions of spontaneous Japanese speech. Our method is based on Japanese morphological analysis and chunking. First, a statistical morphological analyzer analyzes the input sentences and produces redundant outputs. Because fillers and disfluencies introduce ambiguity into morphological analysis, the redundant outputs let us take into account several possible roles for each character in the input. Second, a support vector machine-based chunker labels some of the ambiguous points as fillers or disfluencies. Although the method cannot detect disfluencies of function words satisfactorily, it achieves high performance for fillers and for disfluencies of content words.
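
As a rough illustration of the first step only, the sketch below enumerates every dictionary match in an input string rather than a single best segmentation, and then records which roles each character could play. It is not the authors' implementation: the toy lexicon, the part-of-speech labels, the B/I/E/S position tags, and the function names are all assumptions made for this example, and the statistical analyzer and the support vector machine-based chunker are not modeled.

```python
from collections import defaultdict

# Hypothetical toy lexicon (an assumption for illustration only):
# surface form -> possible part-of-speech labels.
LEXICON = {
    "え": ["FILLER", "INTERJECTION"],
    "えー": ["FILLER"],
    "えーと": ["FILLER"],
    "と": ["PARTICLE"],
    "東": ["NOUN"],
    "京": ["NOUN"],
    "東京": ["NOUN"],
}


def redundant_analysis(text, lexicon):
    """Return every lexicon match as (start, end, pos), not only one best path."""
    matches = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            for pos in lexicon.get(text[start:end], []):
                matches.append((start, end, pos))
    return matches


def character_roles(text, lexicon):
    """For each character, collect the position/POS roles it could play.

    Position tags (assumed scheme): S = single-character word,
    B = word-initial, I = word-internal, E = word-final.
    """
    roles = defaultdict(set)
    for start, end, pos in redundant_analysis(text, lexicon):
        length = end - start
        for i in range(start, end):
            if length == 1:
                tag = "S"
            elif i == start:
                tag = "B"
            elif i == end - 1:
                tag = "E"
            else:
                tag = "I"
            roles[i].add(f"{tag}-{pos}")
    return [sorted(roles[i]) for i in range(len(text))]


if __name__ == "__main__":
    sentence = "えーと東京"
    for char, candidates in zip(sentence, character_roles(sentence, LEXICON)):
        print(char, candidates)
```

In a pipeline of the kind the abstract describes, per-character candidate sets like these could serve as features for a chunker that decides which ambiguous points are fillers or disfluencies; that classification step is left out here.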

Bibliographic reference.  Asahara, Masayuki / Matsumoto, Yuji (2003): "Filler and disfluency identification based on morphological analysis and chunking", in SSPR-2003, paper TAO3.