ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition
April 13-16, 2003
We propose a novel filler/disfluency identification method for the transcription of spontaneous Japanese speech. Our method is based on Japanese morphological analysis and chunking. First, a statistical morphological analyzer processes the input sentences and produces redundant outputs. Because fillers and disfluencies introduce ambiguity into morphological analysis, the redundant outputs let us take into account several possible roles for each character in the input. Second, a support vector machine-based chunker labels some of the ambiguous points as fillers or disfluencies. Although the method cannot satisfactorily detect disfluencies of function words, it achieves high performance for fillers and for disfluencies of content words.
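The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Candidate` type, the B/I/E/S position tags, the POS names, and the `B-FILLER`/`I-FILLER`/`O` label set are all assumptions made for illustration, and the SVM classifier itself is omitted. The sketch only shows how redundant morpheme candidates might be turned into per-character features, and how per-character chunk labels might be decoded into spans.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Candidate(NamedTuple):
    """One candidate morpheme from a redundant analysis (hypothetical type)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    pos: str    # part-of-speech name (illustrative values only)


def position_tag(offset: int, length: int) -> str:
    """Position of a character inside a candidate: S(ingle), B(egin), I(nside), E(nd)."""
    if length == 1:
        return "S"
    if offset == 0:
        return "B"
    if offset == length - 1:
        return "E"
    return "I"


def char_features(text: str, candidates: List[Candidate]) -> List[Set[str]]:
    """Collect, for every character, all 'POS-position' roles it plays in any candidate."""
    feats: List[Set[str]] = [set() for _ in text]
    for c in candidates:
        length = c.end - c.start
        for i in range(c.start, c.end):
            feats[i].add(f"{c.pos}-{position_tag(i - c.start, length)}")
    return feats


def decode_bio(labels: List[str]) -> List[Tuple[int, int, str]]:
    """Turn per-character B-X / I-X / O labels into (start, end, kind) spans."""
    spans: List[Tuple[int, int, str]] = []
    start: Optional[int] = None
    kind: Optional[str] = None
    for i, lab in enumerate(labels):
        continues = lab.startswith("I-") and start is not None and lab[2:] == kind
        if not continues:
            if start is not None:
                spans.append((start, i, kind))
                start, kind = None, None
            if lab != "O":
                # A new chunk starts on B-X, or on a stray I-X with no open chunk.
                start, kind = i, lab[2:]
    if start is not None:
        spans.append((start, len(labels), kind))
    return spans


# Hand-written toy input; a real analyzer would supply the candidates.
text = "えーと今日は"
candidates = [
    Candidate(0, 3, "filler"),    # えーと read as one filler
    Candidate(0, 1, "noun"),      # え read as a one-character noun
    Candidate(2, 3, "particle"),  # と read as a particle
    Candidate(3, 5, "noun"),      # 今日
    Candidate(5, 6, "particle"),  # は
]
features = char_features(text, candidates)

# Labels a chunker might assign (written by hand here, not predicted).
labels = ["B-FILLER", "I-FILLER", "I-FILLER", "O", "O", "O"]
spans = decode_bio(labels)
```

In this toy input the first character carries both a filler role and a noun role, which is the kind of ambiguity the abstract attributes to fillers; a trained chunker would use such feature sets to choose the labels.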
Bibliographic reference. Asahara, Masayuki / Matsumoto, Yuji (2003): "Filler and disfluency identification based on morphological analysis and chunking", in SSPR-2003, paper TAO3.