INTERSPEECH 2004 - ICSLP
Spontaneously spoken Japanese contains many grammatically ill-formed phenomena, such as fillers, hesitations, and inversions, that do not appear in written language. This paper proposes a robust dependency parsing method based on a large-scale spoken language corpus and evaluates its effectiveness and robustness on spontaneously spoken dialogue sentences. By exploiting stochastic information about the occurrence of ill-formed phenomena, the method can robustly parse spoken Japanese that includes fillers, inversions, or dependencies spanning utterance units. In an experiment, the method achieved a parsing accuracy of 87.0%, and we confirmed that the location of a bunsetsu and the distance between bunsetsus are effective sources of stochastic information.
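To make the idea of using bunsetsu location and inter-bunsetsu distance as stochastic information more concrete, the following is a minimal illustrative sketch, not the authors' model. It assumes a toy corpus of head indices, estimates the relative frequency of a dependency given two hypothetical features (a coarse distance class and whether the candidate head is the final bunsetsu), and then picks the most probable head for each bunsetsu greedily. The feature set, the function names (`bucket`, `features`, `train`, `parse`), and the toy data are all assumptions made for illustration; the sketch handles only forward dependencies and does not model fillers, inversions, or dependencies over utterance units.

```python
from collections import Counter


def bucket(distance):
    """Coarse distance class (assumed): 1, 2, or 3-or-more bunsetsus ahead."""
    return min(distance, 3)


def features(i, j, n):
    """Hypothetical features for 'bunsetsu i depends on bunsetsu j' among n bunsetsus:
    the distance class and whether j is the final bunsetsu (a location feature)."""
    return (bucket(j - i), j == n - 1)


def train(corpus):
    """Estimate P(dependency | features) by relative frequency.

    corpus: list of sentences; each sentence is a list where heads[i] is the
    index of the head of bunsetsu i, and None for the final bunsetsu.
    """
    dep, total = Counter(), Counter()
    for heads in corpus:
        n = len(heads)
        for i in range(n - 1):
            for j in range(i + 1, n):  # forward candidates only (simplification)
                f = features(i, j, n)
                total[f] += 1
                if heads[i] == j:
                    dep[f] += 1
    return {f: dep[f] / total[f] for f in total}


def parse(n, model):
    """Greedily choose the most probable head for each of n bunsetsus.
    Unseen feature combinations get probability 0.0; crossing dependencies
    are not ruled out in this sketch."""
    heads = []
    for i in range(n - 1):
        best = max(range(i + 1, n),
                   key=lambda j: model.get(features(i, j, n), 0.0))
        heads.append(best)
    heads.append(None)  # the final bunsetsu has no head
    return heads


# Toy data, invented purely for illustration.
toy_corpus = [[1, 2, None], [1, 3, 3, None], [2, 2, None]]
model = train(toy_corpus)
print(parse(3, model))
```

A real system would need much richer features and a search that enforces the structural constraints of Japanese dependency structure; the point here is only how location and distance statistics gathered from a corpus can drive head selection.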
Bibliographic reference. Ohno, Tomohiro / Matsubara, Shigeki / Kawaguchi, Nobuo / Inagaki, Yasuyoshi (2004): "Robust dependency parsing of spontaneous Japanese speech and its evaluation", In INTERSPEECH-2004, 2173-2176.