8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Robust Dependency Parsing of Spontaneous Japanese Speech and Its Evaluation

Tomohiro Ohno (1), Shigeki Matsubara (1), Nobuo Kawaguchi (1), Yasuyoshi Inagaki (2)

(1) Nagoya University, Japan
(2) Aichi Prefectural University, Japan

Spontaneously spoken Japanese contains many grammatically ill-formed phenomena, such as fillers, hesitations, and inversions, that do not appear in written language. This paper proposes a robust dependency parsing method that uses a large-scale spoken language corpus, and evaluates the effectiveness and robustness of the method on spontaneously spoken dialogue sentences. By exploiting stochastic information about the appearance of ill-formed phenomena, the method can robustly parse spoken Japanese that includes fillers, inversions, or dependencies that span utterance units. In an experiment, the parsing accuracy was 87.0%, and we confirmed that the location of a bunsetsu and the distance between bunsetsus are effective as stochastic information.
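
As a rough illustration of how distance between bunsetsus can serve as stochastic information, the following minimal sketch estimates a distance distribution from annotated head indices and then picks the most probable head for each bunsetsu. It is not the model described in the paper, whose details are not given in this abstract. The function names, the toy corpus format, the use of -1 for a bunsetsu without a head, and the smoothing floor are all assumptions made for the example. Location information and constraints such as non-crossing dependencies are left out.

```python
from collections import Counter


def train_distance_model(corpus):
    """Estimate P(distance) by relative frequency (hypothetical toy model).

    corpus: list of sentences; each sentence is a list of head indices,
    one per bunsetsu. -1 marks a bunsetsu with no head (assumed convention,
    e.g. a filler or the final bunsetsu).
    """
    counts = Counter()
    for heads in corpus:
        for i, h in enumerate(heads):
            if h >= 0:
                # A negative distance means the head precedes the dependent,
                # which is how an inversion would show up in this toy format.
                counts[h - i] += 1
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()}


def parse(n, dist_prob, floor=1e-6):
    """Greedily choose a head for each of n bunsetsus using distance only.

    The last bunsetsu is assumed to have no head. Unseen distances get a
    small floor probability (an assumption, not taken from the paper).
    """
    heads = []
    for i in range(n - 1):
        candidates = [j for j in range(n) if j != i]
        best = max(candidates, key=lambda j: dist_prob.get(j - i, floor))
        heads.append(best)
    heads.append(-1)
    return heads


# Toy annotated data: three short utterances with head indices.
toy_corpus = [[1, 2, -1], [1, 2, 3, -1], [2, 2, -1]]
model = train_distance_model(toy_corpus)
print(parse(3, model))
```

Because candidates on both sides of a bunsetsu are scored, a corpus containing inversions would give backward dependencies non-zero probability, which is the kind of robustness the abstract refers to.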

Bibliographic reference. Ohno, Tomohiro / Matsubara, Shigeki / Kawaguchi, Nobuo / Inagaki, Yasuyoshi (2004): "Robust dependency parsing of spontaneous Japanese speech and its evaluation", in INTERSPEECH-2004, 2173-2176.