8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Construction of Spoken Language Model Including Fillers Using Filler Prediction Model

Kengo Ohta, Masatoshi Tsuchiya, Seiichi Nakagawa

Toyohashi University of Technology, Japan

This paper proposes a novel method for constructing a spoken language model that includes fillers from a corpus that contains no fillers, using a filler prediction model. The filler prediction model consists of two submodels: a filler insertion model, which predicts the positions where fillers should be inserted, and a filler selection model, which predicts an appropriate filler for each such position. The method converts a corpus that covers domain-relevant topics but contains no fillers into a corpus that covers the same topics and also contains fillers. Experiments on the Corpus of Spontaneous Japanese show that language models constructed by the proposed method achieve performance close to that of a conventional trigram language model built from a real spontaneous speech corpus that includes fillers.
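The two-stage idea (first decide where a filler goes, then decide which filler) can be illustrated with a minimal count-based sketch. This is not the model used in the paper, whose features and estimation method are not specified in the abstract. It assumes, purely for illustration, that both submodels condition only on the preceding non-filler word; the function names `train` and `add_fillers` and the toy English tokens are hypothetical.

```python
import random
from collections import Counter, defaultdict

def train(sentences, fillers):
    """Estimate both submodels from a corpus that contains fillers.

    Assumption (illustrative only): the context is the previous non-filler word.
    """
    gaps = Counter()               # how often each context word precedes a gap
    inserted = Counter()           # how often that gap actually held a filler
    choices = defaultdict(Counter) # which filler was observed after each context
    for sent in sentences:
        prev, pending = "<s>", None
        for tok in sent:
            if tok in fillers:
                pending = tok      # for a run of fillers, only the last is kept
            else:
                gaps[prev] += 1
                if pending is not None:
                    inserted[prev] += 1
                    choices[prev][pending] += 1
                    pending = None
                prev = tok
        # a filler at the very end of a sentence is ignored in this sketch
    return gaps, inserted, choices

def add_fillers(sentence, model, rng):
    """Convert a filler-free sentence into one that contains sampled fillers."""
    gaps, inserted, choices = model
    out, prev = [], "<s>"
    for tok in sentence:
        # Insertion submodel: relative frequency of a filler after `prev`.
        if gaps[prev] and rng.random() < inserted[prev] / gaps[prev]:
            # Selection submodel: sample a filler observed after `prev`.
            names, weights = zip(*choices[prev].items())
            out.append(rng.choices(names, weights=weights)[0])
        out.append(tok)
        prev = tok
    return out

# Toy data: "um" always opens the sentence, so it is always inserted there.
model = train([["um", "i", "think", "so"], ["um", "i", "agree"]], {"um"})
print(add_fillers(["i", "see"], model, random.Random(0)))
```

Applying `add_fillers` to every sentence of a filler-free domain corpus would yield a filler-bearing corpus from which an n-gram language model could then be trained in the usual way.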

