This paper proposes a novel method for constructing a spoken language model that includes fillers from a corpus that contains none, using a filler prediction model. The filler prediction model consists of two submodels: a filler insertion model, which predicts the positions where fillers should be inserted, and a filler selection model, which predicts an appropriate filler for each such position. Together they convert a corpus that covers domain-relevant topics but contains no fillers into a corpus that contains fillers while still covering those topics. Experiments on the Corpus of Spontaneous Japanese show that language models constructed by the proposed method perform nearly as well as a conventional trigram language model built from the real spontaneous-speech corpus, which includes fillers.
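The two-stage conversion described in the abstract can be illustrated with a minimal sketch. The abstract does not say how either submodel is estimated, so this example assumes simple relative-frequency estimates conditioned on the preceding word as a stand-in. The filler inventory, the function names `train` and `add_fillers`, and the `threshold` parameter are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

FILLERS = {"uh", "um"}  # hypothetical filler inventory, for illustration only


def train(filler_corpus):
    """Estimate both toy submodels from sentences that contain fillers.

    Assumption: each submodel conditions only on the preceding non-filler
    word; the paper's actual models may use richer context.
    """
    total = Counter()               # occurrences of each context word
    filler_after = Counter()        # fillers observed right after the context
    select = defaultdict(Counter)   # context word -> counts of each filler
    for sentence in filler_corpus:
        prev = "<s>"                # sentence-start context
        total[prev] += 1
        for token in sentence:
            if token in FILLERS:
                filler_after[prev] += 1
                select[prev][token] += 1
            else:
                prev = token
                total[prev] += 1
    return total, filler_after, select


def add_fillers(sentence, model, threshold=0.5):
    """Insert fillers into a filler-free sentence.

    Insertion step: insert after a context word when the estimated
    insertion probability reaches `threshold`.
    Selection step: choose the filler seen most often after that context.
    """
    total, filler_after, select = model
    out = []
    for prev in ["<s>"] + list(sentence):
        if prev != "<s>":
            out.append(prev)
        if total[prev] and filler_after[prev] / total[prev] >= threshold:
            out.append(select[prev].most_common(1)[0][0])
    return out


# Toy spontaneous-speech sentences (with fillers) used to fit the submodels.
spoken = [
    ["um", "i", "think", "uh", "so"],
    ["um", "yes"],
    ["i", "think", "uh", "it", "works"],
]
model = train(spoken)

# A filler-free, domain-relevant sentence to be converted.
converted = add_fillers(["i", "think", "it", "is", "fine"], model)
```

In this sketch, a language model such as a trigram model would then be trained on the converted sentences rather than on the original filler-free corpus.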
Bibliographic reference. Ohta, Kengo / Tsuchiya, Masatoshi / Nakagawa, Seiichi (2007): "Construction of spoken language model including fillers using filler prediction model", In INTERSPEECH-2007, 1489-1492.