ISCA Archive ICSLP 2000

A method for style adaptation to spontaneous speech by using a semi-linear interpolation technique

Nobuyasu Itoh, Masafumi Nishimura, Shinsuke Mori

This paper presents a method for adapting a language model created from written-text corpora to spontaneous speech by using a semi-linear interpolation technique. The sizes and topic coverage of spoken-language corpora are usually far smaller than those of written-text corpora. We propose an approach for adapting a base language model to the styles of spontaneous speech on the basis of the following assumptions. Words that are topic-independent, that is to say, common in spontaneous speech, should be predicted mainly by a model created from spontaneous speech corpora (the style model), while the base model is more reliable for predicting topic-related words, because they are difficult to predict from a model based on a small corpus. We classified all words into disfluencies and normal words. Normal words are further divided into two categories, common words and topic words, according to mutual information. For each category, the qualified models (base or style) are selected together with the optimal weights for linear interpolation. In other words, a different linear combination of the models is used for each category of predicted word. We conducted experiments using a Japanese spoken-language corpus to create the style model. We achieved a test-set perplexity of 159.1, compared with 189.3 for the baseline (simple linear interpolation) and 230.7 for the style-specific model.
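The idea of using a different linear combination for each category of predicted word can be illustrated with a minimal sketch. It is not the authors' implementation: the toy unigram probabilities, the weight table `STYLE_WEIGHT`, the function names, and the final renormalization step are all assumptions made for illustration. Renormalization is included because weights that depend on the predicted word produce scores that need not sum to one.

```python
from typing import Dict

# Hypothetical weight placed on the style model for each word category.
# These values are illustrative only and are not taken from the paper.
STYLE_WEIGHT: Dict[str, float] = {"disfluency": 1.0, "common": 0.7, "topic": 0.2}


def semi_linear_scores(p_base: Dict[str, float],
                       p_style: Dict[str, float],
                       category: Dict[str, str]) -> Dict[str, float]:
    """Mix the two models with a weight chosen by the predicted word's category."""
    scores = {}
    for word in p_base.keys() | p_style.keys():
        lam = STYLE_WEIGHT[category[word]]
        scores[word] = (lam * p_style.get(word, 0.0)
                        + (1.0 - lam) * p_base.get(word, 0.0))
    return scores


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to sum to one (an assumption of this sketch)."""
    total = sum(scores.values())
    return {word: s / total for word, s in scores.items()}


# Toy unigram distributions standing in for the base and style models.
p_base = {"uh": 0.0, "the": 0.5, "perplexity": 0.5}
p_style = {"uh": 0.4, "the": 0.5, "perplexity": 0.1}
category = {"uh": "disfluency", "the": "common", "perplexity": "topic"}

scores = semi_linear_scores(p_base, p_style, category)
probs = normalize(scores)
```

In this toy setting the disfluency "uh" is scored by the style model alone, while the topic word "perplexity" relies mostly on the base model, which mirrors the assumptions stated in the abstract.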


Cite as: Itoh, N., Nishimura, M., Mori, S. (2000) A method for style adaptation to spontaneous speech by using a semi-linear interpolation technique. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, 374-377

@inproceedings{itoh00b_icslp,
  author={Nobuyasu Itoh and Masafumi Nishimura and Shinsuke Mori},
  title={{A method for style adaptation to spontaneous speech by using a semi-linear interpolation technique}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 4, 374-377}
}