Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Effective Topic-Tree Based Language Model Adaptation

Javier Dieguez-Tirado, Carmen García Mateo, Antonio Cardenal-Lopez

Universidade de Vigo, Spain

We work on language model adaptation schemes suited to limited-resource scenarios. To take advantage of available out-of-domain corpora, we investigated language model adaptation using topic mixtures, a technique that has not given good practical results in the past. In this paper, we make several modifications to an existing tree-based approach: the tree is obtained from the background corpus by partitional clustering, all of its nodes are exploited in the adapted model, and error-free in-domain transcriptions are used as the adaptation corpus. The modified technique yielded a 14% perplexity improvement in a bilingual broadcast news (BN) task, outperforming several non-hierarchical approaches. A strategy for applying the language model early made it possible to translate this perplexity improvement into a 4% WER reduction.
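
As a rough illustration of the topic-mixture idea summarized above, the following minimal sketch interpolates one language model per tree node and fits the interpolation weights on an adaptation corpus with EM. It is not the authors' implementation. The abstract does not specify the model order, the smoothing, or the weight-estimation procedure, so the unigram models, add-alpha smoothing, EM weight estimation, toy corpora, and all function names here are assumptions chosen for brevity.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a fixed vocabulary (assumed model type)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def em_mixture_weights(models, adaptation_tokens, iterations=20):
    """Fit linear-interpolation weights by EM, starting from uniform weights."""
    k = len(models)
    weights = [1.0 / k] * k
    for _ in range(iterations):
        expected = [0.0] * k
        for w in adaptation_tokens:
            # Posterior responsibility of each node model for this token.
            joint = [weights[i] * models[i][w] for i in range(k)]
            norm = sum(joint)
            for i in range(k):
                expected[i] += joint[i] / norm
        weights = [e / len(adaptation_tokens) for e in expected]
    return weights


def perplexity(models, weights, tokens):
    """Perplexity of the interpolated model on a token sequence."""
    log_prob = sum(
        math.log(sum(lam * m[w] for lam, m in zip(weights, models)))
        for w in tokens
    )
    return math.exp(-log_prob / len(tokens))


# Hypothetical two-level tree: a root node (whole background corpus)
# and two child clusters. Every node contributes a mixture component.
sports = "goal match team goal player match".split()
politics = "vote law party vote minister law".split()
root = sports + politics
vocab = sorted(set(root))
models = [unigram_model(c, vocab) for c in (root, sports, politics)]

# Toy in-domain adaptation text (stands in for the transcriptions).
adaptation = "goal team match player goal".split()
weights = em_mixture_weights(models, adaptation)
uniform = [1.0 / len(models)] * len(models)

adapted_ppl = perplexity(models, weights, adaptation)
uniform_ppl = perplexity(models, uniform, adaptation)
```

In this toy setting the weight of the node whose text resembles the adaptation data grows, and the adapted mixture's perplexity on that data is no higher than that of the uniform mixture. A realistic evaluation would measure perplexity on held-out in-domain text rather than on the adaptation data itself.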

Bibliographic reference. Dieguez-Tirado, Javier / García Mateo, Carmen / Cardenal-Lopez, Antonio (2005): "Effective topic-tree based language model adaptation", In INTERSPEECH-2005, 1289-1292.