Word N-gram models can be used for word-based age-group verification. In this paper, the agglomerative information bottleneck (AIB) approach is used to tackle one of the most fundamental drawbacks of word N-gram models: the large amount of irrelevant information they carry. It is demonstrated that irrelevant information can be omitted by merging words into word clusters; this provides a mechanism for transforming any sequence of words into a sequence of word-cluster labels. Consequently, word N-gram models are converted to word-cluster N-gram models, which are more compact. Age verification experiments were conducted on the Fisher corpora. The goal was to verify the age group of the speaker of an unknown speech segment. In these experiments, an N-gram model was compressed to a fifth of its original size without reducing verification performance. In addition, an improvement in verification accuracy is demonstrated when irrelevant information is discarded.
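The word-to-cluster transformation described above can be illustrated with a minimal sketch. It assumes a word-to-cluster mapping is already available; the mapping below is hand-made for illustration and is not produced by AIB, and the names `to_cluster_labels`, `ngram_counts`, and `word_to_cluster` are hypothetical. The sketch only shows why replacing words with cluster labels reduces the number of distinct N-grams a model has to store.

```python
from collections import Counter


def to_cluster_labels(words, word_to_cluster, unknown="<unk>"):
    """Replace each word with its cluster label (hypothetical helper)."""
    return [word_to_cluster.get(w, unknown) for w in words]


def ngram_counts(tokens, n):
    """Count all contiguous n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hand-made clustering, assumed for illustration only (not an AIB result).
word_to_cluster = {
    "cool": "c0", "awesome": "c0",
    "lovely": "c1", "splendid": "c1",
    "movie": "c2", "film": "c2",
}

words = "cool movie awesome film lovely film splendid movie".split()
labels = to_cluster_labels(words, word_to_cluster)

word_bigrams = ngram_counts(words, 2)
cluster_bigrams = ngram_counts(labels, 2)

# The cluster-level model has fewer distinct bigrams than the word-level one.
print(len(word_bigrams), len(cluster_bigrams))
```

In this toy example the distinct bigram count drops because words in the same cluster share N-gram statistics; how much compression is possible without hurting verification depends on the clustering, which in the paper is obtained with AIB.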
Bibliographic reference. Hecht, Ron M. / Hezroni, Omer / Manna, Amit / Dobry, Gil / Zigel, Yaniv / Tishby, Naftali (2009): "Information bottleneck based age verification", In INTERSPEECH-2009, 188-191.