Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Using Mutual Information to Design Feature Combinations

Daniel P. W. Ellis (1), Jeff A. Bilmes (2)

(1) International Computer Science Institute, Berkeley, CA, USA
(2) University of Washington, Seattle, WA, USA

Combination of different feature streams is a well-established method for improving speech recognition performance. This empirical success, however, poses theoretical problems when trying to design combination systems: is it possible to predict which feature streams will combine most advantageously, and which of the many possible combination strategies will be most successful for the particular feature streams in question? We approach these questions with the tool of conditional mutual information (CMI), estimating the amount of information that one feature stream contains about the other, given knowledge of the correct subword unit label. We argue that CMI of the raw feature streams should be useful in deciding whether to merge them together as one large stream, or to feed them separately into independent classifiers for later combination; this is only weakly supported by our results. We also argue that CMI between the outputs of independent classifiers based on each stream should help predict which streams can be combined most beneficially. Our results confirm the usefulness of this measure.
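The conditional mutual information the abstract describes, I(X; Y | C), can be estimated from frame-aligned data by averaging the ordinary mutual information of the two streams within each subword-unit class. The sketch below is a minimal plug-in (histogram) estimator, not the authors' implementation: it assumes the feature streams have already been discretized (e.g. by vector quantization) into integer codes `x` and `y`, with `c` holding the aligned subword labels.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in bits from the joint histogram
    of two discrete (integer-coded) sequences."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (x_idx, y_idx), 1.0)   # count co-occurrences
    joint /= joint.sum()                     # joint distribution p(x, y)
    px = joint.sum(axis=1, keepdims=True)    # marginal p(x)
    py = joint.sum(axis=0, keepdims=True)    # marginal p(y)
    nz = joint > 0                           # avoid log(0) terms
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))

def conditional_mutual_information(x, y, c):
    """Estimate I(X; Y | C) = sum_c p(c) * I(X; Y | C=c) for discrete,
    frame-aligned streams x, y and subword-unit labels c."""
    cmi = 0.0
    for label in np.unique(c):
        mask = (c == label)
        cmi += mask.mean() * mutual_information(x[mask], y[mask])
    return cmi
```

Note that plug-in estimates like this are biased upward for small per-class sample counts, so in practice each class should contain many more frames than the product of the two codebook sizes.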

Bibliographic reference. Ellis, Daniel P. W. / Bilmes, Jeff A. (2000): "Using mutual information to design feature combinations", in ICSLP-2000, vol. 3, pp. 79-82.