Empirical Analysis of Score Fusion Application to Combined Neural Networks for Open Vocabulary Spoken Term Detection

Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh


System combination, which combines the outputs or internal representations of multiple systems, is a powerful method for improving performance on machine learning tasks and has been widely adopted in recent knowledge transfer learning. In this study, to describe how to extract effective knowledge from an ensemble of neural networks, we first examine several score fusion methods for an ensemble of neural networks tasked with open vocabulary spoken term detection, where the class probability of the neural network is used as a similarity metric; we then investigate the trade-off between confusion and dark knowledge. In the experimental evaluation on open vocabulary spoken term detection, we obtain a 2.09% absolute gain over the best single-system result. Furthermore, the performance gains achieved via score fusion of class probabilities exactly match the mathematical inequality for sum and power means, and the gain achieved via summation of class probabilities is consistently better than that achieved via score fusion of power means. The experimental analysis confirms that summation, which enhances the discriminative capability of the superior class probability, can produce a smoothed probability distribution that yields more effective dark knowledge while adequately suppressing undesirable effects.
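The relationship between summation and power-mean fusion mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `power_mean` and `fuse`, the choice of exponents, and the posterior values are all hypothetical, chosen only to show how class probabilities from several networks might be fused class by class.

```python
import math


def power_mean(scores, p):
    """Power (generalized) mean of strictly positive scores with exponent p."""
    n = len(scores)
    if p == 0:
        # The limit p -> 0 of the power mean is the geometric mean.
        return math.exp(sum(math.log(s) for s in scores) / n)
    return (sum(s ** p for s in scores) / n) ** (1.0 / p)


def fuse(posteriors, p):
    """Fuse class probabilities from several networks, one class at a time."""
    # zip(*posteriors) groups the scores that each network gives to one class.
    return [power_mean(col, p) for col in zip(*posteriors)]


# Hypothetical posteriors from three networks over four classes
# (made-up numbers, not taken from the paper).
posteriors = [
    [0.70, 0.20, 0.05, 0.05],
    [0.50, 0.30, 0.10, 0.10],
    [0.60, 0.10, 0.20, 0.10],
]

summed = [sum(col) for col in zip(*posteriors)]  # plain summation
harmonic = fuse(posteriors, -1)                  # p = -1
geometric = fuse(posteriors, 0)                  # p -> 0
arithmetic = fuse(posteriors, 1)                 # p = 1

for name, fused in [("sum", summed), ("arithmetic", arithmetic),
                    ("geometric", geometric), ("harmonic", harmonic)]:
    print(name, [round(v, 4) for v in fused])
```

For every class, the power mean inequality guarantees harmonic ≤ geometric ≤ arithmetic, and summation equals the arithmetic mean scaled by the number of networks. Which exponents and normalizations the paper actually compares is not stated in the abstract, so the three exponents here are an assumption for illustration.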


DOI: 10.21437/Interspeech.2018-1776

Cite as: Lee, S., Tanaka, K., Itoh, Y. (2018) Empirical Analysis of Score Fusion Application to Combined Neural Networks for Open Vocabulary Spoken Term Detection. Proc. Interspeech 2018, 2062-2066, DOI: 10.21437/Interspeech.2018-1776.


@inproceedings{Lee2018,
  author={Shi-wook Lee and Kazuyo Tanaka and Yoshiaki Itoh},
  title={Empirical Analysis of Score Fusion Application to Combined Neural Networks for Open Vocabulary Spoken Term Detection},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={2062--2066},
  doi={10.21437/Interspeech.2018-1776},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1776}
}