ISCA Archive Interspeech 2008

f-divergence is a generalized invariant measure between distributions

Yu Qiao, Nobuaki Minematsu

WedSe4.P3-10, Poster

Finding measures (or features) invariant to the inevitable variations caused by non-linguistic factors (transformations) is a fundamental and important problem in speech recognition. Recently, Minematsu [1, 2] proved that the Bhattacharyya distance (BD) between two distributions is invariant to invertible transforms of the feature space, and developed an invariant structural representation of speech based on it. This raises a question: which kinds of measures can be invariant? In this paper, we prove that f-divergence yields a generalized family of invariant measures, and show that every invariant measure must be written in the form of an f-divergence. Many well-known distances and divergences in information theory and statistics, such as the Bhattacharyya distance, the KL-divergence, and the Hellinger distance, can be written as f-divergences. As an application, we carried out experiments on recognizing utterances of connected Japanese vowels. The experimental results show that BD and KL-divergence perform best among the measures compared.
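The invariance property stated in the abstract can be checked numerically. The sketch below (an illustration with hypothetical example distributions, not the paper's code) computes the Bhattacharyya distance, an f-divergence with f(t) = -sqrt(t), between two Gaussians on a grid, then applies the invertible transform y = exp(x) to both densities with the Jacobian correction and verifies that the distance is unchanged:

```python
import numpy as np

def gauss(x, mu, sigma):
    """Gaussian density N(mu, sigma^2) evaluated at x."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

def trapezoid(f, x):
    """Trapezoid-rule integral of samples f over a (possibly non-uniform) grid x."""
    return np.sum((f[1:] + f[:-1]) / 2 * np.diff(x))

def bhattacharyya(p, q, x):
    """BD(p, q) = -ln ∫ sqrt(p(x) q(x)) dx."""
    return -np.log(trapezoid(np.sqrt(p * q), x))

# Two example Gaussians (arbitrary choice for illustration)
x = np.linspace(-10.0, 10.0, 20001)
p = gauss(x, 0.0, 1.0)
q = gauss(x, 1.0, 2.0)
bd_orig = bhattacharyya(p, q, x)

# Invertible transform y = exp(x); each density picks up the Jacobian |dx/dy| = 1/y.
y = np.exp(x)
bd_trans = bhattacharyya(p / y, q / y, y)

print(round(bd_orig, 4), round(bd_trans, 4))
```

The two printed values agree, and both match the closed-form BD for a pair of Gaussians, (μ₁-μ₂)²/(4(σ₁²+σ₂²)) + ½ ln((σ₁²+σ₂²)/(2σ₁σ₂)), illustrating the invariance that the paper proves in general for all f-divergences.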

[1] N. Minematsu, "Yet another acoustic representation of speech sounds," Proc. ICASSP, pp. 585-588, 2004.
[2] N. Minematsu, "Mathematical evidence of the acoustic universal structure in speech," Proc. ICASSP, pp. 889-892, 2005.


doi: 10.21437/Interspeech.2008-393

Cite as: Qiao, Y., Minematsu, N. (2008) f-divergence is a generalized invariant measure between distributions. Proc. Interspeech 2008, 1349-1352, doi: 10.21437/Interspeech.2008-393

@inproceedings{qiao08b_interspeech,
  author={Yu Qiao and Nobuaki Minematsu},
  title={{f-divergence is a generalized invariant measure between distributions}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={1349--1352},
  doi={10.21437/Interspeech.2008-393}
}