INTERSPEECH 2008
9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

f-Divergence is a Generalized Invariant Measure Between Distributions

Yu Qiao, Nobuaki Minematsu

University of Tokyo, Japan

WedSe4.P3-10, Poster

Finding measures (or features) that are invariant to the inevitable variations caused by non-linguistic factors (transformations) is a fundamental and important problem in speech recognition. Recently, Minematsu [1, 2] proved that the Bhattacharyya distance (BD) between two distributions is invariant to invertible transforms of the feature space, and developed an invariant structural representation of speech based on it. This raises a question: which kinds of measures can be invariant? In this paper, we prove that f-divergence yields a generalized family of invariant measures, and show that all invariant measures have to be written in the form of an f-divergence. Many well-known distances and divergences in information theory and statistics, such as the Bhattacharyya distance (BD), KL-divergence, and Hellinger distance, can be written in the form of an f-divergence. As an application, we carried out experiments on recognizing utterances of connected Japanese vowels. The experimental results show that BD and KL have the best performance among the measures compared.
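The invariance property described in the abstract can be checked numerically in a simple special case. The sketch below is not taken from the paper: it assumes one-dimensional Gaussian distributions and an affine map y = a*x + b with a != 0 (only one kind of invertible transform), and it uses the standard closed-form expressions for BD and KL between Gaussians. The function names and parameter values are hypothetical, chosen for illustration only.

```python
import math


def bhattacharyya_gauss(mu1, s1, mu2, s2):
    """Closed-form Bhattacharyya distance between N(mu1, s1^2) and N(mu2, s2^2)."""
    v = s1 * s1 + s2 * s2
    return 0.25 * (mu1 - mu2) ** 2 / v + 0.5 * math.log(v / (2.0 * s1 * s2))


def kl_gauss(mu1, s1, mu2, s2):
    """Closed-form KL divergence KL(N(mu1, s1^2) || N(mu2, s2^2))."""
    return (math.log(s2 / s1)
            + (s1 * s1 + (mu1 - mu2) ** 2) / (2.0 * s2 * s2)
            - 0.5)


def affine(mu, s, a, b):
    """Mean and standard deviation of a*X + b when X ~ N(mu, s^2)."""
    if a == 0:
        raise ValueError("a must be non-zero for the map to be invertible")
    return a * mu + b, abs(a) * s


# Illustrative (made-up) distributions, given as (mean, standard deviation).
p = (0.0, 1.0)
q = (1.5, 2.0)

# An arbitrary invertible affine transform applied to both distributions.
a, b = -3.0, 7.0
p_t = affine(*p, a, b)
q_t = affine(*q, a, b)

bd_before, bd_after = bhattacharyya_gauss(*p, *q), bhattacharyya_gauss(*p_t, *q_t)
kl_before, kl_after = kl_gauss(*p, *q), kl_gauss(*p_t, *q_t)

# Both measures should agree before and after the transform, up to rounding.
print("BD:", bd_before, bd_after)
print("KL:", kl_before, kl_after)
```

The affine Gaussian case is used here only because both measures have closed forms; the result stated in the abstract covers general invertible transforms, which this sketch does not test.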

References

  1. N. Minematsu, "Yet another acoustic representation of speech sounds," Proc. ICASSP, pp. 585-588, 2004.
  2. N. Minematsu, "Mathematical Evidence of the Acoustic Universal Structure in Speech," Proc. ICASSP, pp. 889-892, 2005.

Bibliographic reference. Qiao, Yu / Minematsu, Nobuaki (2008): "f-divergence is a generalized invariant measure between distributions", In INTERSPEECH-2008, 1349-1352.