ISCA Archive Odyssey 2012

Generalized Viterbi-based models for time-series segmentation applied to speaker diarization

Itshak Lapidot, Jean-François Bonastre

Time-series clustering takes the chronological order of the input samples into account. The samples are therefore not processed independently: the result for a given sample depends on the clustering of the whole sequence. A popular clustering algorithm that handles this dependency is the well-known Hidden Markov Model (HMM) trained with Viterbi statistics.

In this work we propose a generalization of the broadly used HMM, denoted Hidden-Distortion-Models (HDMs). Our proposal is based on distortion-based models and transition counts, for which probabilistic calculations are no longer mandatory. We introduce the approach through its mathematical basis and show that the Viterbi-based HMM can be seen as a special case of the HDM. This proximity allows us to apply similar approaches for state-model training when the new paradigm is used to learn the sequence dependencies.
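The relation between a distortion-based decoder and the Viterbi-based HMM can be illustrated with a minimal sketch. It is not the formulation from the paper: the function name `viterbi_min_cost`, the cost matrices, and the toy numbers are all assumptions made for illustration. The sketch runs a Viterbi-style dynamic program over arbitrary non-negative costs. When the local cost is the negative log emission probability and the transition cost is the negative log transition probability, it reduces to ordinary HMM Viterbi decoding. When the local cost is a squared Euclidean distance to a centroid, no probabilities are involved at all.

```python
import numpy as np

def viterbi_min_cost(local_cost, trans_cost, init_cost):
    """Minimum-cost state path (hypothetical helper, not from the paper).

    local_cost: (T, K) distortion of sample t under state k
    trans_cost: (K, K) cost of moving from state i to state j
    init_cost:  (K,)   cost of starting in state k
    """
    T, K = local_cost.shape
    acc = init_cost + local_cost[0]          # best cost ending in each state
    back = np.zeros((T, K), dtype=int)       # back-pointers
    for t in range(1, T):
        cand = acc[:, None] + trans_cost     # cand[i, j]: arrive in j from i
        back[t] = np.argmin(cand, axis=0)
        acc = cand.min(axis=0) + local_cost[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmin(acc))
    for t in range(T - 1, 0, -1):            # trace the best path backwards
        path[t - 1] = back[t, path[t]]
    return path, float(acc.min())

# HMM special case: all costs are negative log-probabilities (toy values).
emis = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
trans = np.array([[0.9, 0.1], [0.1, 0.9]])
init = np.array([0.5, 0.5])
hmm_path, hmm_cost = viterbi_min_cost(-np.log(emis), -np.log(trans), -np.log(init))

# Non-probabilistic case: squared distance to assumed centroids plus a
# fixed penalty for switching state (both are illustrative choices).
x = np.array([0.1, 0.0, 1.1, 0.9])
centroids = np.array([0.0, 1.0])
dist = (x[:, None] - centroids[None, :]) ** 2
switch = np.array([[0.0, 0.5], [0.5, 0.0]])
dm_path, dm_cost = viterbi_min_cost(dist, switch, np.zeros(2))
```

In both toy cases the decoder returns one state label per sample plus the accumulated cost of the best path; only the definition of the costs changes between them.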

A speaker diarization application is presented to show the advantages of the HDM as a clustering algorithm.


Cite as: Lapidot, I., Bonastre, J.-F. (2012) Generalized Viterbi-based models for time-series segmentation applied to speaker diarization. Proc. The Speaker and Language Recognition Workshop (Odyssey 2012), 138-145

@inproceedings{lapidot12_odyssey,
  author={Itshak Lapidot and Jean-François Bonastre},
  title={{Generalized Viterbi-based models for time-series segmentation applied to speaker diarization}},
  year=2012,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2012)},
  pages={138--145}
}