ISCA Archive ICSLP 2000

Unsupervised audio stream segmentation and clustering via the Bayesian information criterion

Bowen Zhou, John H. L. Hansen

In this paper, we propose an efficient approach for unsupervised audio stream segmentation and clustering via the Bayesian Information Criterion (BIC). The proposed method extends an earlier formulation by Chen and Gopalakrishnan [1]. In our segmentation formulation, Hotelling's T²-statistic is used to pre-select candidate segmentation boundaries, and BIC is then used to make the final segmentation decision. Our experiments on the DARPA Hub4 1997 evaluation data show that we can improve the final algorithm speed by a factor of about 100 compared to that in [1], while achieving a 7% reduction in miss rate at the expense of a 6% increase in false alarm rate. In the clustering stage, Gaussian Mixture Models are used for gender labeling prior to hierarchical BIC-based clustering within each gender class. Our clustering experiments show that we can achieve a cluster purity of 99.3%.
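
The two statistics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each segment is modeled by a single full-covariance Gaussian, uses one common form of the ΔBIC change test with a penalty weight `lam`, and the function names `delta_bic` and `hotelling_t2` are hypothetical. A positive ΔBIC at frame `i` favors a boundary there; the cheaper T² value could be used to pick which frames are worth testing.

```python
import numpy as np


def delta_bic(x, i, lam=1.0):
    """Delta-BIC for splitting the frames x (shape: n x d) at index i.

    Assumed form: 0.5 * (n log|S| - n1 log|S1| - n2 log|S2|) - lam * penalty,
    where each S is a maximum-likelihood full covariance estimate.
    """
    n, d = x.shape

    def logdet(seg):
        cov = np.atleast_2d(np.cov(seg, rowvar=False, bias=True))
        return np.linalg.slogdet(cov)[1]

    gain = 0.5 * (n * logdet(x) - i * logdet(x[:i]) - (n - i) * logdet(x[i:]))
    # Extra free parameters of the two-model hypothesis: d means + d(d+1)/2 covariances.
    penalty = 0.5 * (d + d * (d + 1) / 2) * np.log(n)
    return gain - lam * penalty


def hotelling_t2(x, i):
    """Two-sample Hotelling T^2 between x[:i] and x[i:] with pooled covariance."""
    a, b = x[:i], x[i:]
    n1, n2 = len(a), len(b)
    diff = a.mean(axis=0) - b.mean(axis=0)
    pooled = ((n1 - 1) * np.cov(a, rowvar=False)
              + (n2 - 1) * np.cov(b, rowvar=False)) / (n1 + n2 - 2)
    pooled = np.atleast_2d(pooled)
    return (n1 * n2 / (n1 + n2)) * diff @ np.linalg.solve(pooled, diff)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic 2-D "features": the mean shifts halfway through the window.
    x = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(5.0, 1.0, (200, 2))])
    print("T^2 at frame 200:", hotelling_t2(x, 200))
    print("Delta-BIC at frame 200:", delta_bic(x, 200))
```

In this sketch the T² statistic needs only means and one pooled covariance, while ΔBIC needs three covariance determinants per candidate, which is the kind of cost difference that makes pre-selection attractive.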

[1] S. Chen and P. Gopalakrishnan, "Speaker, Environment and Channel Change Detection and Clustering via the Bayesian Information Criterion," Proc. Broadcast News Transcription and Understanding Workshop, pp. 127-132, Feb. 1998.


Cite as: Zhou, B., Hansen, J.H.L. (2000) Unsupervised audio stream segmentation and clustering via the Bayesian information criterion. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 3, 714-717

@inproceedings{zhou00d_icslp,
  author={Bowen Zhou and John H. L. Hansen},
  title={{Unsupervised audio stream segmentation and clustering via the Bayesian information criterion}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 3, 714-717}
}