Speaker segmentation is the task of finding speaker turns in an audio stream. We propose a metric-based algorithm that uses Discrete Wavelet Transform (DWT) features. Principal component analysis (PCA) or linear discriminant analysis (LDA) is then applied to reduce the dimensionality of the feature space and remove redundant information. In the experiments, our methods, referred to as DWT-PCA and DWT-LDA, are compared to the DISTBIC algorithm on clean and noisy data from the TIMIT database. Under strong noise in particular, i.e. at -10 dB SNR, our DWT-PCA approach is very robust: the false alarm rate (FAR) increases by ~2% and the missed detection rate (MDR) stays about the same compared to clean speech, whereas the DISTBIC method fails, with a FAR of ~0% and an MDR of ~100%. For clean speech, DWT-PCA shows a relative improvement of ~30% in both FAR and MDR over the DISTBIC algorithm. DWT-LDA performs slightly worse than DWT-PCA.
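The general idea of metric-based change detection on wavelet features can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the log-energy sub-band features, the frame length, the window size, the Euclidean distance, and the synthetic two-tone test signal are all assumptions made for illustration, and the PCA/LDA dimensionality-reduction step is omitted. All function names are hypothetical.

```python
import math
import random

def haar_log_energies(frame, levels=3):
    """Log-energy of each Haar detail band plus the final approximation band.

    Assumption: Haar wavelet and log-energy features are illustrative choices.
    """
    approx = list(frame)
    feats = []
    root2 = math.sqrt(2.0)
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / root2 for a, b in pairs]
        approx = [(a + b) / root2 for a, b in pairs]
        feats.append(math.log(sum(d * d for d in detail) / len(detail) + 1e-12))
    feats.append(math.log(sum(a * a for a in approx) / len(approx) + 1e-12))
    return feats

def frame_features(signal, frame_len=64, levels=3):
    """Split the signal into non-overlapping frames and extract DWT features."""
    return [haar_log_energies(signal[i:i + frame_len], levels)
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def mean_vector(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def distance_curve(feats, win=10):
    """Euclidean distance between the mean features of the two windows
    that meet at each candidate frame index (assumed metric)."""
    curve = {}
    for t in range(win, len(feats) - win + 1):
        left = mean_vector(feats[t - win:t])
        right = mean_vector(feats[t:t + win])
        curve[t] = math.sqrt(sum((a - b) ** 2 for a, b in zip(left, right)))
    return curve

# Synthetic stand-in for two speakers: a low tone followed by a high tone,
# each with a little Gaussian noise. The true change is at frame 20.
random.seed(0)
FRAME_LEN, FRAMES_PER_SPEAKER = 64, 20
half = FRAME_LEN * FRAMES_PER_SPEAKER
signal = [math.sin(2 * math.pi * 0.02 * n) + random.gauss(0, 0.01) for n in range(half)]
signal += [math.sin(2 * math.pi * 0.4 * n) + random.gauss(0, 0.01) for n in range(half)]

feats = frame_features(signal, FRAME_LEN)
curve = distance_curve(feats, win=10)
peak = max(curve, key=curve.get)  # frame index with the largest distance
print("candidate change point at frame", peak)
```

In this toy setting the distance curve should peak at or next to the frame where the tone changes. A real system would additionally need a thresholding or peak-picking rule and, as described above, a PCA or LDA projection of the features before the distance is computed.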
Bibliographic reference. Wiesenegger, Michael / Pernkopf, Franz (2009): "Wavelet-based speaker change detection in single channel speech data", In INTERSPEECH-2009, 836-839.