7th International Conference on Spoken Language Processing
September 16-20, 2002
Speaker change detection is a key prerequisite for speaker tracking and speaker adaptation. It detects the points at which the speaker identity changes in a multi-speaker audio stream. We first extract the speech segments from an audio stream using segmentation and classification techniques. The proposed weighted metric-based technique then detects the speaker change points within the extracted speech segments. The new weights are derived from Fisher Linear Discriminant Analysis and, when used with Mel cepstrum feature vectors, have the effect of subband processing. Experiments were performed with the HUB-4 Broadcast News Evaluation English Test Material (1999) and a movie audio track. Results showed that our technique gave about a 37.7% improvement over Euclidean distance on the broadcast news data and about 27.1% on the movie data; over Mahalanobis distance, the improvements were 37.7% and 25.3% for broadcast news and movie data, respectively.
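The general idea of a weighted metric-based detector can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not give the exact weight definition, so this sketch assumes per-dimension Fisher ratios (squared mean gap over summed variances) as weights, a weighted squared Euclidean distance between the mean feature vectors of two adjacent windows, and a simple thresholded local-maximum rule. All function names, the window length, the threshold, and the synthetic data are hypothetical.

```python
import numpy as np


def fisher_weights(class_a, class_b, eps=1e-8):
    """Assumed weighting: normalized per-dimension Fisher ratio of two labeled sets."""
    mean_gap = class_a.mean(axis=0) - class_b.mean(axis=0)
    within = class_a.var(axis=0) + class_b.var(axis=0) + eps
    ratio = mean_gap ** 2 / within
    return ratio / ratio.sum()


def weighted_distance(left, right, weights):
    """Weighted squared Euclidean distance between the two window means."""
    diff = left.mean(axis=0) - right.mean(axis=0)
    return float(np.sum(weights * diff ** 2))


def detect_changes(features, weights, win=50, threshold=1.0):
    """Slide two adjacent windows over the frames and keep thresholded local maxima."""
    n = len(features)
    scores = np.zeros(n)
    candidates = range(win, n - win + 1)
    for t in candidates:
        scores[t] = weighted_distance(features[t - win:t], features[t:t + win], weights)
    peaks = [
        t for t in candidates
        if scores[t] > threshold and scores[t] == scores[t - win:t + win].max()
    ]
    return peaks, scores


# Synthetic stand-in for cepstral features (hypothetical data, 12 dimensions):
# the two "speakers" differ only in dimension 0.
rng = np.random.default_rng(0)
dim = 12
offset = np.zeros(dim)
offset[0] = 3.0

train_a = rng.normal(0.0, 1.0, size=(500, dim))
train_b = rng.normal(0.0, 1.0, size=(500, dim)) + offset
weights = fisher_weights(train_a, train_b)

# Test stream: 200 frames of speaker A followed by 200 frames of speaker B.
stream = np.vstack([
    rng.normal(0.0, 1.0, size=(200, dim)),
    rng.normal(0.0, 1.0, size=(200, dim)) + offset,
])
peaks, scores = detect_changes(stream, weights)
print("detected change points (frame indices):", peaks)
```

In this toy setting the weights concentrate on the one discriminative dimension, so noise in the remaining dimensions contributes little to the distance. An unweighted Euclidean distance would sum that noise with equal weight.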
Bibliographic reference. Kwon, Soonil / Narayanan, Shrikanth S. (2002): "Speaker change detection using a new weighted distance measure", In ICSLP-2002, 2537-2540.