EUROSPEECH 2003 - INTERSPEECH 2003
On-line speaker indexing is useful for multimedia applications such as archiving and browsing meetings or teleconferences. It sequentially detects the points where the speaker identity changes in a multi-speaker audio stream, and classifies each speaker segment. The main problem of on-line processing is that only the current and previous information in the data stream is available for any decision. To address this difficulty, we apply a predetermined set of reference speaker-independent models. This set allows more accurate speaker modeling and clustering without actually training speaker models on the target data. Once a speaker-independent model is selected from the reference set, it is progressively adapted into a speaker-dependent model. Experiments were performed with the HUB-4 Broadcast News Evaluation English Test Material (1999) and the Speaker Recognition Benchmark NIST Speech (1999). Results showed that our new technique gave 96.5% indexing accuracy on a telephone conversation data source and 84.3% accuracy on a broadcast news source.
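The select-then-adapt idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, purely for illustration, that each reference model is a single diagonal-covariance Gaussian over feature frames, that selection picks the model with the highest average log-likelihood, and that adaptation is a MAP-style running update of the mean. The function names and the `relevance` parameter are hypothetical.

```python
import numpy as np

def avg_log_likelihood(frames, mean, var):
    """Average per-frame log-likelihood under a diagonal Gaussian (assumed model form)."""
    diff = frames - mean
    ll = -0.5 * (np.log(2 * np.pi * var) + diff ** 2 / var).sum(axis=1)
    return float(ll.mean())

def select_reference(frames, ref_means, ref_vars):
    """Return the index of the reference model that best fits the segment."""
    scores = [avg_log_likelihood(frames, m, v) for m, v in zip(ref_means, ref_vars)]
    return int(np.argmax(scores))

def adapt_mean(mean, count, frames, relevance=16.0):
    """MAP-style mean update using only the data seen so far.

    `relevance` (hypothetical) weights the speaker-independent prior mean;
    `count` is the number of frames already absorbed by this model.
    """
    n = len(frames)
    prior_weight = relevance + count
    new_mean = (prior_weight * mean + frames.sum(axis=0)) / (prior_weight + n)
    return new_mean, count + n

# Two toy speaker-independent reference models in a 2-D feature space.
ref_means = np.array([[0.0, 0.0], [5.0, 5.0]])
ref_vars = np.ones((2, 2))

# A new speaker segment arrives; only current and past data are used.
segment = np.tile([4.5, 5.5], (50, 1))
idx = select_reference(segment, ref_means, ref_vars)
speaker_mean, seen = adapt_mean(ref_means[idx], 0, segment)
```

In this toy setup, the second reference model is selected and its mean moves from the speaker-independent value toward the segment's own statistics. Each further segment assigned to the same speaker would shift the mean further, which is the sense in which the model becomes speaker-dependent progressively.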
Bibliographic reference. Kwon, Soonil / Narayanan, Shrikanth (2003): "A method for on-line speaker indexing using generic reference models", In EUROSPEECH-2003, 2653-2656.