INTERSPEECH 2010
11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Speaker Tracking in an Unsupervised Speech Controlled System

Tobias Herbig (1), Franz Gerl (2), Wolfgang Minker (3)

(1) Nuance Communications Aachen GmbH, Germany
(2) SVOX Deutschland GmbH, Germany
(3) Universität Ulm, Germany

In this paper we present a technique to increase the robustness of a self-learning speech controlled system comprising speech recognition, speaker identification and speaker adaptation. Our goal is the automatic personalization of a speech controlled device for groups of 5-10 recurring speakers. Speakers should be identified and tracked across speaker turns solely by their voice patterns. Efficient information retrieval and the statistical representation of speaker characteristics have to be combined with reliable and flexible speaker identification. Even with limited adaptation data, e.g. 2-3 command and control utterances, speakers have to be tracked reliably to allow continuous adaptation of complex statistical models. We present a novel approach to speaker identification on different time scales based on a unified speech and speaker model. Experiments were carried out on a subset of the SPEECON database.

Bibliographic reference. Herbig, Tobias / Gerl, Franz / Minker, Wolfgang (2010): "Speaker tracking in an unsupervised speech controlled system", in INTERSPEECH-2010, 2666-2669.