Odyssey 2010: The Speaker and Language Recognition Workshop
Brno, Czech Republic
This paper presents a preliminary study on the use of Factor Analysis (FA) methods in an automatic speaker diarization process dedicated to meeting rooms. The diarization process, based on the top-down E-HMM approach, integrates FA-based speaker modeling in an additional resegmentation step that aims to refine the output segmentation. FA is classically applied in speaker recognition to deal with channel variability; here, two main schemes of FA application are proposed, dealing with (1) inter-speaker variability and (2) inter-segment variability. Several kinds of experiments were conducted on the dataset of the NIST/RT'09 evaluation campaign, with promising results. For instance, they show that both schemes proposed in this paper obtain competitive performance compared to the baseline process, despite the small amount of development data used for FA parameter estimation. Unexpectedly, they also tend to show that the inter-segment variability component can be helpful for speaker diarization.
Bibliographic reference. Tomasek, Pavel / Fredouille, Corinne / Matrouf, Driss (2010): "Factor analysis-based approaches applied to the speaker diarization task of meetings: a preliminary study", In Odyssey-2010, paper 024.