Non-Linear Speech Processing (NOLISP 03)

May 20-23, 2003
Le Croisic, France

Data-driven Speech Segmentation for Language Identification and Speaker Verification

Dijana Petrovska-Delacrétaz (1), Marcos Abalo (1), Asmaa El Hannani (1), Gérard Chollet (2)

(1) DIVA group, University of Fribourg, Switzerland
(2) ENST, CNRS-LTCI, Paris, France

The common denominator of many speech processing methods is the set of acoustic units chosen to represent the structure of the data. The majority of current systems use phones (or related units) as an atomic representation of speech. The major problems that arise when phone-based systems are developed are the possible mismatch with the data being used and the lack of transcribed databases. The set of speech units can also be learned from examples, as in data-driven approaches. We have already used data-driven acoustic speech units, denoted as Automatic Language Independent Speech Processing (ALISP) units, for segmental speaker verification experiments based on Multi-Layer Perceptrons and on Dynamic Time Warping (DTW). In this paper we give an overview of DTW-based speaker verification and present further developments of the data-driven ALISP speech segmentation for language identification experiments. The results confirm the applicability of the proposed method to these two tasks.

Bibliographic reference. Petrovska-Delacrétaz, Dijana / Abalo, Marcos / El Hannani, Asmaa / Chollet, Gérard (2003): "Data-driven speech segmentation for language identification and speaker verification", in NOLISP-2003, paper 029.