15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

DIAPIX-FL: A Symmetric Corpus of Problem-Solving Dialogues in First and Second Languages

Mirjam Wester (1), María Luisa García Lecumberri (2), Martin Cooke (3)

(1) University of Edinburgh, UK
(2) Universidad del País Vasco, Spain
(3) Ikerbasque, Spain

This paper describes a corpus of conversations recorded using an extension of the DiapixUK task: the Diapix Foreign Language corpus (DIAPIX-FL). English and Spanish native talkers were recorded speaking both English and Spanish. The bidirectionality of the corpus makes it possible to separate the effect of language (English or Spanish) from the effect of speaking in a first language (L1) or second language (L2). An acoustic analysis was carried out to examine changes in F0, voicing, intensity, spectral tilt and formants that might result from speaking in an L2. The effect of L1 and nativeness on turn types was also studied. The factors investigated were pausing, elongations, and incomplete words. Speakers displayed certain patterns, such as the overall percentage of voicing in their speech, that suggest an ongoing process of L2 phonological acquisition. When speaking in the non-native language, talkers also showed an increase in hesitation phenomena (pauses, elongations, incomplete turns), a decrease in the amount of speech produced and in speech rate, a reduction in F0 range, and a raised minimum F0. These changes are consistent with more tentative speech and may be used as indicators of non-nativeness.

Bibliographic reference. Wester, Mirjam / García Lecumberri, María Luisa / Cooke, Martin (2014): "DIAPIX-FL: a symmetric corpus of problem-solving dialogues in first and second languages", in INTERSPEECH-2014, 509-513.