SLaTE 2015 - Workshop on Speech and Language Technology in Education

Leipzig, Germany
September 4-5, 2015

Inter-annotator Agreement for a Speech Corpus Pronounced by French and German Language Learners

Odile Mella, Dominique Fohr, Anne Bonneau

Université de Lorraine, LORIA, UMR 7503, Inria, Villers-lès-Nancy, France
CNRS, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, France

This paper presents the results of an investigation of inter-annotator agreement for the non-native and native French part of the IFCASL corpus. This large bilingual speech corpus for French and German language learners was manually annotated by several annotators. The manual annotation is the starting point that will be used both to improve the automatic segmentation algorithms and to derive diagnosis and feedback. Agreement is evaluated by comparing the manual alignments of seven annotators to the manual alignment of an expert, for 18 sentences. Whereas results for the presence of the devoicing diacritic show a certain degree of disagreement between the annotators and the expert, consistency between the annotators and the expert is very good for temporal boundaries as well as for insertions and deletions. Overall agreement on boundaries between the annotators and the expert is good, with a mean deviation of 7.6 ms and 93% of boundaries within 20 ms.
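
The two boundary figures quoted in the abstract (a mean deviation and a share of boundaries within 20 ms) can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes that each annotator boundary has already been paired with the corresponding expert boundary (so insertions and deletions are handled separately), that the deviation is the absolute time difference, and that "within 20 ms" includes the 20 ms limit. The function name `boundary_agreement` and the boundary times are hypothetical and chosen only for illustration.

```python
def boundary_agreement(annotator_ms, expert_ms, tolerance_ms=20.0):
    """Return (mean absolute deviation in ms, fraction within tolerance).

    Assumption: both lists hold boundary times in milliseconds, already
    paired one-to-one (same phone boundary at the same index).
    """
    if len(annotator_ms) != len(expert_ms):
        raise ValueError("boundary lists must be paired one-to-one")
    if not annotator_ms:
        raise ValueError("at least one boundary pair is required")

    # Absolute time difference for each paired boundary.
    deviations = [abs(a - e) for a, e in zip(annotator_ms, expert_ms)]

    mean_deviation = sum(deviations) / len(deviations)
    # Assumption: the tolerance is inclusive (deviation <= tolerance).
    within = sum(d <= tolerance_ms for d in deviations) / len(deviations)
    return mean_deviation, within


# Made-up boundary times, not data from the IFCASL corpus.
expert = [100.0, 250.0, 400.0, 620.0]
annotator = [105.0, 245.0, 430.0, 620.0]

mean_dev, within = boundary_agreement(annotator, expert)
print(f"mean deviation: {mean_dev:.1f} ms, within 20 ms: {within:.0%}")
```

In this made-up example the deviations are 5, 5, 30 and 0 ms, so the mean deviation is 10.0 ms and three of the four boundaries (75%) fall within the 20 ms tolerance.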


Bibliographic reference. Mella, Odile / Fohr, Dominique / Bonneau, Anne (2015): "Inter-annotator agreement for a speech corpus pronounced by French and German language learners", in SLaTE-2015, 143-147.