Interspeech'2005 - Eurospeech
This paper introduces the new OLdenburg LOgatome speech corpus (OLLO) and outlines the design considerations behind its creation. OLLO differs from previous automatic speech recognition (ASR) corpora in that it specifically targets (1) a fair comparison between human and machine speech recognition performance, and (2) a realistic representation of the intrinsic variabilities in speech that are significant for ASR systems. To enable an unbiased human-machine comparison, OLLO is designed for the recognition of individual phonemes embedded in logatomes, that is, three-phoneme sequences that carry no semantic information. A balanced set of target phonemes important for human and automatic speech recognition was chosen, drawing on pilot ASR studies and cross-fertilization from the field of human speech intelligibility testing. OLLO represents several intrinsic variabilities in speech by recording 40 speakers from four German dialect regions and by covering six articulation characteristics. Results from preliminary phonetic time-labeling and ASR experiments are promising and consistent with the variabilities represented in the corpus.
Bibliographic reference. Wesker, Thorsten / Meyer, Bernd / Wagener, Kirsten / Anemüller, Jörn / Mertins, Alfred / Kollmeier, Birger (2005): "Oldenburg logatome speech corpus (OLLO) for speech recognition experiments with humans and machines", In INTERSPEECH-2005, 1273-1276.