The 1st Workshop on Child, Computer and Interaction (WOCCI2008)
Chania, Crete, Greece
This paper introduces the acquisition, evaluation and baseline Automatic Speech Recognition (ASR) experiments of a novel corpus containing speech from a set of impaired and unimpaired young speakers. A group of 14 speakers with different speech disorders recorded several sessions over a 57-word Spanish vocabulary, yielding more than 3 hours of speech. In addition, a parallel corpus of more than 6 hours of speech from unimpaired young speakers has been recorded with the same vocabulary. The impaired speech corpus has been evaluated through manual labeling to detect the mispronunciations made by the speakers; the outcome shows that 17.31% of the phonemes were either mispronounced or deleted in an isolated word task. A baseline evaluation of a state-of-the-art ASR system shows a Word Error Rate (WER) of 35.02% when using speaker-independent models based on adult speech. This WER is reduced to 27.60% using models based on children's speech and further reduced to 15.35% using speaker-dependent models. Finally, experiments on connected speech with 4 impaired speakers show how ASR performance degrades in the transition from isolated words to connected speech, due to the language impairments of the speakers and the coarticulation in connected speech.
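For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between the recognizer output and the reference transcription, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring tool used by the authors; the function names `word_error_rate` and `relative_reduction` and the Spanish example words are hypothetical and are not taken from the corpus.

```python
def word_error_rate(reference, hypothesis):
    """Standard WER: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline, improved):
    """Relative reduction of an error rate with respect to a baseline."""
    return (baseline - improved) / baseline


# Hypothetical example: one substituted word out of three.
print(word_error_rate("casa perro gato", "casa pero gato"))
# Relative WER reductions implied by the figures in the abstract.
print(relative_reduction(35.02, 27.60))
print(relative_reduction(35.02, 15.35))
```

Under this definition, the reported figures correspond to a relative WER reduction of about 21% when moving from adult-speech models to children's-speech models, and of about 56% when moving from adult-speech models to speaker-dependent models.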
Bibliographic reference. Saz, Oscar / Rodríguez, William / Lleida, Eduardo / Vaquero, Carlos (2008): "A novel corpus of children's disordered speech", In WOCCI-2008, paper 13.