8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Large Vocabulary Continuous Speech Recognition in Greek: Corpus and an Automatic Dictation System

Vassilios Digalakis, Dimitrios Oikonomidis, D. Pratsolis, N. Tsourakis, C. Vosnidis, N. Chatzichrisafis, V. Diakoloukas

Technical University of Crete, Greece

In this work, we present the creation of the first Greek speech corpus and the implementation of a dictation system for workflow improvement in the field of journalism. The work was carried out under the Logotypografia project (Logos = logos, speech; Typografia = typography), sponsored by the General Secretariat of Research and Development of Greece. This paper presents the process of data collection (texts and recordings), waveform processing (transcriptions), creation of the acoustic and language models, and the final integration into a fully functional dictation system. The evaluation of this system is also presented. The Logotypografia database, described here, is available from ELRA.

Bibliographic reference. Digalakis, Vassilios / Oikonomidis, Dimitrios / Pratsolis, D. / Tsourakis, N. / Vosnidis, C. / Chatzichrisafis, N. / Diakoloukas, V. (2003): "Large vocabulary continuous speech recognition in Greek: corpus and an automatic dictation system", In EUROSPEECH-2003, 1565-1568.