Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

SpeechDat Multilingual Speech Databases for Teleservices: Across the Finish Line

Harald Höge (1), Christoph Draxler (2), Henk van den Heuvel (3), Finn Tore Johansen (4), Eric Sanders (3), Herbert S. Tropf (1)

(1) Siemens AG, Corporate Technology Department, Munich, Germany; (2) Ludwig-Maximilian University Munich, Germany; (3) SPEX, Nijmegen, Netherlands; (4) Telenor R&D, Kjeller, Norway

The goal of the SpeechDat project is to develop spoken language resources for speech recognisers that can realise voice-driven teleservices. SpeechDat created speech databases for all official languages of the European Union, as well as some major dialectal varieties and minority languages. The databases range in size from 500 to 5000 speakers. In total, 20 databases were recorded over the fixed telephone network and 5 over the cellular network, and 3 databases were designed for speaker verification. The project has now been completed successfully. This paper briefly describes the project and addresses the validation of the databases, their availability to consortium members and third parties, publicity and awareness, and the spin-off of the project in speech recognition research.


Bibliographic reference. Höge, Harald / Draxler, Christoph / Heuvel, Henk van den / Johansen, Finn Tore / Sanders, Eric / Tropf, Herbert S. (1999): "SpeechDat multilingual speech databases for teleservices: across the finish line", In EUROSPEECH'99, 2699-2702.