1st Joint SIG-IL/Microsoft Workshop on Speech and Language Technologies for Iberian Languages

Porto Salvo, Portugal
September 3-4, 2009

A Catalan Broadcast Conversational Speech Database

Henrik Schulz, José A. R. Fonollosa

Department of Signal Theory and Communications, Technical University of Catalunya (UPC), Barcelona, Spain

Data-driven methods in speech and linguistic research and in system development require appropriate speech databases. A new Catalan speech database has been developed with a particular emphasis on broadcast conversational speech. The article describes the origin and nature of the broadcasts and their acoustic environment. Annotation and transcription provide statistics on specific phenomena of the exhibited speech, speaker characteristics, and acoustic events. The article concludes with prospective uses and limitations.

Index Terms: Catalan, audio video speech database, broadcast conversation, literal transcription, spontaneous speech


Bibliographic reference. Schulz, Henrik / Fonollosa, José A. R. (2009): "A Catalan broadcast conversational speech database", In SLTECH-2009, 27-29.