8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

OrienTel-Turkish: Telephone Speech Database Description and Notes on the Experience

Tolga Ciloglu (1), Dinc Acar (2), Ahmet Tokatli (1)

(1) Middle East Technical University, Ankara, Turkey
(2) Aselsan, Ankara, Turkey

OrienTel-Turkish comprises telephone speech recordings, with annotations, from 1700 Turkish speakers balanced in gender, dialect, age and calling environment; approximately one third of the calls were made over the fixed network and the rest over the mobile network. Each speaker contributes 48 items containing digits, digit/number strings, time/date expressions, phonetically rich words and sentences, command words, and answers to spontaneous questions. The paper describes the contents of the completed database and presents notes on the experience gained in preparing the textual content, recruiting speakers, annotating the recordings, and correcting errors. SAMPA-Turkish was created in the course of this work.

Bibliographic reference. Ciloglu, Tolga / Acar, Dinc / Tokatli, Ahmet (2004): "OrienTel-Turkish: Telephone Speech Database Description and Notes on the Experience", in INTERSPEECH-2004, 2725-2728.