ODYSSEY 2004 - The Speaker and Language Recognition Workshop

May 31 - June 3, 2004
Toledo, Spain

A Bilingual Multi-Modal Voice Corpus for Language and Speaker Recognition (LASR) Services

Steven D. Beck (1), Reva Schwartz (2), Hirotaka Nakasone (3)

(1) BAE Systems Austin, TX, USA; (2) USSS Washington DC, USA; (3) FBI Quantico, VA, USA

Language and channel variations are two important concerns currently affecting practical automatic language and speaker recognition performance. To address these challenges, a corpus of speech was collected from 100 bilingual speakers in each of three language pairs (Arabic-English, Korean-English, and Spanish-English). The recordings were made in highly controlled conditions using multiple microphones simultaneously, each with different measured response characteristics. The speakers were asked to perform a set of speaking tasks including conversations, text-independent readings, and prescribed text readings. These tasks were performed in English and in each speaker's native language. The equipment, the recording procedures, and the data formats are presented, along with a preliminary analysis of recorded signal quality.


Bibliographic reference. Beck, Steven D. / Schwartz, Reva / Nakasone, Hirotaka (2004): "A bilingual multi-modal voice corpus for language and speaker recognition (LASR) services", in ODYS-2004, 265-270.