ISCA Archive Odyssey 2004

A bilingual multi-modal voice corpus for language and speaker recognition (LASR) services

Steven D. Beck, Reva Schwartz, Hirotaka Nakasone

Language and channel variations are two important concerns currently affecting the practical performance of automatic language and speaker recognition. To address these challenges, a corpus of speech was collected from 100 bilingual speakers in each of three language pairs (Arabic-English, Korean-English, and Spanish-English). The recordings were made under highly controlled conditions using multiple microphones simultaneously, each with different measured response characteristics. The speakers were asked to perform a set of speaking tasks including conversations, text-independent readings, and prescribed text readings. These tasks were performed in English and in each speaker’s native language. The equipment, the recording procedures, and the data formats are presented, along with a preliminary analysis of recorded signal quality.


Cite as: Beck, S.D., Schwartz, R., Nakasone, H. (2004) A bilingual multi-modal voice corpus for language and speaker recognition (LASR) services. Proc. The Speaker and Language Recognition Workshop (Odyssey 2004), 265-270

@inproceedings{beck04_odyssey,
  author={Steven D. Beck and Reva Schwartz and Hirotaka Nakasone},
  title={{A bilingual multi-modal voice corpus for language and speaker recognition (LASR) services}},
  year=2004,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2004)},
  pages={265--270}
}