ISCA - International Speech Communication Association

SCOOT: Databases

Modern speech technology relies on databases (or corpora) to train applications based on machine learning.

Corpus linguistics uses databases as a resource for language studies.


The European Language Resources Association (ELRA) is a non-profit organisation whose main mission is to make Language Resources (LRs) for Human Language Technologies (HLT) available to the community at large.

To achieve this goal, ELRA carries out a wide variety of activities around LRs, including Identification & Distribution, Production & Validation, Technology Evaluation, and Information Dissemination on HLT.


The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations and government research laboratories, based in the USA. LDC was formed in 1992 to address the critical data shortage then facing language technology research and development.

Corpora can be very expensive, but many of the classic ones are free or relatively cheap, e.g. TIMIT, the Wall Street Journal Corpus and Resource Management.


© Copyright 2024 - ISCA International Speech Communication Association - All rights reserved.
