SCOOT: Toolkits

Many speech researchers use software toolkits that implement popular algorithms, e.g. for defining and training an ASR system or building a synthesiser. Toolkits often provide recipes for common tasks.

  • ASR Toolkits

  • Synthesis Toolkits

  • The Speech Recognition Virtual Kitchen

    • aims to 'provide straightforward access to the tools and techniques used by advanced researchers' by the use of virtual machines, which 'provide a consistent, end-to-end environment for experimentation, without the need to install other software or data, and cope with their incompatibilities and peculiarities.'


SCOOT: Linguistics




Linguistics is the scientific study of language and involves an analysis of language form, language meaning, and language in context.

  • Psycholinguistics is the study of the relationships between linguistic behaviour and psychological processes, including the process of language acquisition.
  • Phonetics is the study of speech sounds.
  • Phonology is the study of how speech sounds are organised in a language.


SCOOT: Databases

Modern speech technology relies on databases (or corpora) to train applications based on machine learning.

Corpus linguistics uses databases as a resource for language studies.


The European Language Resources Association (ELRA) is a non-profit organisation whose main mission is to make Language Resources (LRs) for Human Language Technologies (HLT) available to the community at large.

To achieve this goal, ELRA carries out a wide variety of activities around LRs, including identification and distribution, production and validation, technology evaluation, and information dissemination on HLT.


The Linguistic Data Consortium (LDC), based in the USA, is an open consortium of universities, libraries, corporations and government research laboratories. LDC was formed in 1992 to address the critical data shortage then facing language technology research and development.

Corpora can be very expensive, but many of the classic ones are free or relatively cheap, e.g. TIMIT, the Wall Street Journal Corpus, and Resource Management.