|Saturday, February 11, 2012 by Chris Wellekens|
5-2-3 LDC Newsletter (January 2012)
In this newsletter:
LDC Celebrates its 20th Anniversary!
2012 marks LDC’s 20th Anniversary year – officially on April 15 – but this is cause for a yearlong celebration! From our founding in 1992 as a data repository and language resource distribution center, our online catalog has grown to include over 500 databases in 60 languages that have been licensed by over 3000 organizations from 80 different nations. This data has been made available through donations, funded projects at LDC or elsewhere, community initiatives, and from LDC resources, an indication of the collective strength of this consortium. And, LDC has evolved from an organization that shares language resources to one that also is at the forefront of language technology research that includes the development of new data resources, software tools, and standards and best practices.
2012 LDC Survey – Be on the Lookout!
It’s been four years since our last survey of LDC members and data licensees and we would like to again ask you to share your views on LDC and its language resources as well as your thoughts about data distribution in general and the impact of social media on language-related research and technology development. These topics are particularly timely as LDC enters its 20th anniversary year.
The 2012 LDC Survey will be sent to every person and organization that licensed LDC data and/or joined LDC as a Member during the period from 2009 through 2011. Those who complete the survey on or before February 7, 2012 will make their organization eligible for a $500 benefit to be applied to any corpus or membership purchase in 2012. LDC will conduct a blind drawing and one lucky winner will be chosen from the pool of respondents.
Many thanks for your continued support and for your participation in the 2012 Survey!
Membership Discounts for MY 2012 Still Available
If you are considering joining for Membership Year 2012 (MY2012), there is still time to save on membership fees. Any organization which joins or renews membership for 2012 through Thursday, March 1, 2012, is entitled to a 5% discount on membership fees. Organizations which held membership for MY2011 can receive a 10% discount on fees provided they renew prior to March 1, 2012. For further information on pricing, please consult our Announcements page or contact LDC.
(1) 2006 NIST Speaker Recognition Evaluation Test Set Part 2 was developed by LDC and National Institute of Standards and Technology (NIST). It contains 568 hours of conversational telephone and microphone speech in English, Arabic, Bengali, Chinese, Farsi, Hindi, Korean, Russian, Spanish, Thai and Urdu and associated English transcripts used as test data in the NIST-sponsored 2006 Speaker Recognition Evaluation (SRE).
The task of the 2006 SRE evaluation was speaker detection, that is, to determine whether a specified speaker is speaking during a given segment of conversational telephone speech. The task was divided into 15 distinct and separate tests involving one of five training conditions and one of four test conditions. Further information about the test conditions and additional documentation is available at the NIST web site for the 2006 SRE and within the 2006 SRE Evaluation Plan.
LDC has previously published 2006 NIST Speaker Recognition Evaluation Training Set and 2006 NIST Speaker Recognition Evaluation Test Set Part 1.
The speech data in this release was collected by LDC as part of the Mixer project, in particular Mixer Phases 1, 2 and 3. The Mixer project supports the development of robust speaker recognition technology by providing carefully collected and audited speech from a large pool of speakers recorded simultaneously across numerous microphones and in different communicative situations and/or in multiple languages. The data is mostly English speech, but includes some speech in Arabic, Bengali, Chinese, Farsi, Hindi, Korean, Russian, Spanish, Thai and Urdu.
The telephone speech segments are multi-channel data collected simultaneously from a number of auxiliary microphones. The files are organized into four types: two-channel excerpts of approximately 10 seconds, two-channel conversations of approximately 5 minutes, summed-channel conversations also of approximately 5 minutes and a two-channel conversation with the usual telephone speech replaced by auxiliary microphone data in the putative target speaker channel. The auxiliary microphone conversations are also of approximately five minutes in length. English language transcripts in .ctm format were produced using an automatic speech recognition (ASR) system.
2006 NIST Speaker Recognition Evaluation Test Set Part 2 is distributed on seven DVD-ROM.
2012 Subscription Members will automatically receive two copies of this corpus. 2012 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$2000.
(2) TORGO Database of Dysarthric Articulation was developed by the University of Toronto's departments of Computer Science and Speech Language Pathology in collaboration with the Holland-Bloorview Kids Rehabilitation Hospital in Toronto, Canada. It contains approximately 23 hours of English speech data, accompanying transcripts and documentation from 8 speakers (5 males, 3 females) with cerebral palsy (CP) or amyotrophic lateral sclerosis (ALS) and from 7 speakers (4 males, 3 females) from a non-dysarthric control group.
CP and ALS are examples of dysarthria which is caused by disruptions in the neuro-motor interface that distort motor commands to the vocal articulators, resulting in atypical and relatively unintelligible speech in most cases. The TORGO database is primarily a resource for developing advanced automatic speaker recognition (ASR) models suited to the needs of people with dysarthria, but it is also applicable to non-dysarthric speech. The inability of modern ASR to effectively understand dysarthric speech is a problem since the more general physical disabilities often associated with the condition can make other forms of computer input, such as computer keyboards or touch screens, difficult to use.
The data consists of aligned acoustics and measured 3D articulatory features from the speakers carried out using the 3D AG500 electro-magnetic articulograph (EMA) system (Carstens Medizinelektronik GmbH, Lenglern, Germany) with fully-automated calibration. This system allows for 3D recordings of articulatory movements inside and outside the vocal tract, thus providing a detailed window on the nature and direction of speech-related activity.
All subjects read text consisting of non-words, short words and restricted sentences from a 19-inch LCD screen. The restricted sentences included 162 sentences from the sentence intelligibility section of Assessment of intelligibility of dysarthric speech (Yorkston & Beukelman, 1981) and 460 sentences derived from the TIMIT database. The unrestricted sentences were elicited by asking participants to spontaneously describe 30 images in interesting situations taken randomly from Webber Photo Cards - Story Starters (Webber, 2005), designed to prompt students to tell or write a story.
Data is organized by speaker and by the session in which each speaker recorded data. Each speaker's directory contains 'Session' directories which encapsulate data recorded in the respective visit and occasionally, a 'Notes' directory which can include Frenchay assessments (test for the measurement, description and diagnosis of dysarthria), notes about sessions (e.g., sensor errors), and other relevant notes.
TORGO Database of Dysarthric Articulation is distributed on 4 DVD-ROM.
2012 Subscription Members will automatically receive two copies of this corpus. 2012 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$1200.
|Back to Top|