Contents
- 1 . Editorial
- 2 . ISCA News
- 3 . SIG's activities
- 3-1 . SLaTE
- 4 . Future ISCA Conferences and Workshops(ITRW)
- 4-1 . INTERSPEECH 2008
- 4-2 . INTERSPEECH 2009
- 4-3 . INTERSPEECH 2010
- 4-4 . ITRW on experimental linguistics
- 4-5 . International Conference on Auditory-Visual Speech Processing AVSP 2008
- 4-6 . Christian Benoit workshop on Speech and Face to Face Communication
- 4-7 . CfP Second IEEE Spoken Language Technology Workshop Goa
- 5 . Books, databases and softwares
- 5-1 . Books
- 5-2 . LDC News
- 5-3 . Question Answering on speech transcripts (QAst)
- 5-4 . ELRA- Language Resources Catalogue-Update
- 5-5 . MusicSpeech group
- 6 . Jobs openings
- 6-1 . ATT - Labs Research: Research Staff Positions - Florham Park, NJ
- 6-2 . Summer Intern positions at Motorola Schaumburg Illinois USA
- 6-3 . Nuance: Software engineer speech dialog tools
- 6-4 . Nuance: Speech scientist London UK
- 6-5 . Nuance: Research engineer speech engine
- 6-6 . Nuance RESEARCH ENGINEER SPEECH DIALOG SYSTEMS:
- 6-7 . Research Position in Speech Processing at Nagoya Institute of Technology,Japan
- 6-8 . C/C++ Programmer Munich, Germany
- 6-9 . Speech and Natural Language Processing Engineer at M*Modal, Pittsburgh.PA,USA
- 6-10 . Senior Research Scientist -- Speech and Natural Language Processing at M*Modal, Pittsburgh, PA,USA
- 6-11 . Postdoc position at LORIA, Nancy, France
- 6-12 . Internships at Motorola Labs Schaumburg
- 6-13 . Masters in Human Language Technology
- 6-14 . PhD positions at Supelec,
- 6-15 . Speech Faculty Position at CMU, Pittsburgh, Pennsylvania
- 6-16 . Opened positions at Microsoft: Danish Linguist (M/F)
- 6-17 . Opened positions at Microsoft: Swedish Linguist (M/F)
- 6-18 . Opened positions at Microsoft: Dutch Linguist (M/F)
- 6-19 . PhD position at Orange Lab
- 6-20 . Social speech scientist at Wright Patterson AFB, Ohio, USA
- 6-21 . Professeur a PHELMA du Grenoble INP (in french)
- 6-22 . POSTDOCTORAL FELLOWSHIP OPENING AT ICSI Berkeley
- 6-23 . PhD positions at GIPSA (formerly ICP) Grenoble France
- 6-24 . PhD in speech signal processing at Infineon Sophia Antipolis
- 6-25 . PhD position at Institut Eurecom Sophia Antipolis France
- 6-26 . Two PhD's positions at the University of Karlsruhe Germany
- 6-27 . Job opening at TFH Berlin University of Applied Sciences, Department of Computer Sciences and Media, Germany
- 6-28 . Offre d' Allocation de Recherche - Rentree Universitaire 2008 (in french)
- 6-29 . Theses de l' ecole doctorale MITT, Universite Paul Sabatier Toulouse III (mainly in french)
- 6-30 . Cambridge University Research Position in Speech processing
- 6-31 . Head of NLP at Voxid UK
- 7 . Journals
- 7-1 . Papers accepted for FUTURE PUBLICATION in Speech Communication
- 7-2 . Journal of Multimedia User Interfaces
- 7-3 . CURRENT RESEARCH IN PHONOLOGY AND PHONETICS: INTERFACES WITH NATURAL LANGUAGE PROCESSING
- 7-4 . IEEE Signal Processing Magazine: Special Issue on Digital Forensics
- 7-5 . Special Issue on Integration of Context and Content for Multimedia Management
- 7-6 . CfP Speech Communication: Special Issue On Spoken Language Technology for Education
- 7-7 . CfP Special Issue on Processing Morphologically Rich Languages IEEE Trans ASL
- 8 . Forthcoming events supported (but not organized) by ISCA
- 9 . Future Speech Science and Technology Events
- 9-1 . Call for participation INFILE@CLEF2008 Evaluation
- 9-2 . AERFAISS'08 Bilbao
- 9-3 . Speech production workshop: Paris
- 9-4 . Seminaires du Dpt Parole et Cognition du GIPSA (ex ICP Grenoble) (in french)
- 9-5 . 6th Intl Conference on Content-based Multimedia Indexing CBMI '08
- 9-6 . IIS2008 Workshop on Spoken Language and Understanding and Dialog Systems
- 9-7 . HLT Workshop on Mobile Language Technology (ACL-08)
- 9-8 . 4TH TUTORIAL AND RESEARCH WORKSHOP PIT08
- 9-9 . YRR-2008 Young Researchers' Roundtable
- 9-10 . SIGIR 2008 workshop: Searching Spontaneous Conversational Speech
- 9-11 . Summer school New Trends in Pattern Recognition for Language Bilbao Spain
- 9-12 . 2nd Workshop on Analytics for Noisy Unstructured Text Data
- 9-13 . eNTERFACE 2008 Orsay Paris
- 9-14 . 2nd IEEE Intl Conference on Semantic Computing
- 9-15 . EUSIPCO-2008 - 16th European Signal Processing Conference - Lausanne Switzerland
- 9-16 . 5th Joint Workshop on Machine Learning and Multimodal Interaction MLMI 2008
- 9-17 . TSD 2008 11th Int.Conf. on Text, Speech and Dialogue
- 9-18 . Third Workshop on Speech in Mobile and Pervasive Environments
- 9-19 . 50th International Symposium ELMAR-2008
- 9-20 . Dynamique de la nasalite (in french) Ile de Porquerolles
- 9-21 . 2008 International Workshop on Multimedia Signal Processing
- 9-22 . 4th IBM Watson Emerging leaders in Multimedia at IBM Watson.
- 9-23 . 2008 IEEE Intl Workshop on MACHINE LEARNING FOR SIGNAL PROCESSING
- 9-24 . V Jornadas en Tecnologia de Habla and Evaluation campaigns Bilbao Spain
- 9-25 . 10th International Conference on Multimodal Interfaces (ICMI 2008)
- 9-26 . 9th International Conference on Signal Processing
- 9-27 . 8th International Seminar on Speech Production - ISSP 2008
- 9-28 . 1st CfP 5th International MultiMedia Modeling Conference (MMM2009)
1 . Editorial
Dear Members,
The Board has taken an important decision: INTERSPEECH 2011 will take place in Florence, Italy. I am sure that it will be successful and will attract many of you to this wonderful city, cradle of the European Renaissance.
Meanwhile, life is going on. You have to prepare your trip to Brisbane. But do not forget all the appealing workshops listed below.
I still receive interesting job offers: I draw the attention of our young members to the possibilities of thesis funding and postdoc positions.
We are still working to improve ISCApad with the efficient help of Laurence Liu, a student of Helen Meng's from Hong Kong.
Please pay attention to our section ISCA News: the association needs your help.
Prof. em. Chris Wellekens
Institut Eurecom
France
2 . ISCA News
2-1 . ISCA Scientific Achievement Medalist 2008
ISCA Scientific Achievement Medal for 2008
It is with great pleasure that I announce the ISCA Medalist for 2008 - Hiroya Fujisaki. Prof. Fujisaki has contributed to the speech research community in so many aspects, in speech analysis, synthesis and prosody, that it will be a very hard task for me to summarize his long list of achievements. He is also the founder of the ICSLP series of conferences which, being now fully integrated as one of ISCA's yearly conferences, will have its 10th anniversary this year.
2-2 . INTERSPEECH 2011 in Florence
ISCA announces with great pleasure that the venue for
Interspeech 2011 will be FLORENCE.
2-3 . Help ISCA serve you better
The ISCA board is always interested in improving its activities and the membership services it provides. To help us with this, could you please send us your ideas/comments/suggestions/impressions? We would be most grateful if you could take a moment to complete the form on the ISCA website: http://www.isca-speech.org/index.php and send us your feedback.
Your message will be sent to the ISCA secretariat: secretariat@isca-speech.org
Please enter ideas/comments/suggestions/impressions you may have on any new (or old) activities and membership services.
Please note: you can send us your comments anonymously, if you so wish.
Eva Hajicova - Membership Services
3 . SIG's activities
3-1 . SLaTE
The International Speech Communication Association Special Interest Group (ISCA SIG) on
Speech and Language Technology in Education
A special interest group was created in mid-September 2006 at the Interspeech 2006 conference in
The next SLaTE ITRW will be in 2009 in
The purpose of the International Speech Communication Association (ISCA) Special Interest Group on Speech and Language Technology in Education (SLaTE) shall be to promote interest in the use of speech and natural language processing for education; to provide members of ISCA with a special interest in speech and language technology in education with a means of exchanging news of recent research developments and other matters of interest in Speech and Language Technology in Education; to sponsor meetings and workshops on that subject that appear to be timely and worthwhile, operating within the framework of ISCA's by-laws for SIGs; and to provide and make available resources relevant to speech and language technology in education, including text and speech corpora, analysis tools, analysis and generation software, research papers and generated data.
4 . Future ISCA Conferences and Workshops(ITRW)
4-1 . INTERSPEECH 2008
INTERSPEECH 2008 incorporating SST 08
September 22-26, 2008
Brisbane Convention & Exhibition Centre
Brisbane, Australia
http://www.interspeech2008.org/
Interspeech is the world's largest and most comprehensive conference on Speech
Science and Speech Technology. We invite original papers in any related area,
including (but not limited to):
Human Speech Production, Perception and Communication;
Speech and Language Technology;
Spoken Language Systems; and
Applications, Resources, Standardisation and Evaluation
In addition, a number of Special Sessions on selected topics have been organised and we invite you to submit for these also (see website for a complete list).
Interspeech 2008 has two types of submission formats: Full 4-page Papers and
Short 1-page Papers. Prospective authors are invited to submit papers in either
format via the conference website by 7 April 2008.
Important Dates
Paper Submission: Monday, 7 April 2008, 3pm GMT
Notification of Acceptance/Rejection: Monday, 16 June 2008, 3pm GMT
Early Registration Deadline: Monday, 7 July 2008, 3pm GMT
Tutorial Day: Monday, 22 September 2008
Main conference: 23-26 September 2008
For more information please visit the website http://www.interspeech2008.org
Chairman: Denis Burnham, MARCS, University of Western Sydney.
4-2 . INTERSPEECH 2009
Brighton, UK,
Conference Website
Chairman: Prof. Roger Moore, University of Sheffield.
4-3 . INTERSPEECH 2010
Chiba, Japan
Conference Website
ISCA is pleased to announce that INTERSPEECH 2010 will take place in Makuhari-Messe, Chiba, Japan, September 26-30, 2010. The event will be chaired by Keikichi Hirose (Univ. Tokyo), and will have as a theme "Towards Spoken Language Processing for All - Regardless of Age, Health Conditions, Native Languages, Environment, etc."
4-4 . ITRW on experimental linguistics
August 2008, Athens, Greece
Website
Prof. Antonis Botinis
4-5 . International Conference on Auditory-Visual Speech Processing AVSP 2008
Dates: 26-29 September 2008
Location: Moreton Island, Queensland, Australia
Website: http://express.hid.ri.cmu.edu/AVSP2008/Main.html
AVSP 2008 will be held as an ISCA Tutorial and Research Workshop at
Tangalooma Wild Dolphin Resort on Moreton Island from the 26-29
September 2008. AVSP 2008 is a satellite conference to Interspeech 2008,
being held in Brisbane from the 22-26 September 2008. Tangalooma is
located close to Brisbane, so attendance at AVSP 2008
can easily be combined with participation in Interspeech 2008.
Auditory-visual speech production and perception by human and machine is
an interdisciplinary and cross-linguistic field which has attracted
speech scientists, cognitive psychologists, phoneticians, computational
engineers, and researchers in language learning studies. Since the
inaugural workshop in Bonas in 1995, Auditory-Visual Speech Processing
workshops have been organised on a regular basis (see an overview at the
avisa website). In line with previous meetings, this conference will
consist of a mixture of regular presentations (both posters and oral),
and lectures by invited speakers.
Topics include but are not limited to:
- Machine recognition
- Human and machine models of integration
- Multimodal processing of spoken events
- Cross-linguistic studies
- Developmental studies
- Gesture and expression animation
- Modelling of facial gestures
- Speech synthesis
- Prosody
- Neurophysiology and neuro-psychology of audition and vision
- Scene analysis
Paper submission:
Details of the paper submission procedure will be available on the
website in a few weeks' time.
Chairs:
Simon Lucey
Roland Goecke
Patrick Lucey
4-6 . Christian Benoit workshop on Speech and Face to Face Communication
NEW Deadline for sending one page abstract = JUNE 9TH
Ten years after our colleague Christian Benoît left us, the mark that
he left is still very vivid in the international community. There will
soon be several occasions to honour his memory: during the next
Interspeech conference (Christian was for a long time secretary of ESCA,
the future ISCA; the association is a French association of the
type described in the 1901 law, and its official headquarters are still
in Grenoble), as well as during the next AVSP workshop (a workshop of
which he was one of the creators). The Christian Benoît Association was
created in 1999 and regularly awards young researchers the "Christian
Benoît prize" to promote their research (the 4th prize was awarded to
the phonetician Susanne Fuchs in 2007). The Christian Benoît Association
(http://www.icp.inpg.fr/ICP/_communication.fr.html#prixcb), along with
ICP, now the Speech and Cognition Department of Gipsa-lab
(http://www.gipsa-lab.inpg.fr), is
organizing a workshop/summer school in Christian Benoît's memory, in
line with his innovative and enthusiastic research style and aiming at
exploring the topic of "Speech and Face to Face Communication" from a
multidisciplinary perspective: neuroscience, cognitive psychology,
phonetics, linguistics and computer modelling. The workshop "Speech and
Face to Face Communication" will be organized around 11 invited
talks. All researchers in the field are invited to participate
through a call for papers, and students are warmly encouraged to
attend the workshop and present their work.
Website: http://www.icp.inpg.fr/~dohen/face2face/
Deadline for sending one page abstracts: June 9th (see Call for Papers
<http://ww.icp.inpg.fr/%7Edohen/face2face/CallForPapers.html>)
You can join the Christian Benoît Association by sending 15
euros (active members) or 45 euros or more (benefactors) to Pascal Perrier,
secretary of the association: Pascal.Perrier@gipsa-lab.inpg.fr.
4-7 . CfP Second IEEE Spoken Language Technology Workshop Goa
Call for Papers:
Second IEEE Spoken Language Technology Workshop
Goa, India
December 15-18, 2008
The Second IEEE Spoken Language Technology (SLT) workshop will be held from December 15 to December 18, 2008 in Goa, India. The goal of this workshop is to bring both the speech processing and natural language processing communities together to share and present recent advances in various areas of spoken language technology, with the expectation that such a confluence of the researchers from both communities will foster new ideas, collaborations and new research directions in this area. The SLT 2008 workshop is endorsed by both ISCA and ACL organizations and eligible participants can apply for ISCA grants (http://www.isca-speech.org/grants.html).
Spoken language technology is a vibrant research area, with the potential for significant impact on government and industrial applications especially with the diversity and challenges offered by the multilingual business climates of today's world.
The workshop solicits papers on all aspects of spoken language technology:
o Spoken language understanding
o Spoken document summarization
o Machine translation for speech
o Spoken dialog systems
o Spoken language generation
o Spoken document retrieval
o Human-computer interaction (HCI)
o Speech data mining
o Information extraction from speech
o Question answering from speech
o Multimodal processing
o Spoken language based assistive technologies
o Spoken language systems and applications
o Spoken language databases and standards
In addition, this year's workshop will feature three special sessions:
1) Challenges in Asian spoken language processing with special emphasis on Indian languages
2) Mining human-human conversations: A resource for building efficient human-machine dialogs
3) Spoken Language on the go: Challenges and Opportunities for spoken language processing on mobile devices
Submissions for the Technical Program
-------------------------------------
The workshop program will consist of tutorials, oral and poster presentations, and panel discussions. Attendance will be limited with priority for those who will present technical papers; registration is required of at least one author for each paper. Submissions are encouraged on any of the topics listed above. The style guide, templates, and submission form will follow the IEEE ICASSP style. Three members of the Scientific Committee will review each paper. The workshop proceedings will be published on a CD-ROM.
Important Dates
---------------
Camera-ready paper submission deadline: August 8, 2008
Hotel Reservation and Workshop registration opens: August 8, 2008
Paper Acceptance / Rejection: September 15, 2008
Hotel Reservation and Early Registration closes: October 5, 2008
Workshop: December 15-18, 2008
For more information visit the SLT 2008 website http://slt2008.org or contact the organizing committee at info@slt2008.org if you have any questions.
5 . Books, databases and softwares
5-1 . Books
La production de la parole
Author: Alain Marchal, Universite d'Aix en Provence, France
Publisher: Hermes Lavoisier
Year: 2007
Speech enhancement-Theory and Practice
Author: Philipos C. Loizou, University of Texas, Dallas, USA
Publisher: CRC Press
Year: 2007
Speech and Language Engineering
Editor: Martin Rajman
Publisher: EPFL Press, distributed by CRC Press
Year: 2007
Human Communication Disorders/ Speech therapy
This interesting series is listed on the Wiley website
Incursões em torno do ritmo da fala
Author: Plinio A. Barbosa
Publisher: Pontes Editores (city: Campinas)
Year: 2006 (released 11/24/2006)
(In Portuguese, abstract attached.) Website
Speech Quality of VoIP: Assessment and Prediction
Author: Alexander Raake
Publisher: John Wiley & Sons, UK-Chichester, September 2006
Website
Self-Organization in the Evolution of Speech, Studies in the Evolution of Language
Author: Pierre-Yves Oudeyer
Publisher: Oxford University Press
Website
Speech Recognition Over Digital Channels
Authors: Antonio M. Peinado and Jose C. Segura
Publisher: Wiley, July 2006
Website
Multilingual Speech Processing
Editors: Tanja Schultz and Katrin Kirchhoff
Publisher: Elsevier Academic Press, April 2006
Website
Reconnaissance automatique de la parole: Du signal a l'interpretation
Authors: Jean-Paul Haton
Christophe Cerisara
Dominique Fohr
Yves Laprie
Kamel Smaili
392 Pages Publisher: Dunod
The Application of Hidden Markov Models in Speech Recognition By Mark Gales and Steve Young (University of Cambridge)
http://dx.doi.org/10.1561/2000000004
102 pages
Publisher: Berlin Institute of Technology
Year: 2008
Website http://www.ub.tu-berlin.de/index.php?id=1843
Usability of Speech Dialog Systems
Listening to the Target Audience
Series: Signals and Communication Technology
Hempel, Thomas (Ed.)
2008, X, 175 p. 14 illus., Hardcover
ISBN: 978-3-540-78342-8
Speech and Language Processing, 2nd Edition
By Daniel Jurafsky, James H. Martin
- Published May 16, 2008 by Prentice Hall.
- Copyright 2009
- Dimensions 7" x 9-1/4"
- Pages: 1024
- Edition: 2nd.
- ISBN-10: 0-13-187321-0
- ISBN-13: 978-0-13-187321-6
An explosion of Web-based language techniques, merging of distinct fields, availability of phone-based dialogue systems, and much more make this an exciting time in speech and language processing. The first of its kind to thoroughly cover language technology – at all levels and with all modern technologies – this book takes an empirical approach to the subject, based on applying statistical and other machine-learning algorithms to large corpora. KEY TOPICS: Builds each chapter around one or more worked examples demonstrating the main idea of the chapter, using the examples to illustrate the relative strengths and weaknesses of various approaches. Adds coverage of statistical sequence labeling, information extraction, question answering and summarization, advanced topics in speech recognition, and speech synthesis. Revises coverage of language modeling, formal grammars, statistical parsing, machine translation, and dialog processing. MARKET: A useful reference for professionals in any of the areas of speech and language processing.
5-2 . LDC News
Membership Mailbag - 'Penn' Treebanks and Recent Directions in English Treebanking
The LDC Membership Office responds to over 4000 emailed queries a year, and, over time, we've noticed that some questions tend to crop up with regularity. To address the questions that you, our data users, have asked, we'd like to introduce our new Membership Mailbag series of newsletter articles. This month we will look into the differences between the 'Penn' Treebanks and review recent directions in English treebanking.
Treebank-2 and Treebank-3 both contain 1 million words of Wall Street Journal (WSJ) text and a small sample of ATIS-3 data that have been annotated using the Treebank II annotation style, plus a part-of-speech tagged version of the Brown corpus. Treebank-3 is considered a superset of Treebank-2. That is, if you are undecided between Treebank-2 and -3, in most instances the best choice would be Treebank-3. Treebank-3 corrects known technical errors in Treebank-2, and it adds Switchboard data which has been tagged and dysfluency-annotated, and a small portion of the Brown corpus which has been parsed in the Treebank II annotation style.
Note, however, that there are a few items missing from Treebank-3 that are found in Treebank-2. Treebank-2 contains the complete parsed Brown corpus done in the older Treebank I annotation style; this is not contained in Treebank-3. Also, Treebank-3 does not include the tgrep software for extracting data, but tgrep and a newer version, tgrep2, are freely available online. Finally, Treebank-3 does not contain the raw Wall Street Journal (WSJ) text, but organizations can obtain this by request.
Much recent treebanking has focused on languages other than English, but English treebanking efforts did not come to an end with the release of Treebank-3. Ongoing work uses an updated Treebank II annotation style and consists of two types of annotation: straight treebanking, and treebanking in combination with another kind of annotation. Straight treebank annotation can be found in corpora such as English Chinese Translation Treebank v 1.0 and English-Arabic Treebank v 1.0. In these corpora, the Chinese or Arabic source texts have been translated into English, then POS-tagged and treebanked, thus making them suitable for machine translation work as well. Additional translation treebanks are planned for release and will feature cleaner translation and contain substantially more data.
Corpora which combine treebanking with another type of annotation include the English Conversational Telephone Speech Treebank with Structural Metadata, to be released later this year. This treebank is annotated for structural metadata including fillers, disfluencies and sentence/semantic units, and also tagged for syntactic structure, and so evaluates the impact of metadata extraction (MDE) on parsing information. While these newer releases are smaller than the Penn Treebanks, the improved Treebank II annotation style has a very high rate of inter-annotator agreement. Additionally, the source texts are more varied in both domain and style than the WSJ texts that constitute the bulk of Penn Treebank.
Got a question? About LDC data? Forward it to ldc@ldc.upenn.edu. The answer may appear in a future Membership Mailbag article.
New Publications
(1) Chinese Proposition Bank 2.0 (CPB2.0) is a continuation of the Chinese Proposition Bank project, which aims to create a corpus of Chinese text annotated with information about basic semantic propositions. Chinese Proposition Bank 1.0 consists of predicate-argument annotation on 250,000 words from Chinese Treebank 5.0. Chinese Proposition Bank 2.0 adds predicate-argument annotation on 500,000 words from Chinese Treebank 6.0. The data sources include newswire from Xinhua News Agency, articles from Sinorama Magazine, news from the website of the Hong Kong Special Administrative Region and transcripts from various Chinese broadcast news programs.
This release contains the predicate-argument annotation of 81,009 verb instances (11,171 unique verbs) and 14,525 noun instances (1,421 unique nouns). The annotation of nouns is limited to nominalizations that have a corresponding verb. The general annotation guidelines and the lexical guidelines (called frame files) for each verbal and nominal predicate are included in this release. Chinese Proposition Bank 2.0 is distributed via web download.
2008 Subscription Members will automatically receive two copies of this corpus on disc. 2008 Standard Members may request a copy as part of their 16 free membership corpora. Nonmembers may license this data for US$850.
*
(2) Hindi WordNet was developed by researchers at the Center for Indian Language Technology, Computer Science and Engineering Department, IIT Bombay. Wordnets are systems for analyzing the different lexical and semantic relations between words. Specifically, a wordnet is a word sense network in which words are grouped into semantically equivalent units called synsets. Each synset represents a lexical concept, and synsets are linked to each other by semantic relations (between synsets) and lexical relations (between words). Similar in design to the
Additional information about the development of Hindi Wordnet is available at the Hindi WordNet web site.
Hindi WordNet contains nouns, verbs, adjectives and adverbs. Each entry consists of the following elements:
1. Synset: a set of synonymous words. The words in the synset are arranged according to the frequency of usage.
2. Gloss: the concept. It consists of two parts:
Text definition: explains the concept denoted by the synset.
Example sentence: gives the usage of the words in the sentence.
3. Position in Ontology: An ontology is a hierarchical organization of concepts, or more specifically, a categorization of entities and actions. A separate ontological hierarchy exists for each syntactic category (noun, verb, adjective, adverb). Each synset is mapped into some place in the ontology.
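As an illustration only, the three elements of an entry listed above could be modelled roughly as follows. This is a minimal Python sketch: the class and field names (SynsetEntry, Gloss, ontology_path) are hypothetical and are not taken from the Java API distributed with Hindi WordNet, and the sample entry uses English placeholder words rather than real data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gloss:
    """The concept behind a synset: a definition plus a usage example."""
    definition: str  # text definition explaining the concept
    example: str     # example sentence showing the words in use


@dataclass
class SynsetEntry:
    """One entry: a synset, its gloss, and its place in the ontology."""
    words: List[str]          # synonymous words, most frequently used first
    gloss: Gloss
    category: str             # "noun", "verb", "adjective" or "adverb"
    ontology_path: List[str]  # hypothetical path from the root to the concept

    def head_word(self) -> str:
        # The synset is ordered by frequency of usage,
        # so the first word is the most frequent one.
        return self.words[0]


# Placeholder entry (English words, invented for illustration).
entry = SynsetEntry(
    words=["house", "home", "dwelling"],
    gloss=Gloss(
        definition="a building in which people live",
        example="They built a house near the river.",
    ),
    category="noun",
    ontology_path=["entity", "object", "artifact", "building"],
)

print(entry.head_word())
```

Keeping one ontology per syntactic category, as described above, would correspond here to storing a separate hierarchy for each value of the category field.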
This release of Hindi WordNet is made available as a complete Java application along with an API to facilitate further development. Hindi WordNet is distributed via web download.
2008 Subscription Members will automatically receive two copies of this corpus on disc, provided that they have submitted a signed copy of the User License Agreement for Hindi WordNet (LDC2008L02). 2008 Standard Members may request a copy as part of their 16 free membership corpora. Nonmembers may license this data for US$300.
*
(3) West Point Brazilian Portuguese Speech is a database of digital recordings of spoken Brazilian Portuguese designed and collected by staff and faculty of the Department of Foreign Languages (DFL) and Center for Technology Enhanced Language Learning (CTELL) to develop acoustic models for speech recognition systems. The
The data in this corpus was collected in March 1999 in
The speech was collected using four laptop computers running MS Windows. Three of the computers recorded with a 16 bit data size and sampling rate of 22050 Hz, the other laptop recorded with an 8 bit data size at a sampling rate of 11025 Hz. The recording script presented a visual display of the sentence to be recorded. The informant pressed a key and spoke the sentence. The recording was played back for review, allowing the utterance to be re-recorded. West Point Brazilian Portuguese Speech is distributed on one DVD-ROM.
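For readers comparing the two recording configurations, the raw data rate follows directly from the sampling rate and the sample size. The sketch below is only a rough estimate: it assumes uncompressed, single-channel PCM audio, which is an assumption and is not stated in the corpus description, and the function name is illustrative.

```python
def bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw data rate of uncompressed PCM audio, in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


# The two configurations mentioned in the corpus description.
rate_16bit = bytes_per_second(22050, 16)  # three of the laptops
rate_8bit = bytes_per_second(11025, 8)    # the fourth laptop

print(rate_16bit, rate_8bit, rate_16bit // rate_8bit)
```

Under these assumptions the 16-bit recordings take four times as much storage per second of speech as the 8-bit recordings.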
2008 Subscription Members will automatically receive two copies of this corpus. 2008 Standard Members may request a copy as part of their 16 free membership corpora. Nonmembers may license this data for US$500.
5-3 . Question Answering on speech transcripts (QAst)
The QAst organizers are pleased to announce the release of the development dataset for
the CLEF-QA 2008 track "Question Answering on Speech Transcripts" (QAst).
We take this opportunity to launch a first call for participation in
this evaluation exercise.
QAst is a CLEF-QA track that aims at providing an evaluation framework
for QA technology on speech transcripts, both manual and automatic.
A detailed description of this track is available at:
http://www.lsi.upc.edu/~qast
This is the second evaluation for the QAst track.
Last year (QAst 2007), factual questions were generated for two
distinct corpora (in English only). This year, in addition to
factual questions, some definition questions are generated, and five
corpora covering three different languages are used (3 corpora in
English, 1 in Spanish and 1 in French).
Important dates:
# 15 June 2008: evaluation set released
# 30 June 2008: submission deadline
The pilot track is organized jointly by the Technical University of
Catalonia (UPC), the Evaluations and Language resources Distribution
Agency (ELDA) and Laboratoire d'Informatique pour la Mécanique et les
Sciences de l'Ingénieur (LIMSI).
If you are interested in participating please send an email to Jordi
Turmo (turmo_AT_lsi.upc.edu) with "QAst" in the subject line.
5-4 . ELRA- Language Resources Catalogue-Update
5-5 . MusicSpeech group
Music and speech share numerous aspects (linguistic, structural, acoustic, cognitive), in their production as well as in their representation and perception. The purpose of this list is to inform its users of events dealing with the study of the links between music and speech. It thus aims to connect several communities, allowing each to benefit from a stimulating interaction.
As a member of the speech or music community, you are invited to
subscribe to musicspeech group. The group will be moderated and
maintained by IRCAM.
Group details:
* Name: musicspeech
* Home page: http://listes.ircam.fr/wws/info/musicspeech
* Email address: musicspeech@ircam.fr
Greg Beller, IRCAM,
moderator, musicspeech list
6 . Jobs openings
We invite all laboratories and industrial companies which have job offers to send them to the ISCApad editor: they will appear in the newsletter and on our website for free. (also have a look at http://www.isca-speech.org/jobs.html as well as http://www.elsnet.org/ Jobs)
6-1 . ATT - Labs Research: Research Staff Positions - Florham Park, NJ
ATT - Labs Research is seeking exceptional candidates for Research Staff positions. AT&T is the premier broadband, IP, entertainment, and wireless communications company in the U.S. and one of the largest in the world. Our researchers are dedicated to solving real problems in speech and language processing, and are involved in inventing, creating and deploying innovative services. We also explore fundamental research problems in these areas. Outstanding Ph.D.-level candidates at all levels of experience are encouraged to apply. Candidates must demonstrate excellence in research, a collaborative spirit and strong communication and software skills. Areas of particular interest are
- Large-vocabulary automatic speech recognition
- Acoustic and language modeling
- Robust speech recognition
- Signal processing
- Speaker recognition
- Speech data mining
- Natural language understanding and dialog
- Text and web mining
- Voice and multimodal search
AT&T Companies are Equal Opportunity Employers. All qualified candidates will receive full and fair consideration for employment. More information and application instructions are available on our website at http://www.research.att.com/. Click on "Join us". For more information, contact Mazin Gilbert (mazin at research dot att dot com).
6-2 . Summer Intern positions at Motorola Schaumburg Illinois USA
Motorola Labs - Center for Human Interaction Research (CHIR) located in Schaumburg Illinois, USA, is offering summer intern positions in 2008 (12 weeks each).
CHIR's mission:
Our research lab develops technologies that provide effortless access to rich communication, media and information services, based on natural, intelligent interaction. Our research aims at systems that adapt automatically and proactively to changing environments, device capabilities and to continually evolving knowledge about the user.
Intern profiles:
1) Acoustic environment/event detection and classification.
The successful candidate will be a PhD student near the end of his/her PhD study, skilled in signal processing and/or pattern recognition, with knowledge of Linux and C/C++ programming. Candidates with knowledge of acoustic environment/event classification are preferred.
2) Speaker adaptation for applications on speech recognition and spoken document retrieval.
The successful candidate must currently be pursuing a Ph.D. degree in EE or CS with a complete understanding of, and hands-on experience in, automatic speech recognition related research. Proficiency in Linux/Unix working environment and C/C++ programming. Strong GPA. A strong background in speaker adaptation is highly preferred.
3) Development of voice search-based web applications on a smartphone
We are looking for an intern candidate to help create an "experience" prototype based on our voice search technology. The app will be deployed on a smartphone and demonstrate intuitive and rich interaction with web resources. This intern project is oriented more towards software engineering than research. We target an intern with a master's degree and strong software engineering background. Mastery of C++ and experience with web programming (AJAX and web services) is required. Development experience on Windows CE/Mobile desired.
4) Integrated Voice Search Technology For Mobile Devices.
Candidate should be proficient in information retrieval, pattern recognition and speech recognition. Candidate should program in C++ and scripting languages such as Python or Perl in a Linux environment. Also, he/she should have knowledge of information retrieval or search engines.
We offer competitive compensation, fun-to-work environment and Chicago-style pizza.
If you are interested, please send your resume to:
Dusan Macho, CHIR-Motorola Labs
Email: dusan.macho@motorola.com
Tel: +1-847-576-6762
6-3 . Nuance: Software engineer speech dialog tools
In order to strengthen our Embedded ASR Research team, we are looking for a:
SOFTWARE ENGINEER SPEECH DIALOGUE TOOLS
As part of our team, you will be creating solutions for voice user interfaces for embedded applications on mobile and automotive platforms.
OVERVIEW:
- You will work in Nuance's Embedded ASR R&D team, developing technology, tools, and run-time software to enable our customers to develop and test embedded speech applications. Together with our team of speech and language experts, you will work on natural language dialogue systems for our customers in the Automotive and Mobile sector.
- You will work either at Nuance's Office in Aachen, a beautiful, old city right in the heart of Europe with great history and culture, or at Nuance's International Headquarters in Merelbeke, a small town just 5km away from the heart of the vibrant and picturesque city of Ghent, in the Flanders region of Belgium. Both Aachen and Ghent offer some of the most spectacular historic town centers in Europe, and are home to large international universities.
- You will work in an international company and cooperate with people at various locations, including in Europe, America and Asia. You may occasionally be asked to travel.
RESPONSIBILITIES:
- You will work on the development of tools and solutions for cutting edge speech and language understanding technologies for automotive and mobile devices.
- You will work on enhancing various aspects of our advanced natural language dialogue system, such as the layer of connected applications, the configuration setup, inter-module communication, etc.
- In particular, you will be responsible for the design, implementation, evaluation, optimization and testing, and documentation of tools such as GUI and XML applications that are used to develop, configure, and fine-tune advanced dialogue systems.
QUALIFICATIONS:
- You have a university degree in computer science, engineering, mathematics, physics, computational linguistics, or a related field.
- You have very strong software and programming skills, especially in C/C++, ideally also for embedded applications.
- You have experience with Python or other scripting languages.
- GUI programming experience is a strong asset.
The following skills are a plus:
- Understanding of communication protocols
- Understanding of databases
- Understanding of computational agents and related frameworks (such as OAA).
- A background in (computational) linguistics, dialogue systems, speech processing, grammars, and parsing techniques, statistics and machine learning, especially as related to natural language processing, dialogue, and representation of information
- You can work both as a team player and as goal-oriented independent software engineer.
- You can work in a multi-national team and communicate effectively with people of different cultures.
- You have a strong desire to make things really work in practice, on hardware platforms with limited memory and processing power.
- You are fluent in English and you can write high quality documentation.
- Knowledge of other languages is a plus.
CONTACT:
Please send your applications, including cover letter, CV, and related documents (maximum 5MB total for all documents, please) to
Deanna Roe Deanna.roe@nuance.com
Please make sure to document to us your excellent software engineering skills.
ABOUT US:
Nuance is the leading provider of speech and imaging solutions for businesses and consumers around the world. Every day, millions of users and thousands of businesses experience Nuance by calling directory assistance, requesting account information, dictating patient records, telling a navigation system their destination, or digitally reproducing documents that can be shared and searched. With more than 3000 employees worldwide, we are committed to making the user experience more enjoyable by transforming the way people interact with information and how they create, share and use documents. Making each of those experiences productive and compelling is what Nuance is about.
6-4 . Nuance: Speech scientist London UK
Nuance is the leading provider of speech and imaging solutions for businesses and consumers around the world. Every day, millions of users and thousands of businesses experience Nuance by calling directory assistance, requesting account information, dictating patient records, telling a navigation system their destination, or digitally reproducing documents that can be shared and searched. With more than 2000 employees worldwide, we are committed to making the user experience more enjoyable by transforming the way people interact with information and how they create, share and use documents. Making each of those experiences productive and compelling is what Nuance is about.
To strengthen our International Professional Services team, based in London, we are currently looking for a
Speech Scientist, London, UK
Nuance Professional Services (PS) has designed, developed, and optimized thousands of speech systems across dozens of industries, including directory search, call center automation, applications in telecom, finance, airline, healthcare, and other verticals; applications for video games, mobile dictation, enhanced search services, SMS, and in-car navigation. Nuance PS applications have automated approximately 7 billion phone conversations for some of the world's most respected companies, including British Airways, Vodafone, Amtrak, Bank of America, BellCanada, Citigroup, General Electric, NTT and Verizon.
The PS organization consists of energetic, motivated, and friendly individuals. The Speech Scientists in PS are among the best and brightest, with PhDs from universities such as Cambridge (UK), MIT, McGill, Harvard, Penn, CMU, and Georgia Tech, and having worked at research labs such as Bell Labs, Motorola Labs, and ATR (Japan), culminating in over 300 years of Speech Science experience and covering well over 20 languages.
Come and join Nuance PS and work on the latest technology from one of the prominent speech recognition technology providers, and make a difference in the way the world communicates.
Job Overview
As a Speech Scientist in the Professional Services group, you will work on automated speech recognition applications, covering a broad range of activities in all project phases, including the design, development, and optimization of the system. You will:
- Work across application development teams to ensure best possible recognition performance in deployed systems
- Identify recognition challenges and assess accuracy feasibility during the design phase
- Design, develop, and test VoiceXML grammars and create JSPs, Java, and ECMAScript grammars for dynamic contexts
- Optimize accuracy of applications by analyzing performance and tuning statistical language models, pronunciations, and acoustic models, including identifying areas for improvement by running the recognizer offline
- Contribute to the generation and presentation of client-facing reports
- Act as technical lead on more intensive client projects
- Develop methodologies, scripts, procedures that improve efficiency and quality
- Develop tools and enhance algorithms that facilitate deployment and tuning of recognition components
- Act as subject matter domain expert for specific knowledge domains
- Provide input into the design of future product releases
Required Skills
- MS or PhD in Computer Science, Engineering, Computational Linguistics, Physics, Mathematics, or related field (or equivalent)
- Strong analytical and problem solving skills and ability to troubleshoot issues
- Good judgment and quick-thinking
- Strong programming skills, preferably Perl or Python
- Excellent written and verbal communications skills
- Ability to scope work taking technical, business and time-frame constraints into consideration
- Works well in a team and in a fast-paced environment
Beneficial Skills
- Strong programming skills in either Perl, Python, Java, C/C++, or Matlab
- Speech recognition knowledge
- Strong pattern recognition, linguistics, signal processing, or acoustics knowledge
- Statistical data analysis
- Experience with XML, VoiceXML, and Wiki
- Ability to mentor or supervise others
- Additional language skills, e.g. French, Dutch, German, Spanish
6-5 . Nuance: Research engineer speech engine
In order to strengthen our Embedded ASR Research team, we are looking for a:
RESEARCH ENGINEER SPEECH ENGINE
As part of our team, you will be creating solutions for voice user interfaces for embedded applications on mobile and automotive platforms.
OVERVIEW:
- You will work in Nuance's Embedded ASR R&D team, developing, improving and maintaining core ASR engine algorithms for our customers in the Automotive and Mobile sector.
- You will work either at Nuance's Office in Aachen, a beautiful, old city right in the heart of Europe with great history and culture, or at Nuance's International Headquarters in Merelbeke, a small town just 5km away from the heart of the vibrant and picturesque city of Ghent, in the Flanders region of Belgium. Both Aachen and Ghent offer some of the most spectacular historic town centers in Europe, and are home to large international universities.
- You will work in an international company and cooperate with people at various locations, including in Europe, America and Asia. You may occasionally be asked to travel.
RESPONSIBILITIES:
- You will work on developing, improving and maintaining core ASR engine algorithms for cutting edge speech and natural language understanding technologies for automotive and mobile devices.
- You will work on the design and development of more efficient, flexible ASR search algorithms with high focus on low memory and processor requirements.
QUALIFICATIONS:
- You have a university degree in computer science, engineering, mathematics, physics, computational linguistics, or a related field. PhD is a plus.
- A background in (computational) linguistics, speech processing, ASR search, confidence values, grammars, statistics and machine learning, especially as related to natural language processing.
- You have very strong software and programming skills, especially in C/C++, ideally also for embedded applications.
The following skills are a plus:
- You have experience with Python or other scripting languages.
- Broad knowledge about architectures of embedded platforms and processors.
- Understanding of databases
- You can work both as a team player and as goal-oriented independent software engineer.
- You can work in a multi-national team and communicate effectively with people of different cultures.
- You have a strong desire to make things really work in practice, on hardware platforms with limited memory and processing power.
- You are fluent in English and you can write high quality documentation.
- Knowledge of other languages is a plus.
CONTACT:
Please send your applications, including cover letter, CV, and related documents (maximum 5MB total for all documents, please) to
Deanna Roe Deanna.roe@nuance.com
Please make sure to document to us your excellent software engineering skills.
ABOUT US:
Nuance is the leading provider of speech and imaging solutions for businesses and consumers around the world. Every day, millions of users and thousands of businesses experience Nuance by calling directory assistance, requesting account information, dictating patient records, telling a navigation system their destination, or digitally reproducing documents that can be shared and searched. With more than 3000 employees worldwide, we are committed to making the user experience more enjoyable by transforming the way people interact with information and how they create, share and use documents. Making each of those experiences productive and compelling is what Nuance is about.
6-6 . Nuance RESEARCH ENGINEER SPEECH DIALOG SYSTEMS:
In order to strengthen our Embedded ASR Research team, we are looking for a:
RESEARCH ENGINEER SPEECH DIALOGUE SYSTEMS
As part of our team, you will be creating speech technologies for embedded applications varying from simple command and control tasks up to natural language speech dialogues on mobile and automotive platforms.
OVERVIEW:
- You will work in Nuance's Embedded ASR research and production team, creating technology, tools and runtime software to enable our customers to develop embedded speech applications. In our team of speech and language experts, you will work on natural language dialogue systems that define the state of the art.
- You will work at Nuance's International Headquarters in Merelbeke, a small town just 5km away from the heart of the picturesque city of Ghent, in the Flanders region of Belgium. Ghent has one of the most spectacular historic town centers of Europe and is known for its unique vibrant yet cozy charm, and is home to a large international university.
- You will work in an international company and cooperate with people at various locations, including in Europe, America, and Asia. You may occasionally be asked to travel.
RESPONSIBILITIES:
- You will work on the development of cutting edge natural language dialogue and speech recognition technologies for automotive embedded systems and mobile devices.
- You will design, implement, evaluate, optimize, and test new algorithms and tools for our speech recognition systems, both for research prototypes and deployed products, including all aspects of dialogue systems design, such as architecture, natural language understanding, dialogue modeling, statistical framework, and so forth.
- You will help the engine process multi-lingual natural and spontaneous speech in various noise conditions, given the challenging memory and processing power constraints of the embedded world.
QUALIFICATIONS:
- You have a university degree in computer science, (computational) linguistics, engineering, mathematics, physics, or a related field. A graduate degree is an asset.
- You have strong software and programming skills, especially in C/C++, ideally for embedded applications. Knowledge of Python or other scripting languages is a plus.
- You have experience in one or more of the following fields:
  - dialogue systems
  - applied (computational) linguistics
  - natural language understanding
  - language generation
  - search engines
  - speech recognition
  - grammars and parsing techniques
  - statistics and machine learning techniques
  - XML processing
- You are a team player, willing to take initiative and assume responsibility for your tasks, and are goal-oriented.
- You can work in a multi-national team and communicate effectively with people of different cultures.
- You have a strong desire to make things really work in practice, on hardware platforms with limited memory and processing power.
- You are fluent in English and you can write high quality documentation.
- Knowledge of other languages is a strong asset.
CONTACT:
Please send your applications, including cover letter, CV, and related documents (maximum 5MB total for all documents, please) to
Deanna Roe Deanna.roe@nuance.com
ABOUT US:
Nuance is the leading provider of speech and imaging solutions for businesses and consumers around the world. Every day, millions of users and thousands of businesses experience Nuance by calling directory assistance, requesting account information, dictating patient records, telling a navigation system their destination, or digitally reproducing documents that can be shared and searched. With more than 3000 employees worldwide, we are committed to making the user experience more enjoyable by transforming the way people interact with information and how they create, share and use documents. Making each of those experiences productive and compelling is what Nuance is about.
6-7 . Research Position in Speech Processing at Nagoya Institute of Technology,Japan
Nagoya Institute of Technology is seeking a researcher for a
post-doctoral position in a new European Commission-funded project
EMIME ("Efficient multilingual interaction in mobile environment")
involving Nagoya Institute of Technology and five other European
partners, starting in March 2008 (see the project summary below).
The earliest starting date of the position is March 2007. The initial
duration of the contract will be one year, with a possibility for
prolongation (year-by-year basis, maximum of three years). The
position provides opportunities to collaborate with other researchers
in a variety of national and international projects. The competitive
salary is calculated according to qualifications based on NIT scales.
The candidate should have a strong background in speech signal
processing and some experience with speech synthesis and recognition.
Desired skills include familiarity with the latest technology,
including HTK, HTS, and Festival, at the source code level.
For more information, please contact Keiichi Tokuda
(http://www.sp.nitech.ac.jp/~tokuda/).
About us
Nagoya Institute of Technology (NIT), founded in 1905, is situated in
the world-quality manufacturing area of Central Japan (about one hour
and 40 minutes from Tokyo, and 36 minutes from Kyoto by Shinkansen).
NIT is a top-level educational institution of technology and is
one of the leaders of such institutions in Japan. EMIME will be
carried out at the Speech Processing Laboratory (SPL) in the Department of
Computer Science and Engineering of NIT. SPL is known for its
outstanding, continuous contributions to the development of high-performance,
high-quality open-source software: the HMM-based Speech Synthesis
System "HTS" (http://hts.sp.nitech.ac.jp/), the large vocabulary
continuous speech recognition engine "Julius"
(http://julius.sourceforge.jp/), and the Speech Signal Processing
Toolkit "SPTK" (http://sp-tk.sourceforge.net/). The laboratory is
involved in numerous national and international collaborative
projects. SPL also has close partnerships with many industrial
companies, in order to transfer its research into commercial
applications, including Toyota, Nissan, Panasonic, Brother Inc.,
Funai, Asahi-Kasei, ATR.
Project summary of EMIME
The EMIME project will help to overcome the language barrier by
developing a mobile device that performs personalized speech-to-speech
translation, such that a user's spoken input in one language is used
to produce spoken output in another language, while continuing to
sound like the user's voice. Personalization of systems for
cross-lingual spoken communication is an important, but little
explored, topic. It is essential for providing more natural
interaction and making the computing device a less obtrusive element
when assisting human-human interactions.
We will build on recent developments in speech synthesis using hidden
Markov models, which is the same technology used for automatic speech
recognition. Using a common statistical modeling framework for
automatic speech recognition and speech synthesis will enable the use
of common techniques for adaptation and multilinguality.
Significant progress will be made towards a unified approach for
speech recognition and speech synthesis: this is a very powerful
concept, and will open up many new areas of research. In this
project, we will explore the use of speaker adaptation across
languages so that, by performing automatic speech recognition, we can
learn the characteristics of an individual speaker, and then use those
characteristics when producing output speech in another language.
Our objectives are to:
1. Personalize speech processing systems by learning individual
characteristics of a user's speech and reproducing them in
synthesized speech.
2. Introduce a cross-lingual capability such that personal
characteristics can be reproduced in a second language not spoken
by the user.
3. Develop and better understand the mathematical and theoretical
relationship between speech recognition and synthesis.
4. Eliminate the need for human intervention in the process of
cross-lingual personalization.
5. Evaluate our research against state-of-the-art techniques and in a
practical mobile application.
6-8 . C/C++ Programmer Munich, Germany
Digital publishing AG is one of Europe's leading producers of interactive software for foreign language training. In our e-learning courses we want to place the emphasis on speaking and spoken language understanding. In order to strengthen our Research & Development Team in Munich, Germany, we are looking for experienced C or C++ programmers with at least 3 years' experience in the design and coding of sophisticated software systems under Windows.
We offer
- a creative working atmosphere in an international team of software engineers, linguists and editors working on challenging research projects in speech recognition and speech dialogue systems.
- participation in all phases of a product life cycle, as we are interested in the fast transfer of research results into products.
- the possibility to participate in international scientific conferences.
- a permanent job in the center of Munich.
- excellent possibilities for development within our fast growing company.
- flexible working times, competitive compensation and arguably the best espresso in Munich.
We expect
- several years of practical experience in software development in C or C++ in a commercial or academic environment.
- experience with parallel algorithms and thread programming.
- experience with object-oriented design of software systems.
- good knowledge of English or German.
Desirable is
- experience with optimization of algorithms.
- experience in statistical speech or language processing, preferably speech recognition, speech synthesis, speech dialogue systems or chatbots.
- experience with Delphi or Turbo Pascal.
Interested? We look forward to your application (preferably by e-mail):
digital publishing AG
Freddy Ertl
f.ertl@digitalpublishing.de
Tumblinger Straße 32
D-80337 München
Germany
6-9 . Speech and Natural Language Processing Engineer at M*Modal, Pittsburgh.PA,USA
M*Modal is a fast-moving speech technology company based in Pittsburgh, PA. Our portfolio of conversational speech recognition and natural language understanding technologies is widely recognized as the most advanced in the industry. We are a leading innovator in the field of conversational documentation services (CDS) - where speech recognition and natural language understanding are combined in a unique setup targeted to truly understand conversational speech and turn it directly into actionable and meaningful data. Our proprietary speech understanding technology - operating on M*Modal's computing grid hosted in our national data center - is already redefining the way clinical information is captured in healthcare.
We are seeking an experienced and dedicated speech and natural language processing engineer who wants to push the frontiers of conversational speech understanding. Join our renowned research and development team, and add to our unique blend of scientific and engineering excellence.
Responsibilities:
- You will be working with other members of the R&D team to continuously improve our speech and natural language understanding technologies.
- You will participate in designing and implementing algorithms, tools and methodologies in the area of automatic speech recognition and natural language processing/understanding.
- You will collaborate with other members of the R&D team to identify, analyze and resolve technical issues.
Requirements:
- Solid background in speech recognition, natural language processing, machine learning and information extraction.
- 2+ years of experience participating in software development projects
- Proficient with Java, C++ and scripting (e.g. Python, Perl, ...)
- Excellent analytical and problem-solving skills
- Integrate and communicate well in small R&D teams
- Masters degree in CS or related engineering fields
- Experience in a healthcare-related field a plus
In June 2007 M*Modal moved to a great new office space in the Squirrel Hill area of Pittsburgh. We are excited to be growing and are looking for individuals who have a passion for the work they do and are interested in becoming a member of a dynamic work group of smart passionate drivers who also know how to have fun.
M*Modal offers a top-notch benefits package that includes medical, dental and vision coverage, short-term disability, matching 401K savings plan, holidays, paid-time-off and tuition refund. If you would like to be considered for this opportunity, please send your resume and cover letter to Mary Ann Gamble at maryann.gamble@mmodal.com.
6-10 . Senior Research Scientist -- Speech and Natural Language Processing at M*Modal, Pittsburgh, PA,USA
M*Modal is a fast-moving speech technology company based in Pittsburgh, PA. Our portfolio of conversational speech recognition and natural language understanding technologies is widely recognized as the most advanced in the industry. We are a leading innovator in the field of conversational documentation services (CDS) - where speech recognition and natural language understanding are combined in a unique setup targeted to truly understand conversational speech and turn it directly into actionable and meaningful data. Our proprietary speech understanding technology - operating on M*Modal's computing grid hosted in our national data center - is already redefining the way clinical information is captured in healthcare.
We are seeking an experienced and dedicated senior research scientist who wants to push the frontiers of conversational speech understanding. Join our renowned research and development team, and add to our unique blend of scientific and engineering excellence.
Responsibilities:
- Plan and perform research and development tasks to continuously improve a state-of-the-art speech understanding system
- Take a leading role in identifying solutions to challenging technical problems
- Contribute original ideas and turn them into product-grade software implementations
- Collaborate with other members of the R&D team to identify, analyze and resolve technical issues
Requirements:
- Solid research & development background with 3+ years of experience in speech recognition research, covering at least two of the following topics: speech processing, acoustic modeling, language modeling, decoding, LVCSR, natural language processing/understanding, speaker verification/identification, audio mining
- Working knowledge of Machine Learning, Information Extraction and Natural Language Processing algorithms
- 3+ years of experience participating in large-scale software development projects using C++ and Java.
- Excellent analytical, problem-solving and communication skills
- PhD with focus on speech recognition or Masters degree with 3+ years industry experience working on automatic speech recognition
- Experience and/or education in medical informatics a plus
- Working experience in a healthcare related field a plus
In June 2007 M*Modal moved to a great new office space in the Squirrel Hill area of Pittsburgh. We are excited to be growing and are looking for individuals who have a passion for the work they do and are interested in becoming a member of a dynamic work group of smart passionate drivers who also know how to have fun.
M*Modal offers a top-notch benefits package that includes medical, dental and vision coverage, short-term disability, matching 401K savings plan, holidays, paid-time-off and tuition refund. If you would like to be considered for this opportunity, please send your resume and cover letter to Mary Ann Gamble at maryann.gamble@mmodal.com.
6-11 . Postdoc position at LORIA, Nancy, France
Building an articulatory model from ultrasound, EMA and MRI data
Postdoctoral position
Research project
An articulatory model comprises both the visible and the internal mobile articulators which are involved in speech articulation (the lower jaw, tongue, lips and velum) as well as the fixed walls (the palate, the rear wall of the pharynx). An articulatory model is dynamic since the articulators deform during speech production. Such a model has a potential interest in the field of language learning by providing visual feedback on the articulation performed by the learner, and many other applications.
Building an articulatory model is difficult because the different articulators have to be detected from specific image modalities: the lips are acquired through video, the tongue shape is acquired through ultrasound imaging with a high frame rate but these 2D images are very noisy. Finally, 3D images of all articulators can be obtained with MRI but only for sustained sounds (such as vowels) due to the long acquisition time of MRI images.
The subject of this post-doc is to construct a dynamic 3D model of the entire vocal tract by merging the 3D information available in the MRI acquisitions and temporal 2D information provided by the contours of the tongue visible on the ultrasound images or X-ray images.
We are working on the construction of an articulatory model within the European project ASPI (http://aspi.loria.fr/ ).
We have already built an acquisition system which allows us to obtain synchronized data from ultrasound, MRI, video and EM modalities.
Only a few complete articulatory models are currently available in the world and a real challenge in the field is to design set-ups and easy-to-use methods for automatically building the model of any speaker from 3D and 2D images. Indeed, the existence of more articulatory models would open new directions of research about speaker variability and speech production.
Objectives
The aim of the subject is to build a deformable model of the vocal tract from static 3D MRI images and dynamic 2D sequences. Previous work has addressed the modelling of the vocal tract, and especially of the tongue (M. Stone [1], O. Engwall [2]). Unfortunately, substantial human interaction is required to extract tongue contours from the images. In addition, these works often consider only one image modality, which reduces the reliability of the resulting model.
The aim of this work is to provide automatic methods for segmenting features in the images as well as methods for building a parametric model of the 3D vocal tract with these specific aims:
- The segmentation process is to be guided by prior knowledge about the vocal tract. In particular, shape, topological and regularity constraints must be considered.
- A parametric model of the vocal tract has to be defined (classical models are linear and built from a principal component analysis). Special emphasis must be put on the problem of matching the various features between the images.
- Besides classical geometric constraints, both the building and the assessment of the model will be guided by acoustic distances, in order to check the agreement between the sound synthesized from the model and the sound produced by the human speaker.
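As an illustration of the "classical" linear model mentioned in the second aim, the sketch below fits a principal component analysis to a set of 2D contours and re-synthesizes a contour from a few parameters. It is a minimal sketch only, not the method of the ASPI project: the function names, the synthetic arc-shaped contours and the assumption that all contours are already sampled at matching points are hypothetical choices made for this example.

```python
import numpy as np


def fit_linear_shape_model(contours, n_components):
    """Fit a PCA shape model to contours of shape (n_samples, n_points, 2).

    Assumes point i corresponds to the same anatomical location in every
    contour (the matching problem is taken as already solved).
    """
    data = np.asarray(contours, dtype=float)
    n_samples = data.shape[0]
    flat = data.reshape(n_samples, -1)        # one row per contour
    mean = flat.mean(axis=0)
    centered = flat - mean
    # Rows of vt are the principal directions of the centered data.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    variances = singular_values ** 2 / (n_samples - 1)
    components = vt[:n_components]
    explained = variances[:n_components] / variances.sum()
    return mean, components, explained


def encode(contour, mean, components):
    """Project one contour onto the model and return its parameters."""
    return components @ (np.asarray(contour, dtype=float).ravel() - mean)


def decode(params, mean, components):
    """Rebuild an (n_points, 2) contour from model parameters."""
    return (mean + np.asarray(params) @ components).reshape(-1, 2)


def make_synthetic_contours(n_samples=40, n_points=30, seed=0):
    """Hypothetical data: an arc deformed by two independent modes."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, np.pi, n_points)
    base = np.stack([np.cos(t), np.sin(t)], axis=1)
    mode_a = np.stack([np.zeros(n_points), np.sin(t)], axis=1)       # raise/lower
    mode_b = np.stack([np.sin(2.0 * t), np.zeros(n_points)], axis=1)  # front/back
    weights = rng.normal(size=(n_samples, 2))
    return (base[None]
            + weights[:, 0, None, None] * mode_a[None]
            + weights[:, 1, None, None] * mode_b[None])


if __name__ == "__main__":
    contours = make_synthetic_contours()
    mean, components, explained = fit_linear_shape_model(contours, n_components=2)
    params = encode(contours[0], mean, components)
    rebuilt = decode(params, mean, components)
    print("explained variance ratio:", explained)
    print("max reconstruction error:", np.abs(rebuilt - contours[0]).max())
```

Because the synthetic contours are generated from two deformation modes, two components are enough to rebuild them; real contours extracted from MRI or ultrasound images would need more components and a prior registration step.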
Skills and profile
The recruited person must have a solid background in computer vision and in applied mathematics. Information and demonstrations on the research topics addressed by the Magrit team are available at http://magrit.loria.fr/
References
[1] M. Stone: Modeling tongue surface contours from Cine-MRI images, Journal of Speech, Language, and Hearing Research, 2001.
[2] P. Badin, G. Bailly, L. Reveret: Three-dimensional linear articulatory modeling of tongue, lips and face based on MRI and video images, Journal of Phonetics, 2002, vol. 30, p. 533-553.
Contact
Interested candidates are invited to contact Marie-Odile Berger, berger@loria.fr, +33 3 54 95 85 01
Important information
This position is advertised in the framework of the national INRIA campaign for recruiting post-docs. It is a one-year position, renewable, beginning in fall 2008. The salary is 2,320€ gross per month.
Selection of candidates will be a two-step process. A first selection will be carried out internally by the Magrit group. The selected candidate's application will then be further processed for approval and funding by an INRIA committee.
Candidates must hold a doctoral thesis that is less than one year old (May 2007) or that will be defended before the end of 2008. If the defence has not taken place yet, candidates must specify the tentative date and jury for the defence.
Important - Useful links
Presentation of INRIA postdoctoral positions
To apply (be patient, loading this link takes time...)
6-12 . Internships at Motorola Labs Schaumburg
Motorola Labs - Center for Human Interaction Research (CHIR), located in Schaumburg, Illinois, USA, is offering summer intern positions in 2008 (12 weeks each).
CHIR's mission
Our research lab develops technologies that provide effortless access to rich communication, media and information services, based on natural, intelligent interaction. Our research aims at systems that adapt automatically and proactively to changing environments, device capabilities and continually evolving knowledge about the user.
Intern profiles
1) Acoustic environment/event detection and classification
The successful candidate will be a PhD student near the end of his/her PhD study, skilled in signal processing and/or pattern recognition, who knows Linux and C/C++ programming. Candidates with knowledge of acoustic environment/event classification are preferred.
2) Speaker adaptation for applications in speech recognition and spoken document retrieval
The successful candidate must currently be pursuing a Ph.D. degree in EE or CS, with a thorough understanding of and hands-on experience in research related to automatic speech recognition. Proficiency in a Linux/Unix working environment and in C/C++ programming, and a strong GPA, are expected. A strong background in speaker adaptation is highly preferred.
3) Development of voice search-based web applications on a smartphone
We are looking for an intern candidate to help create an "experience" prototype based on our voice search technology. The app will be deployed on a smartphone and will demonstrate intuitive and rich interaction with web resources. This intern project is oriented more towards software engineering than research. We target an intern with a master's degree and a strong software engineering background. Mastery of C++ and experience with web programming (AJAX and web services) are required. Development experience on Windows CE/Mobile is desired.
4) Integrated Voice Search Technology For Mobile Devices
Candidate should be proficient in information retrie