
1 . Editorial

 Dear members,

The outstanding phonetician Nick Clements passed away at the end of August. His French colleagues sent me his obituary. Our speech community will miss this excellent colleague and collaborator.

After the successful Interspeech in Brighton, it is time to think about your contributions to upcoming conferences, and in particular to Interspeech 2010, to be held in Chiba, Japan.

I met several of you in Brighton, and I thank all of you who told me how important this monthly newsletter is to you. To improve the quality of the information it contains, I ask all organizers of conferences, workshops and special issues to send me their calls for papers, with a description of their objectives, as soon as they have decided on the organization.

Many speech scientists have found new jobs through our list of job openings. If your company or university is planning to hire scientists, professors, postdocs or PhD students, please ask the persons responsible to contact me and to advertise in ISCApad.

If you recently wrote a book devoted to speech science or technology, send me its bibliographic details with a short description of its content. It will be advertised in ISCApad.

I try to gather as much news as possible about the speech community: please help me do better with your personal contributions. Thanks to all members.

 

Prof. em. Chris Wellekens 

Institut Eurecom

Sophia Antipolis
France 

public@isca-speech.org

 
 
 
 
 

2 . ISCA News

 

2-1 . Nick Clements' obituary

 G. Nick Clements’ death on Aug. 30, 2009 shocked everyone who was fortunate enough to know him. We have lost a great friend and one of the most gifted and productive phonologists of the last decades.
 
In order to keep Nick’s fertile spirit alive, the Laboratoire de Phonétique et Phonologie (LPP) has set up a web page where colleagues and friends can share memories and thoughts about Nick. This page will be updated frequently. It is accessible at http://lpp.univ-paris3.fr/equipe/nick_clements/remembering-nick-clements.html.
If you wish to contribute, please send text, images or videos to remember.nickclements@gmail.com
 
We will gather to remember Nick on Friday October 9th, 2009 from 6 pm at the LPP (19 rue des Bernardins, 75005 Paris).  If you wish to share your memories of Nick and celebrate his life and what he brought us, you are invited to join us.
For organizational reasons, we would appreciate knowing in advance whether you will join us. Please email remember.nickclements@gmail.com.

 

 

Nick’s colleagues of the LPP in Paris

 

 

An obituary by the members of the LPP*
 
G. Nick Clements was born on Oct. 5, 1940, in Cincinnati, Ohio, and attended Moses Brown School in Providence. In 1962, he graduated with high honors from Yale University, majoring in fine arts, and was elected to Phi Beta Kappa, which celebrates the most outstanding students of arts and sciences at America’s leading colleges and universities. After a year in Nashville as a classical music DJ, he served in the Army Signal Corps for two years, stationed in Germany. Following his service, he lived in Spain for several years, painting, studying art and writing for an English-language periodical. In 1968, he received a certificat from the Centre de Linguistique Quantitative, Faculté des Sciences, Université de Paris. From 1971 to 1973, he was adjunct professor of American English at the University of Paris 8. In 1973, he received his Ph.D. in linguistics from the School of Oriental and African Studies, University of London, defending a thesis on the Ewe language based on fieldwork in Ghana. He was a visiting scientist and lecturer at the department of foreign languages and linguistics at MIT (1973-1975), and held appointments as Assistant Professor and Associate Professor at Harvard (1975-1982). In 1982, he moved to Cornell University, where he was Professor of linguistics and Director of the Phonetics Laboratory. In 1992, he came to Paris, where he became Director of Research at the Centre National de la Recherche Scientifique (CNRS) and worked in the Laboratory of Phonetics and Phonology (UMR7018). He held this position until 2008, when he was elected Professor Emeritus. Clements was also an invited professor at various prestigious universities across the world, in Europe, the USA, India, Australia and elsewhere. He was also very active in the academic world.
In the last three years, he organized two widely attended international conferences at the Sorbonne University: one on “The phonetic bases of distinctive features” in 2006 and one on “Where do phonological features come from?” in 2007.
 
Clements’ research interests were wide-ranging, and he made outstanding contributions to phonology and the phonetics-phonology interface. He is best known for his groundbreaking work on syllable and feature theory and his pioneering work on the phonological systems of various African languages, including tonal and vowel-harmony systems. His recent cross-linguistic studies on phonological units have contributed to the design and development of theories and models of phonological representations and have led to a better understanding of the role of features in speech sound inventories. A characteristic feature of Clements’ work is his rigorous scientific method and his unusual gift for finding the most convincing arguments and drawing the clearest and most synthetic conclusions. Clements was not only an excellent connoisseur of the field and an expert on the language or languages studied, but also an outstanding theoretician and a highly trained phonetician. He left behind a tremendous body of work in phonology and phonetics. He wrote or co-authored five books and nearly 100 articles, including journal articles, book and encyclopedia chapters, and conference and working papers. He was productive until the end of his life, with some major contributions still to appear.
 
Nick Clements had several passions outside the field of linguistics. He was a music lover and was particularly knowledgeable about jazz. He played keyboard in a jazz workshop at a club in Paris in the last year of his life. He was also a passionate traveler and visited many parts of the world on all five continents. He traveled for both work and pleasure, and was fluent in several languages. But his greatest passion was his family: his wife and colleague, Dr. Annie Rialland, his children, William and Célia, and his brother, sisters and their families.
 
G. Nick Clements was a great linguist, endowed with an outstanding ability to listen, to guide, to inspire reflections, and to stimulate brainstorming and creative thinking. He was also gifted with noble human qualities: kind, compassionate, generous, and humble. He will forever be remembered fondly for that and much more.
 

 

Rachid Ridouane and all his colleagues 

of the Phonetics and Phonology Laboratory in Paris.

 
 

 


2-2 . Interspeech proceedings indexed by Thomson Reuters (ISI)

In the closing ceremony of Interspeech 2009, the ISCA president announced that she had finally received the acceptance from Thomson Reuters (ISI) to index Interspeech Proceedings of 2006 and 2007. The day after, she received the same answer from the Engineering Index of Elsevier. She will now start the process of indexing the proceedings of 2008 and 2009 in these two databases and continue the efforts concerning other major citation indexes.

2-3 . Message to students

Dear students,

The International Speech Communication Association (ISCA) is now opening an online system to build a database of résumés of researchers and students working in the various fields of speech communication.
The goal of this service is to provide a centralized place where interested employers and corporations can search for potential candidates.

Please be advised that the posting service will be updated at four-month intervals. The next update will be in mid-October 2009.

We encourage all of you to upload an updated version of your résumé to http://www.isca-speech.org/resumes/ and wish you good luck and a fruitful career.


Professor Helen Meng 


3 . Future ISCA Conferences and Workshops (ITRW)

 

3-1 . (2010-09-26) CfP INTERSPEECH 2010 Chiba Japan

FIRST CALL FOR PAPERS

 

INTERSPEECH is the world's largest and most comprehensive conference on issues surrounding the science and technology of spoken language processing, both in humans and in machines. It is our great pleasure to host INTERSPEECH 2010 in Japan, the birthplace of ICSLP, which has previously hosted two ICSLPs, in Kobe and Yokohama. The theme of INTERSPEECH 2010 is "Spoken Language Processing for All Ages, Health Conditions, Native Languages and Environments". INTERSPEECH 2010 emphasizes an interdisciplinary approach covering all aspects of speech science and technology, from basic theories to applications. Besides regular oral and poster sessions, plenary talks by internationally renowned experts, tutorials, exhibits, and special sessions are planned. We invite you to submit original papers in any related area, including but not limited to:

HUMAN SPEECH PRODUCTION, PERCEPTION AND COMMUNICATION

* Human speech production
* Human speech and sound perception
* Linguistics, phonology and phonetics
* Discourse and dialogue
* Prosody (e.g. production, perception, prosodic structure, modeling)
* Paralinguistic and nonlinguistic cues (e.g. emotion and expression)
* Physiology and pathology of spoken language
* Spoken language acquisition, development and learning
* Speech and other modalities (e.g. facial expression, gesture)

SPEECH AND LANGUAGE TECHNOLOGY

* Speech analysis and representation
* Speech segmentation
* Audio segmentation and classification
* Speaker turn detection
* Speech enhancement
* Speech coding and transmission
* Voice conversion
* Speech synthesis and spoken language generation
* Automatic speech recognition
* Spoken language understanding
* Language and dialect identification
* Cross-lingual and multi-lingual speech processing
* Multimodal/multimedia signal processing (including sign languages)
* Speaker characterization and recognition
* Signal processing for music and song
* Spoken language technology for prosthesis, rehabilitation, wellness and welfare

SPOKEN LANGUAGE SYSTEMS AND APPLICATIONS

* Spoken dialogue systems
* Systems for information extraction/retrieval
* Systems for spoken language translation
* Applications for aged and handicapped persons
* Applications for learning and education
* Other applications

RESOURCES, STANDARDIZATION AND EVALUATION

* Spoken language resources and annotation
* Evaluation and standardization of spoken language systems

 PAPER SUBMISSION

Papers for the INTERSPEECH 2010 proceedings should be up to four pages in length and conform to the format given in the paper preparation guidelines and author kits which will be available on the INTERSPEECH 2010 website along with the Final Call for Papers. Optionally, authors may submit additional files, such as multimedia files, to be included on the Proceedings CD-ROM. Authors shall also declare that their contributions are original and not being submitted for publication elsewhere (e.g. another conference, workshop, or journal). Papers must be submitted via the on-line paper submission system, which will open early in 2010. The deadline for submitting a paper is 30 April 2010. This date will not be extended. Inquiries regarding paper submissions should be directed via email to submission@interspeech2010.org.

LANGUAGE

The working language of the conference is English.

IMPORTANT DATES

Paper submission deadline: 30 April 2010

Notification of acceptance or rejection: 2 July 2010

Camera-ready paper due: 9 July 2010

Early registration deadline: 28 July 2010

Conference dates: 26-30 September 2010

WEBSITE & MAIL

http://www.interspeech2010.org/

mail: office@interspeech2010.org

VENUE

Makuhari Messe International Conference Hall
Nakase 2-1, Mihama-ku, Chiba-city, Chiba 261-0023, Japan
http://www.m-messe.co.jp/en/access/index.html

 

 

 

3-2 . (2011-08-27) INTERSPEECH 2011 Florence Italy

Interspeech 2011

Palazzo dei Congressi, Florence, Italy, August 27-31, 2011.

Organizing committee

Piero Cosi (General Chair),

Renato De Mori (General Co-Chair),

Claudia Manfredi (Local Chair),

Roberto Pieraccini (Technical Program Chair),

Maurizio Omologo (Tutorials),

Giuseppe Riccardi (Plenary Sessions).

More information: www.interspeech2011.org


4 . Workshops and conferences supported (but not organized) by ISCA

 

4-1 . (2009-11-05) Workshop on Child, Computer and Interaction

Call for Papers

The Workshop on Child, Computer and Interaction (wocci2009.fbk.eu) will be held in Boston on November 5th, 2009. 
For registration visit
The Workshop is a satellite event of the Eleventh International
Conference on Multi-modal Interfaces, held this year jointly with the Workshop on
Machine Learning for Multimodal Interaction (ICMI-MLMI 2009), which will take place
at the same venue on November 2-4, 2009.
This Workshop aims at bringing together researchers and practitioners from
universities and industry working in all aspects of child-machine interaction including
computer, robotics and multi-modal interfaces. Children are special both at the
acoustic/linguistic level and at the interaction level. The Workshop provides a
unique opportunity for bringing together different research communities from
cognitive science, robotics, speech processing, linguistics and application areas
such as medical and education. Various state-of-the-art components can be
presented here as key components for next generation child centred computer
interaction. Technological advances are increasingly necessary in a world where
education and health pose growing challenges to the core well-being of our
societies. Notable examples are remedial treatments for children with or without
disabilities and individualised attention. The Workshop should serve to present
recent advancements in core technologies as well as experimental systems and
prototypes.
Technical Scope
The technical scope of the workshop includes, but is not limited to:
● Speech Interfaces: acoustic and linguistic analysis of children's speech, discourse
analysis of spoken language in child-machine interaction, age-dependent
characteristics of spoken language, automatic speech recognition for children and
spoken dialogue systems
● Multi-modality and Robotics: multi-modal child-machine interaction, multi-modal
input and output interfaces, including robotic interfaces, intrusive, non-intrusive
devices for environmental data processing, pen or gesture/visual interfaces
● User Modelling: user modelling and adaptation, usability studies accounting for
age preferences in child-machine interaction
● Cognitive Models: internal learning models, personality types, user-centred and
participatory design
● Application Areas: training systems, educational software, gaming interfaces,
medical conditions and diagnostic tools
Paper submission
Authors are invited to submit papers in any technical areas relevant to the workshop.
The technical committee will select papers for oral/poster presentations.
Demonstrations are especially welcome. Instructions for paper submission are
available at the Workshop website. An electronic version of the Workshop proceedings will be published by ACM.
 
Chairs
Kay Berkling
(Inline Internet Online GmbH,
Germany)
Diego Giuliani
(FBK, Italy)
Shrikanth Narayanan
(Univ. Southern California, USA)
 

4-2 . (2009-12-13) ASRU 2009

IEEE ASRU2009
Automatic Speech Recognition and Understanding Workshop
Merano, Italy December 13-17, 2009
http://www.asru2009.org/
 
We are happy to inform you that registration for IEEE ASRU 2009 is now open at: http://www.asru2009.org/index.php/registration
Please note the following:
1) At least one author per paper needs to register.
2) We have a limited number of slots for registration, and priority will be given to authors of accepted papers until October 7, 2009. After that date, registrations will be accepted on a first-come, first-served basis.
The eleventh biennial IEEE workshop on Automatic Speech Recognition
and Understanding (ASRU) will be held on December 13-17, 2009.
The ASRU workshops have a tradition of bringing together
researchers from academia and industry in an intimate and
collegial setting to discuss problems of common interest in
automatic speech recognition and understanding.

Workshop topics

• automatic speech recognition and understanding
• human speech recognition and understanding
• speech to text systems
• spoken dialog systems
• multilingual language processing
• robustness in ASR
• spoken document retrieval
• speech-to-speech translation
• spontaneous speech processing
• speech summarization
• new applications of ASR.

The workshop program will consist of invited lectures, oral
and poster presentations,  and panel discussions. Prospective
 authors are invited to submit full-length, 4-6 page papers,
including figures and references, to the ASRU 2009 website
http://www.asru2009.org/.
All papers will be handled and reviewed electronically.
The website will provide you with further details. Please note
that the submission dates for papers are strict deadlines.

IMPORTANT DATES

Paper submission deadline         July 15, 2009
Paper notification of acceptance     September 3, 2009
Demo session proposal deadline        September 24, 2009
Early registration deadline        October 7, 2009
Workshop                 December 13-17, 2009


Please note that the number of attendees will be limited and
priority will be given to paper presenters. Registration will
be handled via the ASRU 2009 website,
http://www.asru2009.org/, where more information on the workshop
will be available.

General Chairs
    Giuseppe Riccardi, U. Trento, Italy
    Renato De Mori, U. Avignon, France

Technical Chairs
    Jeff Bilmes, U. Washington, USA
    Pascale Fung, HKUST, Hong Kong China
    Shri Narayanan, USC, USA
    Tanja Schultz, U. Karlsruhe, Germany

Panel Chairs
    Alex Acero, Microsoft, USA
    Mazin Gilbert, AT&T, USA

Demo Chairs
    Alan Black, CMU, USA
    Piero Cosi, CNR, Italy

Publicity Chairs
    Dilek Hakkani-Tür, ICSI, USA
    Isabel Trancoso, INESC -ID/IST, Portugal

Publication Chair
    Giuseppe di Fabbrizio, AT&T, USA

Local Chair
    Maurizio Omologo, FBK-irst, Italy 
 
 
 
 
 
 

4-3 . (2009-12-14) 6th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications MAVEBA 2009

Università degli Studi di Firenze, Italy
Department of Electronics and Telecommunications
6th International Workshop
on
Models and Analysis of Vocal Emissions for Biomedical
Applications
MAVEBA 2009
December 14 - 16, 2009
Firenze, Italy
Speech is the primary means of communication among humans, and results from
a complex interaction between vocal fold vibration at the larynx and voluntary movements
of the articulators (i.e. mouth, tongue, jaw, etc.). However, only recently has research
focussed on biomedical applications. Since 1999, the MAVEBA Workshop has been
organised every two years, aiming to stimulate contacts between specialists active in
clinical, research and industrial developments in the area of voice signal and image
analysis for biomedical applications. This sixth Workshop will offer the participants
an interdisciplinary platform for presenting and discussing new knowledge in the field
of models, analysis and classification of voice signals and images, covering
adult, singing and children's voices. Modelling the normal and
pathological voice source and analysing healthy and pathological voices are among the
main fields of research. The aim is to extract the main voice characteristics,
together with their deviation from “healthy conditions”, ranging from fundamental
research to all kinds of biomedical applications and related established and advanced
technologies.
SCIENTIFIC PROGRAM
linear and non-linear models of voice signals;
physical and mechanical models;
aids for disabled;
measurement devices (signal and image);
prostheses;
robust techniques for voice and glottal analysis in time, frequency, cepstral, wavelet domain;
neural networks, artificial intelligence and other advanced methods for pathology classification;
linguistic and clinical phonetics;
new-born infant cry analysis;
neurological dysfunction;
multiparametric/multimodal analysis;
imaging techniques (laryngography, videokymography, fMRI);
voice enhancement;
protocols and database design;
industrial applications in the biomedical field;
singing voice;
speech/hearing interactions;
DEADLINES
30 May 2009: Submission of extended abstracts (1-2 pages, 1 column) / special session proposal
30 July 2009: Notification of paper acceptance
30 September 2009: Final full paper submission (4 pages, 2 columns, PDF format) and early registration
14-16 December 2009: Conference
SPONSORS
ENTE CRF: Ente Cassa di Risparmio di Firenze
IEEE EMBS: IEEE Engineering in Medicine and Biology Society
ELSEVIER Eds.: Biomedical Signal Processing and Control
ISCA: International Speech Communication Association
A.I.I.M.B.: Associazione Italiana di Ingegneria Medica e Biologica
COST Action 2103: Europ. COop. in Science & Tech. Research
FURTHER INFORMATION
Claudia Manfredi – Conference Chair
Department of Electronics and
Telecommunications
Via S. Marta 3, 50139 Firenze, Italy
Phone: +39-055-4796410
Fax: +39-055-494569
Piero Bruscaglioni
Department of Physics
Polo Scientifico Sesto Fiorentino, 50019
Firenze, Italy
Phone: +39-055-4572038
Fax: +39-055-4572356

4-4 . (2010-05-03) Workshop on Spoken Languages Technologies for Under-Resourced Languages (SLTU'10)

Workshop on Spoken Languages Technologies for Under-Resourced Languages 
(SLTU’10)
 
*The second International Workshop on Spoken Languages Technologies for 
Under-resourced languages (SLTU’10) will be held at Universiti Sains Malaysia 
(USM), Penang, Malaysia, May 3 to May 5, 2010.* Workshop 
supported by ISCA, AFCP and CNRS. 
 
The first workshop on Spoken Languages Technologies for Under-Resourced 
Languages was organized in Hanoi, Vietnam, in 2008 by Multimedia, 
Information, Communication and Applications (MICA) research center in 
Vietnam and /Laboratoire d’Informatique de Grenoble/ (LIG) in France. 
This first workshop gathered 40 participants over two days.
 
For 2010, we intend to attract more participants, especially from the
surrounding region (Malaysia, Indonesia, Singapore, Thailand,
Australia, ...). The workshop will take place at USM in Penang,
Malaysia. The SLTU research workshop will focus on spoken language
processing for under-resourced languages and aims at gathering
researchers working on:
 
   * ASR, synthesis and translation for under-resourced languages 
   * portability issues 
   * multilingual spoken language processing 
   * fast resource acquisition (speech, text, lexicons, parallel corpora)
   * spoken language processing for languages with rich morphology
   * spoken language processing for languages without separators
   * spoken language processing for languages without a writing system
   * NLP for rare or endangered languages 
   * … 
 
*Important dates* 
 
* Paper submission: December 15, 2009 
 
* Notification of Paper Acceptance: February 15, 2010 
 
* Author Registration Deadline: March 1, 2010 
 
*Workshop Web site* 
 
* * 
 
 
 
 
*Workshop Chairs* 
 
Laurent Besacier 
 
Eric Castelli 
 
Dr. Chan Huah Yong 

4-5 . (2010-05-19) CfP LREC 2010 - 7th Conference on Language Resources and Evaluation

LREC 2010 - 7th Conference on Language Resources and Evaluation
 
FIRST ANNOUNCEMENT AND CALL FOR PAPERS
 
MEDITERRANEAN CONFERENCE CENTRE, VALLETTA - MALTA
 
MAIN CONFERENCE: 19-20-21 MAY 2010
WORKSHOPS and TUTORIALS: 17-18 MAY and 22-23 MAY 2010
 
 
 
The seventh international conference on Language Resources and Evaluation (LREC) will be organised in 2010 by ELRA in cooperation with a wide range of international associations and organisations.
 
 
CONFERENCE AIMS
 
In 12 years – the first LREC was held in Granada in 1998 – LREC has become the major event on Language Resources (LRs) and Evaluation for Human Language Technologies (HLT). The aim of LREC is to provide an overview of the state-of-the-art, explore new R&D directions and emerging trends, exchange information regarding LRs and their applications, evaluation methodologies and tools, ongoing and planned activities, industrial uses and needs, requirements coming from the e-society, both with respect to policy issues and to technological and organisational ones. 
 
LREC provides a unique forum for researchers, industry representatives and funding agencies from across a wide spectrum of areas to discuss problems and opportunities, find new synergies and promote initiatives for international cooperation, in support of investigations in language sciences, progress in language technologies and development of corresponding products, services and applications, and standards.
 
 
Special Highlight: Contribute to building the LREC2010 Map!
 
LREC2010 recognises that the time is ripe to launch an important initiative, the LREC2010 Map of Language Resources, Technologies and Evaluation. The Map will be a collective enterprise of the LREC community, as a first step towards the creation of a very broad, community-built, Open Resource Infrastructure. As the first in a series, it will become an essential instrument to monitor the field and to identify shifts in the production, use and evaluation of LRs and LTs over the years.
 
When submitting a paper, from the START page you will be asked to fill in a very simple template to provide essential information about resources (in a broad sense that includes technologies, standards, evaluation kits, etc.) that either have been used for the work described in the paper or are a new result of your research. 
 
The Map will be disclosed at LREC, where some event(s) will be organised around this initiative. 
 
 
CONFERENCE TOPICS
 
Issues in the design, construction and use of Language Resources (LRs): text, speech, other associated media and modalities
•    Guidelines, standards, specifications, models and best practices for LRs
•    Methodologies and tools for LRs construction and annotation
•    Methodologies and tools for the extraction and acquisition of knowledge
•    Ontologies and knowledge representation
•    Terminology 
•    Integration between (multilingual) LRs, ontologies and Semantic Web technologies
•    Metadata descriptions of LRs and metadata for semantic/content markup
•    Validation, quality assurance, evaluation of LRs
Exploitation of LRs in different types of systems and applications 
•    For: information extraction, information retrieval, speech dictation, mobile communication, machine translation, summarisation, semantic search, text mining, inferencing, reasoning, etc.
•    In different types of interfaces: (speech-based) dialogue systems, natural language and multimodal/multisensorial interactions, voice activated services, cognitive systems, etc.
•    Communication with neighbouring fields of applications, e.g. e-government, e-culture, e-health, e-participation, mobile applications, etc. 
•    Industrial LRs requirements, user needs
Issues in Human Language Technologies evaluation
•    HLT Evaluation methodologies, protocols and measures
•    Benchmarking of systems and products
•    Usability evaluation of HLT-based user interfaces (speech-based, text-based, multimodal-based, etc.), interactions and dialogue systems
•    Usability and user satisfaction evaluation
General issues regarding LRs & Evaluation
•    National and international activities and projects
•    Priorities, perspectives, strategies in national and international policies for LRs
•    Open architectures 
•    Organisational, economic and legal issues
 
 
PROGRAMME
 
The Scientific Programme will include invited talks, oral presentations, poster and demo presentations, and panels. 
There is no difference in quality between oral and poster presentations. Only the appropriateness of the type of communication (more or less interactive) to the content of the paper will be considered.
 
 
SUBMISSIONS AND DATES
 
Submitted abstracts of papers for oral and poster or demo presentations should consist of about 1500-2000 words.
•    Submission of proposals for oral and poster/demo papers: 31 October 2009 
 
Proposals for panels, workshops and tutorials will be reviewed by the Programme Committee.
•    Submission of proposals for panels, workshops and tutorials: 31 October 2009
 
 
PROCEEDINGS
 
The Proceedings on CD will include both oral and poster papers, in the same format. They will be added to the ELRA web archives before the conference.
A Book of Abstracts will be printed.
 
 
CONFERENCE PROGRAMME COMMITTEE 
Nicoletta Calzolari, Istituto di Linguistica Computazionale del CNR - Pisa, Italy (Conference chair)
Khalid Choukri - ELRA, Paris, France
Bente Maegaard - CST, University of Copenhagen, Denmark
Joseph Mariani - LIMSI-CNRS and IMMI, Orsay, France
Jan Odijk - UIL-OTS, Utrecht, The Netherlands 
Stelios Piperidis - Institute for Language and Speech Processing (ILSP), Athens, Greece
Mike Rosner – Department of Intelligent Computer Systems, University of Malta, Malta
Daniel Tapias - Sigma Technologies S.L., Madrid, Spain

4-6 . (2010-05-25) CfP JEP 2010

JEP 2010
         XXVIIIèmes Journées d'Étude sur la Parole

                    Université de Mons, Belgium

                         May 25-28, 2010

                        http://w3.umh.ac.be/jep2010

=====================================================================

The Journées d'Études de la Parole (JEP) are devoted to the study of spoken communication and its applications. The meeting aims to bring together all the French-speaking scientific communities working in the field. The conference is also intended as a friendly forum for exchange between doctoral students and established researchers.

In 2010, the JEP are organized by the Laboratoire des Sciences de la Parole of the Académie Wallonie-Bruxelles, on the campus of the Université de Mons in Belgium, under the aegis of the AFCP (Association Francophone de la Communication Parlée) with the support of ISCA (International Speech Communication Association).
A second call for papers, specifying the topics and the submission procedure, will follow this first call.



CALENDAR
===========
Submission deadline:          January 11, 2010
Notification to authors:      March 15, 2010
Conference:                   May 25-28, 2010
 
 
 
 
V. Delvaux
Chargée de Recherches FNRS
Laboratoire de Phonétique
Service de Métrologie et Sciences du Langage
Université de Mons-Hainaut
18, Place du Parc
7000 Mons
Belgium
+3265373140
 

5 . Books, databases and software

 

5-1 . Books

 
This section lists recent books whose titles have been communicated by the authors or editors.

Some advertisements for recent books on speech are also included.

The book presentations are written by the authors, not by the newsletter editor or any volunteer reviewer.
 

5-1-1 . Advances in Digital Speech Transmission

Advances in Digital Speech Transmission
Editors: Rainer Martin, Ulrich Heute and Christiane Antweiler
Publisher: Wiley & Sons
Year: 2008

5-1-2 . Sprachverarbeitung -- Grundlagen und Methoden der Sprachsynthese und Spracherkennung

Title: Sprachverarbeitung -- Grundlagen und Methoden 
       der Sprachsynthese und Spracherkennung 
Authors: Beat Pfister, Tobias Kaufmann 
Publisher: Springer 
Year: 2008 

5-1-3 . Digital Speech Transmission

Digital Speech Transmission
Authors: Peter Vary and Rainer Martin
Publisher: Wiley & Sons
Year: 2006

5-1-4 . Distant Speech Recognition,

Distant Speech Recognition, Matthias Wölfel and John McDonough (2009), J. Wiley & Sons.
 
 Website: http://www.distant-speech-recognition.com 
 
In the very recent past, automatic speech recognition (ASR) systems have attained acceptable performance when used with speech captured with a head-mounted or close-talking microphone (CTM). The performance of conventional ASR systems, however, degrades dramatically as soon as the microphone is moved away from the mouth of the speaker. This degradation is due to a broad variety of effects that are not found in CTM speech, including background noise, overlapping speech from other speakers, and reverberation. While conventional ASR systems underperform for speech captured with far-field sensors, there are a number of techniques developed in other areas of signal processing that can mitigate the deleterious effects of noise and reverberation and can separate speech from overlapping speakers. Distant Speech Recognition presents a contemporary and comprehensive description of both theoretic abstraction and practical issues inherent in the distant ASR problem.
Back to Top

5-1-5 . Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods

Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods
Joseph Keshet and Samy Bengio, Editors
John Wiley & Sons
March, 2009
 
About the book:
This is the first book dedicated to uniting research related to speech and speaker recognition based on the recent advances in large margin and kernel methods. The first part of the book presents theoretical and practical foundations of large margin and kernel methods, from support vector machines to large margin methods for structured learning. The second part of the book is dedicated to acoustic modeling of continuous speech recognizers, where the grounds for practical large margin sequence learning are set. The third part introduces large margin methods for discriminative language modeling. The last part of the book is dedicated to the application of keyword-spotting, speaker verification and spectral clustering. 
Contributors: Yasemin Altun, Francis Bach, Samy Bengio, Dan Chazan, Koby Crammer, Mark Gales, Yves Grandvalet, David Grangier, Michael I. Jordan, Joseph Keshet, Johnny Mariéthoz, Lawrence Saul, Brian Roark, Fei Sha, Shai Shalev-Shwartz, Yoram Singer, and Nathan Srebro. 
 
 
 
Back to Top

5-1-6 . Some aspects of Speech and the Brain.

Some aspects of Speech and the Brain. 
Susanne Fuchs, Hélène Loevenbruck, Daniel Pape, Pascal Perrier
Editions Peter Lang, January 2009
 
What happens in the brain when humans are producing speech or when they are listening to it? This is the main focus of the book, which includes a collection of 13 articles, written by researchers at some of the foremost European laboratories in the fields of linguistics, phonetics, psychology, cognitive sciences and neurosciences.
 
Back to Top

5-1-7 . Spoken Language Processing,

Spoken Language Processing, edited by Joseph Mariani (IMMI and
LIMSI-CNRS, France). ISBN: 9781848210318. January 2009. Hardback 504 pp

Publisher ISTE-Wiley

Speech processing addresses various scientific and technological areas. It includes speech analysis and variable rate coding, in order to store or transmit speech. It also covers speech synthesis, especially from text, speech recognition, including speaker and language identification, and spoken language understanding. This book covers the following topics: how to realize speech production and perception systems, how to synthesize and understand speech using state-of-the-art methods in signal processing, pattern recognition, stochastic modeling, computational linguistics and human factor studies. 


More on its content can be found at
http://www.iste.co.uk/index.php?f=a&ACTION=View&id=150

Back to Top

5-2 . Database providers

 
Back to Top

5-2-1 . LDC News

 

 
 
In this newsletter:
 
  • LDC2009T27
  • LDC2009S03
  • LDC2009T23
 
 
 

 

New Publications


(1) Chinese Gigaword Fourth Edition is a comprehensive archive of newswire text data that has been acquired over several years by the LDC. This edition includes all of the contents in Chinese Gigaword Third Edition (LDC2007T38) as well as newly collected data. In addition, four entirely new sources have been added in the fourth edition: Central News Service, Guangming Daily, People's Liberation Army Daily, and People's Daily.

The eight distinct international sources of Chinese newswire included in this edition are the following:

  • Agence France Presse 
  • Central News Agency, Taiwan
  • Central News Service
  • Guangming Daily
  • People's Daily
  • People's Liberation Army Daily
  • Xinhua News Agency
  • Zaobao Newspaper

The original data received by the LDC from AFP, People's Liberation Army Daily, Xinhua, and Zaobao were encoded in GB-2312, those from CNA were in Big-5, and those from GMW, CNS, and People's Daily were in a combination of GB-2312 and GB-18030. To avoid the problems and confusion that could result from differences in character-set specifications, all text files in this corpus have been converted to UTF-8 character encoding.
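The kind of conversion described above can be sketched in a few lines of Python. This is only an illustration, not the procedure LDC actually used: the function name `to_utf8` and the sample string are assumptions, and the codec names are those of the Python standard library.

```python
# Minimal sketch: re-encode text from a legacy Chinese character set to UTF-8.
# `to_utf8` is a hypothetical helper, not part of any LDC tool.

def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from the given legacy encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")

# The sample text uses only characters available in all three character sets.
sample = "中央社"
for encoding in ("gb2312", "big5", "gb18030"):
    legacy_bytes = sample.encode(encoding)
    converted = to_utf8(legacy_bytes, encoding)
    print(encoding, converted.decode("utf-8") == sample)
```

Once every source is decoded with its own codec, the resulting UTF-8 files can be processed uniformly regardless of which agency the text came from.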

New in the Fourth Edition:

  • Two years' worth of new articles (January 2007 through December 2008) have been added to the Xinhua, Agence France Presse, and CNA data sets.
  • Four new data sources have been added - Guangming Daily, Central News Service, People's Daily, and People's Liberation Army Daily, covering a timespan from November 2006 through December 2008.

Chinese Gigaword Fourth Edition is distributed on 1 DVD-ROM.

2009 Subscription Members will automatically receive two copies of this corpus.  2009 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$5000.

*


(2)  CSLU: S4X Release 1.2 was created by the Center for Spoken Language Understanding, Oregon Health and Science University (CSLU). The corpus consists of 36 speakers (22 male, 14 female) uttering 11 specified words.  The speakers repeated the following words six times on each of four channels: startrek, supernova, tektronix, generation, nebula, processing, singularity, 71523, abracadabra, sungeeta and computer. The four channels used were office phone, home phone, carbon microphone telephone and speaker phone. Each speech file has a corresponding time-aligned phoneme-level transcription (achieved using automatic forced alignment) and an automatically generated word-level transcription.  Humans reviewed each utterance in two passes and classified it as good, bad, noisy or different. 

The data was recorded with the CSLU T1 digital data collection system. Each utterance is recorded as a separate file. These files were sampled at 8 kHz 8-bit and stored as ulaw files. All of the data use the RIFF standard file format. This file format is 16-bit linearly encoded.
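For readers unfamiliar with ulaw (mu-law) storage, the sketch below shows how one 8-bit mu-law byte expands to a linear PCM sample, following the standard G.711 expansion rule. It is a generic illustration and not part of the corpus distribution; the function names are assumptions.

```python
# Minimal sketch of G.711 mu-law expansion (8-bit code -> linear sample).
# Function names are hypothetical; this is not a tool shipped with the corpus.

BIAS = 0x84  # bias added during mu-law compression (132)

def ulaw_to_linear(code: int) -> int:
    """Expand one mu-law byte (0..255) to a signed linear sample."""
    code = ~code & 0xFF              # mu-law bytes are stored bit-inverted
    sign = code & 0x80               # top bit: sign
    exponent = (code >> 4) & 0x07    # next three bits: segment number
    mantissa = code & 0x0F           # low four bits: position within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude

def decode_ulaw(data: bytes) -> list:
    """Expand a whole buffer of mu-law bytes."""
    return [ulaw_to_linear(b) for b in data]

print(decode_ulaw(bytes([0xFF, 0x80, 0x00])))
```

The expanded values fit in 16 bits, which is why mu-law audio is commonly converted to 16-bit linear samples before further processing.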

CSLU: S4X Release 1.2 is distributed on one CD-ROM.

2009 Subscription Members will automatically receive two copies of this corpus, provided that they have submitted a signed copy of the LDC User Agreement for CSLU Corpora. 2009 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$150.

 

(3)  FactBank 1.0 consists of 208 documents (over 77,000 tokens) from newswire and broadcast news reports in which event mentions are annotated with their degree of factuality, that is, the degree to which they correspond to those events. FactBank 1.0 was built on top of TimeBank 1.2 and a fragment of the AQUAINT TimeML Corpus, both of which used the TimeML specification language. This resulted in a double-layered annotation of event factuality. TimeBank 1.2 and AQUAINT TimeML encode most of the basic structural elements expressing factuality information while FactBank 1.0 represents the resulting factuality interpretation. The combination of the factuality values in FactBank with the structural information in TimeML-annotated corpora facilitates the development of tools aimed at automatically identifying the factuality values of events, a component fundamental in tasks requiring some degree of text understanding, such as Textual Entailment, Question Answering, or Narrative Understanding.

FactBank annotations indicate whether the event mention describes actual situations in the world, situations that have not happened, or situations of uncertain interpretation. Event factuality is not an inherent feature of events but a matter of perspective. Different discourse participants may present divergent views about the factuality of the very same event. Consequently, in FactBank, the factuality degree of events is assigned relative to the relevant sources at play. In this way, it can adequately reflect the divergence of opinions regarding the factual status of events, as is common in news reports.

All FactBank markup is standoff and is represented through a set of 20 tables which can be easily loaded into a database. Each table resides in an independent text file, where fields are separated by three consecutive bars (i.e., |||). The data in fields of string type are presented between simple quotations (').  Because FactBank 1.0 was built on top of TimeBank 1.2 and AQUAINT TimeML, both of which are marked up with inline XML-based annotation, this release contains the TimeBank 1.2 and AQUAINT TimeML annotation in standoff, table-based format as well.
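Loading such a table can be sketched as follows. This is only a sketch under stated assumptions: the sample row and its column layout are invented for illustration, `parse_row` is a hypothetical helper, and the handling of quotation marks inside string fields is not covered because the description above does not specify it.

```python
# Minimal sketch: split one line of a '|||'-separated table into fields.
# The sample row below is invented; real FactBank columns may differ.

def parse_row(line: str) -> list:
    """Split a line on '|||', unquote string fields, and convert integer fields."""
    values = []
    for field in line.rstrip("\n").split("|||"):
        if len(field) >= 2 and field[0] == "'" and field[-1] == "'":
            values.append(field[1:-1])       # string field: drop the enclosing quotes
        elif field.lstrip("-").isdigit():
            values.append(int(field))        # integer field
        else:
            values.append(field)             # anything else is kept as-is
    return values

row = parse_row("'wsj_0001.tml'|||3|||'e12'|||'said'\n")
print(row)
```

Rows parsed this way can be inserted into a database table one file at a time, since each table resides in its own text file.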

FactBank 1.0 is distributed via web download.

2009 Subscription Members will automatically receive two copies of this collection on disc. 2009 Standard Members may request a copy as part of their 16 free membership corpora.  Non-members may license this data by completing the LDC User Agreement for Non-members.  The agreement can be faxed to +1 215 573 2175 or scanned and emailed to this address.  The collection is being made available at no charge.


LDC’s Free Resources

LDC is pleased to distribute FactBank 1.0 which is available at no cost.  To license a copy of this data, non-members should complete the LDC User Agreement for Non-members and fax to +1 215 573 2175 or scan and email to this address. FactBank joins a host of LDC resources which are available for free.  These resources include tools and corpora developed at LDC as well as corpora made available through LDC's strong network of data providers.   

Since LDC's founding, we have distributed over 1300 copies of corpora at no cost including:

  • over 700 non-member downloads of Buckwalter Arabic Morphological Analyzer 1.0
  • 400 copies of Talkbank-sponsored data including popular releases such as the American National Corpus and the Santa Barbara Corpora of Spoken American English
  • nearly 200 copies of Web 1T 5-gram Version 1, sponsored by Google Inc.
  • over 30 copies of TimeBank 1.2
  • over a dozen copies of the corpora developed for the Unified Linguistic Annotation (ULA) project

For further information, visit our What's New! What's Free! Archive.

 

Release of XTrans

At InterSpeech 2009, LDC introduced XTrans, a new tool for manual transcription and annotation of audio recordings.  XTrans is a next generation transcription tool that is designed to support transcription tasks in multiple languages on multiple platforms.   XTrans provides a flexible and intuitive graphical user interface for a multitude of speech annotation tasks including (virtual) segmentation of audio into smaller units like turns and sentences; speaker identification; orthographic transcription in any language; and labeling of structural elements of the transcript like topics.  Its versatile and powerful waveform display/playback component can load multiple audio files of different file formats and sampling rates at the same time. LDC and its partners have used XTrans to generate over 3500 hours of time-aligned verbatim transcripts in a variety of genres and languages. 

With an intuitive interface, user configurability and embedded QC functions, XTrans is optimized for high-quality, high-volume transcription tasks involving real world data. XTrans successfully addresses the challenges of real world data including transcribing multiple speakers in a single channel through Virtual Speaker Channel, which enables an unlimited number of distinct speakers to be associated with the same audio channel.  Furthermore, XTrans allows transcribers to open an effectively unlimited number of audio files for simultaneous transcription. Transcribers can switch focus between one, two or multiple speakers as needed.  XTrans also provides strong multilingual support, with bidirectional text input for languages like Arabic, Farsi, Urdu, and Hebrew.

Realtime transcription rates have improved dramatically in LDC projects using XTrans, with rates for some tasks cut by as much as half.   XTrans also brings key quality control functions directly into the interface, giving transcribers the power to improve the quality of their own work.  XTrans components are written in Python and C++, utilizing LDC's QWave waveform display module. Even with very large files or multiple recordings, XTrans provides users with fast display and playback capabilities.  A range of audio formats is supported, including .sph, .wav, .aiff, .flac, and .ogg. Transcripts are output in a Tab Delimited Format (TDF), which is easily converted to other common formats and is readily usable by downstream manual and automatic annotation tasks.

Availability:

XTrans for Linux and Windows platforms is available from the LDC at no cost under GPLv3 and can be downloaded here.

 
 
 
Back to Top

5-2-2 . ELDA/ELRA press release

Press Release - Immediate
Paris, France, September 3rd, 2009

Distribution Agreement signed for BioLexicon

ELRA together with the European Bioinformatics Institute (EBI, Hinxton, UK), Istituto di Linguistica Computazionale-Consiglio Nazionale Ricerche (ILC-CNR, Pisa, Italy), and the National Centre for Text Mining (NaCTeM, University of Manchester, UK) has signed a Language Resources distribution agreement for a large-scale English language terminological resource in the biomedical domain: BioLexicon.

Biological terminology is a frequent cause of analysis errors when processing literature written in the biology domain, due largely to the high degree of variation in term forms, to the frequent mismatches between labels of controlled vocabularies and ontologies on the one hand and the forms actually occurring in text on the other, and to the lack of detailed formal information on the linguistic behaviour of domain terms. For example, "retro-regulate" is a terminological verb often used in molecular biology but it is not included in conventional dictionaries. BioLexicon is a linguistic resource for the biology domain, tailored to cope with these problems. It contains information on:
    - terminological nouns, including nominalised verbs and proper names (e.g., gene names)
    - terminological adjectives
    - terminological adverbs
    - terminological verbs
    - general English words frequently used in the biology domain

Existing information on terms was integrated, augmented, complemented and linked, through processing of massive amounts of biomedical text, to yield inter alia over 2.2M entries, and information on over 1.8M variants and on over 2M synonymy relations. Moreover, extensive information is provided on how verbs and nominalised verbs in the domain behave at both syntactic and semantic levels, thus supporting applications that aim to discover relations and events involving biological entities in text.

This comprehensive coverage of biological terms makes BioLexicon a unique linguistic resource within the domain. It is primarily intended to support text mining and information retrieval in the biomedical domain; however, its standards-based structure and rich content make it a valuable resource for many other kinds of application.

On behalf of ELRA, ELDA will act as the distribution agency, by incorporating the BioLexicon in the ELRA Language Resources catalogue.

With these resources, ELRA aims to extend the current catalogue by offering specialized resources, thus allowing better coverage of the language.

For more information on BioLexicon (catalogue reference: ELRA-S0373): http://catalog.elra.info/product_info.php?products_id=1113

For more information on the ELRA catalogue, please contact:
Valérie Mapelli, mapelli@elda.org

For more information on ELRA & ELDA, please contact:
Khalid Choukri, choukri@elda.org
Hélène Mazo, mazo@elda.org

ELDA
55-57, rue Brillat Savarin
75013 Paris (France)

Tel.: +33 1 43 13 33 33
Fax: +33 1 43 13 33 30

Back to Top

5-2-3 . MEDAR project/ELDA

 The goal of the MEDAR project, supported by the European Commission ICT programme, is to establish a network of partner centres of best practice in Arabic dedicated to promoting Arabic HLT (Human Language Technologies).

Within this framework, we are working on a global directory of players, experts, projects and Language resources related to Arabic Human Language Technology. If you have not already answered the 1st survey and would like to be part of the community, please take a few minutes to complete the survey: http://survey.elda.org/index.php?sid=15471&lang=en.
The collected information will be made available through a Knowledge Base.

About MEDAR: http://www.medar.info

Back to Top

5-2-4 . ELRA - Language Resources Catalogue - Update

*****************************************************************
ELRA - Language Resources Catalogue - Update
*****************************************************************

ELRA is happy to announce that 1 new Terminology Database, 3 Speech Desktop/Microphone resources and 1 Speech Telephone resource are now available in its catalogue:
*ELRA-T0373 BioLexicon*
BioLexicon is a large-scale English terminological resource which has been developed to address the needs emerging in text mining efforts in the biomedical domain. It contains over 2.2M lexical entries (over 3.3M semantic relations), and information on over 1.8M variants and on over 2M synonymy relations. BioLexicon is available in a relational database format (MySQL dump format) and it adheres to the EAGLES/ISO standards for lexical resources.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1113

*ELRA-S0301 Norwegian EUROM1 (EUROM1_N)*
EUROM1 is the first truly multilingual speech database produced in Europe. Over 60 speakers per language pronounced numbers, sentences and isolated words using a close-talking microphone.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1114

*ELRA-S0302 TC-STAR female baseline voice: Laura*
Laura contains the recordings of one female English (British) speaker recorded in a noise-reduced room through a headset microphone. It consists of the recordings and annotations of read text material of approximately 10 hours of speech for baseline applications (Text-to-Speech systems).
For more information, see: http://catalog.elra.info/product_info.php?products_id=1115

*ELRA-S0303 TC-STAR male baseline voice: Ian*
Ian contains the recordings of one male English (British) speaker recorded in a noise-reduced room through a headset microphone. It consists of the recordings and annotations of read text material of approximately 10 hours of speech for baseline applications (Text-to-Speech systems).
For more information, see: http://catalog.elra.info/product_info.php?products_id=1116

*ELRA-S0304 SpeechDat(M) Italian Mobile Network Speech Database*
This speech database contains the recordings of 342 Italian speakers recorded over the Italian mobile telephone network. Each speaker uttered around 40 read and spontaneous items.
For more information, see: http://catalog.elra.info/product_info.php?products_id=1117


For more information on the catalogue, please contact Valérie Mapelli (mapelli@elda.org)

Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/LRs-Announcements.html   

Back to Top

5-2-5 . The ELRA Catalogue of Language Resources at OLAC

 The ELRA Catalogue of Language Resources at OLAC

In the framework of its ongoing collaborative work with the OLAC community and in order to improve its compliance with OLAC standards, ELRA is pleased to announce that the OLAC archive of its Catalogue of Language Resources has been updated. According to OLAC standards, the ELRA Catalogue archive achieves the maximum rating of 5 stars on the OLAC scale.
 
The ELRA Catalogue has been exported into OLAC standards by means of associating the ELRA metadata for describing Language Resources with the OLAC metadata. Daily updates are performed through the automatic harvesting of the ELRA Catalogue which is carried out by the OLAC server.

With this work, ELRA aims at rendering its catalogue more visible to the wider community and it contributes to sharing information on Language Resources through standardized metadata sets.

To visit the ELRA Catalogue: http://catalogue.elra.info

and the corresponding OLAC archive: http://www.language-archives.org/archive/catalogue.elra.info

 

*** About ELRA ***
The European Language Resources Association (ELRA) is a non-profit making organisation founded by the European Commission in 1995, with the mission of providing a clearing house for language resources and promoting Human Language Technologies (HLT).

To find out more about ELRA, please visit our web site:
http://www.elra.info

*** About OLAC ***
OLAC, the Open Language Archives Community, is an international partnership of institutions and individuals who are creating a worldwide virtual library of language resources by: (i) developing consensus on best current practice for the digital archiving of language resources, and (ii) developing a network of interoperating repositories and services for housing and accessing such resources.

 

To find out more about OLAC, please visit the following website: http://www.language-archives.org. From that website, ELRA resources can be searched, alongside resources from 40 other language archives.


Back to Top

6 . Job openings

We invite all laboratories and industrial companies which have job offers to send them to the ISCApad editor: they will appear in the newsletter and on our website for free. (also have a look at http://www.isca-speech.org/jobs.html as well as http://www.elsnet.org/ Jobs). 

The ads will be automatically removed from ISCApad after 6 months. Informing the ISCApad editor when positions are filled will avoid unnecessary mail between applicants and proposers.


Back to Top

6-1 . (2009-05-04) Several Ph.D. positions and Ph.D. or Postdoc scholarships, Universität Bielefeld

 Several Ph.D. Positions and Ph.D. or Postdoc Scholarships, Universität Bielefeld

 
Applications are invited for several Ph.D. positions and Ph.D. scholarships in experimental phonetics, speech technology and laboratory phonology at Universität Bielefeld (Fakultät für Linguistik und Literaturwissenschaft), Germany.

 

Successful candidates should hold a Master's degree (or equivalent) in phonetics, computational linguistics, linguistics, computer science or a related discipline. They will have a strong background in either

-       speech synthesis and/or recognition

-       discourse prosody

-       laboratory phonology

-       speech and language rhythm research

-       multimodal speech (technology)

 

Candidates should appreciate working in an interdisciplinary environment. Good knowledge of experimental design techniques and programming skills will be considered a plus. Strong interest in research and high proficiency in English are required.

 

The Ph.D. positions will be part-time (50%); salary and social benefits are determined by the German public service pay scale (TVL-E13). The Ph.D. scholarship is based on the DFG scale. There is no mandatory teaching load.

 

Bielefeld University is an equal opportunity employer. Women are therefore particularly encouraged to apply. Disabled applicants with equivalent qualification will be treated preferentially.

 

The positions are available for three years (with a potential extension for the Ph.D. positions), starting as soon as possible. Please submit your documents (cover letter, CV including list of publications, statement of research interests, names of two referees) electronically to the address indicated below. Applications must be received by June 15, 2009.
 
Universität Bielefeld
Fakultät für Linguistik und Literaturwissenschaft
Prof. Dr. Petra Wagner
Postfach 10 01 31
33 501 Bielefeld
Germany
 
 
 
 
 
 
Back to Top

6-2 . (2009-05-07) PhD POSITION in MACHINE TRANSLATION AND SPEECH UNDERSTANDING (FRANCE)

=============================================================================
PhD POSITION in MACHINE TRANSLATION AND SPEECH UNDERSTANDING (starting
09/09)
=============================================================================

PORT-MEDIA (ANR CONTINT 2008-2011) is a cooperative project
sponsored by the French National Research Agency, between the University
of Avignon, the University of Grenoble, the University of Le Mans, CNRS
at Nancy and ELRA (European Language Resources Association).  PORT-MEDIA
will address the multi-domain and multi-lingual robustness and
portability of spoken language understanding systems. More specifically,
the overall objectives of the project can be summarized as:
- robustness: integration/coupling of the automatic speech recognition
component in the spoken language understanding process.
- portability across domains and languages: evaluation of the genericity
and adaptability of the approaches implemented in the
understanding systems, and development of new techniques inspired by
machine translation approaches.
- representation: evaluation of new rich structures for high-level
semantic knowledge representation.

The PhD thesis will focus on the multilingual portability of speech
understanding systems. For example, the candidate will investigate
techniques to quickly adapt an understanding system from one language to
another and to create low-cost resources with (semi-)automatic methods,
for instance by using automatic alignment techniques and lightly
supervised translations. The main contribution will be to bridge the gap
between the techniques currently used in the statistical machine
translation and spoken language understanding fields.

The thesis will be co-supervised by Fabrice Lefèvre, Assistant Professor
at LIA (University of Avignon) and Laurent Besacier, Assistant Professor
at LIG (University of Grenoble). The candidate will spend 18 months at
LIG then 18 months at LIA.

The salary of a PhD position is roughly 1,300€ net per month. Applicants
should hold a strong university degree entitling them to start a
doctorate (Masters/diploma or equivalent) in a relevant discipline
(Computer Science, Human Language Technology, Machine Learning, etc).
The applicants should be fluent in English. Competence in French is
optional, though applicants will be encouraged to acquire this skill
during training. All applicants should have very good programming skills.

For further information, please contact Fabrice Lefèvre (Fabrice.Lefevre
at univ-avignon.fr) AND Laurent Besacier (Laurent.Besacier at imag.fr).

====================================================================================
PhD topic in Machine Translation and Speech Understanding
(starting 09/09)
====================================================================================

The PORT-MEDIA project (ANR CONTINT 2008-2011) addresses the robustness
and the multi-domain and multilingual portability of spoken language
understanding systems. The partners are LIG, LIA, LORIA, LIUM and
ELRA (European Language Resources Association). More precisely, the
three main objectives of the project concern:
- the robustness and integration/coupling of the automatic speech
recognition component in the understanding process.
- portability to a new domain or language: evaluation of the levels of
genericity and adaptability of the approaches implemented in
understanding systems.
- the use of high-level semantic representations for language-based
interaction.

This thesis topic mainly concerns the multilingual portability of the
different components of an automatic understanding system; the idea is
to use, for example, automatic alignment and translation techniques to
quickly adapt an understanding system from one language to another,
creating low-cost resources automatically or semi-automatically. The
central idea is to bring machine translation and speech understanding
techniques closer together.

The thesis is co-supervised by two laboratories (Fabrice Lefèvre, LIA
and Laurent Besacier, LIG). The first 18 months will be spent at LIG,
the following 18 at LIA.

The salary for a PhD student is about 1,300€ net per month. We are
looking for students holding a research Master's degree (or equivalent)
in Computer Science, with skills in the following areas: written and/or
spoken language processing, machine learning...

For further information or to apply, please contact Fabrice Lefèvre
(Fabrice.Lefevre at univ-avignon.fr) AND Laurent Besacier
(Laurent.Besacier at imag.fr).

-------------------------- 

Back to Top

6-3 . (2009-05-07) Several Ph.D. Positions and Ph.D. or Postdoc Scholarships, Universität Bielefeld

 
Several Ph.D. Positions and Ph.D. or Postdoc Scholarships, Universität Bielefeld
 
Applications are invited for several Ph.D. positions and Ph.D. scholarships in experimental phonetics, speech technology and laboratory phonology at Universität Bielefeld (Fakultät für Linguistik und Literaturwissenschaft), Germany.

 

Successful candidates should hold a Master's degree (or equivalent) in phonetics, computational linguistics, linguistics, computer science or a related discipline. They will have a strong background in either

-       speech synthesis and/or recognition

-       discourse prosody

-       laboratory phonology

-       speech and language rhythm research

-       multimodal speech (technology)

 

Candidates should appreciate working in an interdisciplinary environment. Good knowledge of experimental design techniques and programming skills will be considered a plus. Strong interest in research and high proficiency in English are required.

 

The Ph.D. positions will be part-time (50%); salary and social benefits are determined by the German public service pay scale (TVL-E13). The Ph.D. scholarship is based on the DFG scale. There is no mandatory teaching load.

 

Bielefeld University is an equal opportunity employer. Women are therefore particularly encouraged to apply. Disabled applicants with equivalent qualification will be treated preferentially.

 

The positions are available for three years (with a potential extension for the Ph.D. positions), starting as soon as possible. Please submit your documents (cover letter, CV including list of publications, statement of research interests, names of two referees) electronically to the address indicated below. Applications must be received by June 15, 2009.
 
Universität Bielefeld
Fakultät für Linguistik und Literaturwissenschaft
Prof. Dr. Petra Wagner
Postfach 10 01 31
33 501 Bielefeld
Germany
 
 
 
Back to Top

6-4 . (2009-05-08) PhD POSITION in MACHINE TRANSLATION AND SPEECH UNDERSTANDING (starting 09/09)

PhD POSITION in MACHINE TRANSLATION AND SPEECH UNDERSTANDING (starting 09/09)
=============================================================================

PORT-MEDIA (ANR CONTINT 2008-2011) is a cooperative project sponsored by the French National Research Agency, between the University of Avignon, the University of Grenoble, the University of Le Mans, CNRS at Nancy and ELRA (European Language Resources Association). PORT-MEDIA will address the multi-domain and multi-lingual robustness and portability of spoken language understanding systems. More specifically, the overall objectives of the project can be summarized as:
- robustness: integration/coupling of the automatic speech recognition component in the spoken language understanding process.
- portability across domains and languages: evaluation of the genericity and adaptability of the approaches implemented in the understanding systems, and development of new techniques inspired by machine translation approaches.
- representation: evaluation of new rich structures for high-level semantic knowledge representation.

The PhD thesis will focus on the multilingual portability of speech understanding systems. For example, the candidate will investigate techniques to quickly adapt an understanding system from one language to another and to create low-cost resources with (semi-)automatic methods, for instance by using automatic alignment techniques and lightly supervised translations. The main contribution will be to bridge the gap between the techniques currently used in the statistical machine translation and spoken language understanding fields.

The thesis will be co-supervised by Fabrice Lefèvre, Assistant Professor at LIA (University of Avignon) and Laurent Besacier, Assistant Professor at LIG (University of Grenoble). The candidate will spend 18 months at LIG then 18 months at LIA.

The salary of a PhD position is roughly 1,300€ net per month. Applicants should hold a strong university degree entitling them to start a doctorate (Masters/diploma or equivalent) in a relevant discipline (Computer Science, Human Language Technology, Machine Learning, etc). The applicants should be fluent in English. Competence in French is optional, though applicants will be encouraged to acquire this skill during training. All applicants should have very good programming skills.

For further information, please contact Fabrice Lefèvre (Fabrice.Lefevre at univ-avignon.fr) AND Laurent Besacier (Laurent.Besacier at imag.fr). 

Back to Top

6-5 . (2009-05-11) CIFRE PhD thesis on multimedia data indexing, Institut Eurecom


						
CIFRE PhD thesis on multimedia data indexing
Thesis
Deadline: 01/11/2009
The Multimedia Communications Department of EURECOM, in partnership with the travel service provider AMADEUS, invites applications for a PhD position on multimedia indexing. The goal of the thesis is to study new techniques to organize large quantities of multimedia information, specifically images and videos, for improving services to travelers. This includes managing images and videos from providers as well as from users about places, locations, events, etc. The approach will be based on the most recent techniques in multimedia indexing, and will benefit from the strong research experience of EURECOM in this domain, combined with the industrial experience of AMADEUS.
We are looking for very good and motivated students with strong knowledge of image and video processing and of statistical and probabilistic modeling for the theoretical part, and good C/C++ programming ability for the experimental part. English is required. The successful candidate will be employed by AMADEUS in Sophia Antipolis, and will interact closely with the researchers at EURECOM.
Applicants should email a resume, letter of motivation, and all relevant information to Prof. Bernard Merialdo.
The project will be conducted within AMADEUS (http://www.amadeus.com/), a world leader in the provision of solutions to the travel industry to manage the distribution and selling of travel services. The company is the leading Global Distribution System (GDS) and the biggest processor of travel bookings in the world. Their main development center is located in Sophia Antipolis, France, and employs more than 1200 engineers. The research will be supervised by EURECOM (http://www.eurecom.fr), a graduate school and research center in communication systems, whose activity includes corporate, multimedia and mobile communications. EURECOM currently counts about 20 professors, 10 post-docs, 170 MS and 60 PhD students, and is involved in many European research projects and joint collaborations with industry. EURECOM is also located in Sophia-Antipolis, a major European technology park for telecommunications research and development on the French Riviera.
 


					
Back to Top

6-6 . (2009-05-11)Senior Research Fellowship in Speech Perception and Language Development,MARCS Auditory Laboratories

Ref 147/09 Senior Research Fellowship in Speech Perception and Language Development, MARCS Auditory Laboratories, Australia

 

MARCS Auditory Laboratories is a multi-disciplinary research centre involved in research in auditory perception and cognition, particularly in the fields of speech and language, music, sound and action, and hearing and auditory processes.

MARCS is seeking a Senior Research Fellow with a background in psychology/behavioural science and specialisation in some or all of the following: speech perception, speech science, experimental phonetics, infant and child perception studies, speech production studies (e.g. with OPTOTRAK), psychophysics/psychoacoustics, cross-language studies; and experience in sophisticated methods of data analysis.
 
As this position is likely to involve working with children, 'Prohibited Persons' are not permitted to apply. The successful applicant will be required to authorise a screening check.

5 Year Fixed Term Contract, Bankstown Campus

Remuneration Package: Academic Level C $107,853 to $123,724 p.a. (comprising Salary $91,266 to $104,831 p.a., 17% Superannuation, and Leave Loading)

Position Enquiries: Professor Denis Burnham, (02) 9772 6677 or email d.burnham@uws.edu.au

Closing Date: The closing date for this position has been extended until 30 June 2009. 

 
 
Back to Top

6-7 . (2009-06-02) PhD thesis proposal 2009: Speech scene analysis, Grenoble, France

PhD thesis topic proposal 2009
Doctoral school EDISCE (http://www-sante.ujf-grenoble.fr/edisce/)
ANR funding (http://www.icp.inpg.fr/~schwartz/Multistap/Multistap.html)

Speech scene analysis: the audio-visuo-motor binding problem in the light of behavioural and neurophysiological data


Two important questions run through current research on the cognitive processing of speech: multisensoriality (how auditory and visual information are combined in the brain) and perceptuo-motor interactions.

In our view, a missing question is that of binding: in these auditory or audiovisual processing stages, how does the brain manage to "put together" the relevant information, discard the "noise", and build the relevant "speech streams" before a decision is made? More precisely, the elementary objects of the speech scene are phonemes, and specialised auditory, visual and articulatory modules contribute to the phonetic identification process, but it has so far not been possible to isolate their respective contributions or the way in which these contributions are fused. Recent experiments suggest that the phonetic identification process is non-hierarchical in nature and essentially instantiated by associative operations. The thesis will consist in developing further original experimental paradigms, and also in setting up neurophysiology and neuroimaging experiments (EEG, fMRI) available in the laboratory and in its Grenoble environment, in order to determine the nature and functioning of the audiovisual grouping processes of speech scenes, in relation to the mechanisms of production.

This thesis will be carried out within the ANR project "Multistap" (Multistability and perceptual grouping in audition and speech, http://www.icp.inpg.fr/~schwartz/Multistap/Multistap.html). The project will provide both the funding for the thesis grant and a stimulating environment for the research, in partnership with teams of specialists in audition and vision in Paris (DEC ENS), Lyon (LNSCC) and Toulouse (Cerco).

Supervisors
Jean-Luc Schwartz (DR CNRS, HDR) : 04 76 57 47 12,
Frédéric Berthommier (CR CNRS) : 04 76 57 48 28
Jean-Luc.Schwartz, Frederic.Berthommier@gipsa-lab.grenoble-inp.fr

Back to Top

6-8 . (2009-06-10) PhD in ASR in Le Mans France

PhD position in Automatic Speech Recognition
=====================================

Starting in September-October 2009.

The ASH (Attelage de Systèmes Hétérogènes) project is a project funded by the ANR (French National Research Agency). Three French academic laboratories are involved: LIUM (University of Le Mans), LIA (University of Avignon) and IRISA (Rennes).

The main objective of the ASH project is to define and experiment with an original methodological framework for the integration of heterogeneous automatic speech recognition systems. Integrating heterogeneous systems, and hence heterogeneous sources of knowledge, is a key issue in ASR but also in many other application fields concerned with knowledge integration and multimodality.

Clearly, the lack of a generic framework to integrate systems operating with different viewpoints, different types of knowledge and at different levels is a strong limitation that needs to be overcome: the definition of such a framework is the fundamental challenge of this work.

By defining a rigorous and generic framework to integrate systems, significant scientific progress is expected in automatic speech recognition. Another objective of this project is to enable the efficient and reliable processing of large data streams by combining systems on the fly.
Finally, we expect to develop an on-the-fly ASR system as a real-time demonstrator of this new approach.

The thesis will be co-supervised by Paul Deléglise, Professor at LIUM, Yannick Estève, Assistant Professor at LIUM and Georges Linarès, Assistant Professor at LIA. The candidate will work at Le Mans (LIUM), but will regularly spend a few days in Avignon (LIA).

Applicants should hold a strong university degree entitling them to start a doctorate (Masters/diploma or equivalent) in a relevant discipline (Computer Science, Human Language Technology, Machine Learning, etc).

The applicants for this PhD position should be fluent in English or in French. Competence in French is optional, though applicants will be encouraged to acquire this skill during training. This position is funded by the ANR.

Strong software skills are required, especially Unix/Linux, C, Java, and a scripting language such as Perl or Python.

Contacts:
Yannick Estève: yannick.esteve@lium.univ-lemans.fr
Georges Linarès: georges.linares@univ-avignon.fr 

Back to Top

6-9 . (2009-06-17)Two post-docs in the collaboration between CMU (USA) and University-Portugal program

Two post-doctoral positions in the framework of the Carnegie Mellon
University-Portugal program are available at the Spoken Language Systems
Lab (www.l2f.inesc-id.pt), INESC-ID, Lisbon, Portugal.
Positions are for a fixed term contract of length up to two and a half
years, renewable in one year intervals, in the scope of the research
projects PT-STAR (Speech Translation Advanced Research to and from
Portuguese) and REAP.PT (Computer Aided Language Learning – Reading
Practice), both financed by FCT (Portuguese Foundation for Science and
Technology).
The starting date for these positions is September 2009, or as soon as
possible thereafter.
Candidates should send their CVs (in .pdf format) before July 15th, to
the email addresses given below, together with a motivation letter.
Questions or other clarification requests should be emailed to the same
addresses.
======== PT-STAR (project CMU-PT/HuMach/0039/2008) ========
Topic: Speech-to-Speech Machine Translation
Description: We seek candidates with excellent knowledge in statistical
approaches to machine translation (and if possible also speech
technologies) and strong programming skills. Familiarity with the
Portuguese language is not at all mandatory, although the main source
and target languages are Portuguese/English.
Email address for applications: lcoheur at l2f dot inesc-id dot pt
======== REAP.PT (project CMU-PT/HuMach/0053/2008) ========
Topic: Computer Aided Language Learning
Description: We seek candidates with excellent knowledge in automatic
question generation (multiple-choice synonym questions, related word
questions, and cloze questions) and/or measuring the reading difficulty
of a text (exploring the combination of lexical features, grammatical
features and statistical models). Familiarity with a Romance language is
recommended, since the target language is Portuguese.
Email address for applications: nuno dot mamede at inesc-id dot pt

					
Back to Top

6-10 . (2009-06-19) POSTDOC POSITION in SPEECH RECOGNITION FOR UNDER-RESOURCED LANGUAGES

POSTDOC POSITION in SPEECH RECOGNITION FOR UNDER-RESOURCED LANGUAGES (18 months ; starting January 2010 or later) IN GRENOBLE (France)

=============================================================================

PI (ANR BLANC 2009-2012) is a cooperative project sponsored by the French National Research Agency, between the University of Grenoble (France), the University of Avignon (France), and the International Research Center MICA in Hanoï (Vietnam).


PI addresses spoken language processing (notably speech recognition) for under-resourced languages (or π-languages). From a scientific point of view, the interest and originality of this project consists in proposing viable innovative methods that go far beyond the simple retraining or adaptation of acoustic and linguistic models. From an operational point of view, this project aims at providing a free open source ASR development kit for π-languages. We plan to distribute and evaluate such a development kit by deploying ASR systems for new under-resourced languages with very poor resources from Asia (Khmer, Lao) and Africa (Bantu languages).

The POSTDOC position focuses on the development of ASR for two low-resourced languages from Asia and Africa. This includes supervising the resource collection (in relation with the language partners), proposing innovative methods to quickly develop ASR systems for these languages, evaluation, etc.

The salary of the POSTDOC position is roughly 2300€ net per month. Applicants should hold a PhD related to spoken language processing. The applicants should be fluent in English. Competence in French is optional, though applicants will be encouraged to acquire this skill during the postdoc.

For further information, please contact Laurent Besacier (Laurent.Besacier at imag.fr).

Back to Top

6-11 . (2009-06-22) PhD studentship in speech and machine learning ESPCI ParisTech

Thesis topic : Vocal Prosthesis Based on Machine Learning
The objective of the thesis is to design and implement a vocal prosthesis to restore the original voice of persons who have lost the ability to speak due to a partial or total laryngectomy or a neurological problem. Using a miniature ultrasound machine and a video camera to drive a speech synthesizer, the device is intended to restore the original voice of these patients with as much fidelity as possible, allowing speech handicapped individuals to interact with those around them in a more natural and familiar way. The thesis work will build upon promising results obtained in the Ouisper project (funded on contract number ANR-06-BLAN-0166, http://www.neurones.espci.fr/ouisper/index.htm, and also supported by the French Defense Department, DGA), which terminates at the end of 2009; however, final success will require addressing the following four key technological issues:
1) New data acquisition protocol: the current acquisition system requires the user’s head to be immobilized during speech. The candidate will need to design and implement an innovative new system to overcome this constraint, which would be unacceptable for a real world application.
2) New dictionaries: results obtained thus far show that a truly open domain vocabulary may not be realistic. The candidate will create new dictionaries of vocabularies which are constrained, yet rich enough to be of genuine utility for verbal communication in the targeted speech handicapped community.
3) New synthesis methods: concatenative synthesis, though conceptually simple, is not sufficiently flexible when the initial recognition step contains errors. The candidate will devise new synthesis methods which better model the spectral qualities of the speaker’s voice, perhaps using Bayesian networks. Innovative techniques of recovering an acceptable prosody for the synthesized speech will also need to be developed.
4) Real time execution: as the amount of calculation necessary to carry out the recognition and synthesis steps is significant, the candidate will need to pay particular attention to optimization of code and real time execution in the algorithms he or she develops.
The thesis will be carried out in partnership with the Laboratoire de Phonétique et de Phonologie of the Université de Paris III, specializing in speech production and pathologies, for which additional funding has been obtained from the Agence Nationale de la Recherche (Emergence-TEC 2009 call, REVOIX project).
Back to Top

6-12 . (2009-06-30)Postdoctoral Fellowships in machine learning/statistics/machine vision at Monash University, Australia


						

					
Back to Top

6-13 . (2009-06-30) PhD studentship at LIMSI France

Title:
Models of expressivity for the synthesis of short stories read by a humanoid robot.
Content:
While current speech synthesis systems are generally adequate for reading sentences in a
neutral manner, they quickly become tiresome to listen to, particularly for fairly long texts
(several paragraphs). Synthesis systems are hardly capable of making a narration
expressive. Likewise, the motor capabilities of humanoid robots are currently little
exploited and developed for expression through gesture and posture.
This research project concerns the expressive audiovisual synthesis of short stories. The
project has two main aspects. In an analysis phase, the task is to automatically process
texts, short stories of the "children's tales" type, in order to extract their
pragmatic, semantic, dialogic, narrative and emotional content.
In a second phase, this content will be used on the one hand for the synthesis of expressive
prosody, and on the other hand to feed a behavioural model in terms of postures, gestures and
other movements of the humanoid robot NAO.
Required skills:
This topic lies in the field of expressive human-machine interaction.
It requires having or acquiring skills in computational linguistics, for both written and
spoken language, and if possible also from the audiovisual point of view.
The project includes a significant share of programming for text analysis and
synthesis, but also a significant share of linguistic analysis (of texts), phonetic analysis
(of prosody) and behavioural analysis (posture and gestures).
Profiles in computer science, cognitive science or linguistics will therefore be considered.
Context and host team:
This thesis is part of the ANR contract GV-LEX.
It will take place at LIMSI-CNRS (www.limsi.fr) in the groups Audio & Acoustique,
Traitement du Langage Parlé, and Architecture et Modèles de l’Interaction.
The thesis will start in September and is funded by the ANR for a period of 3 years.
Supervision - contact:
The thesis will be supervised by Christophe d’Alessandro, CNRS research director.
Applications should be sent to the four researchers involved in this project:
Christophe d’Alessandro <cda@limsi.fr>
Jean-Claude Martin <martin@limsi.fr>
Sophie Rosset <sophie.rosset@limsi.fr>
Albert Rilliard <rilliard@limsi.fr>
Back to Top

6-14 . (2009-07-01) These: Vocal Prosthesis Based on Machine Learning (France)

  Vocal Prosthesis Based on Machine Learning(2)
Thesis
Deadline: 01/09/2009
We are looking for an excellent candidate for a PhD studentship in speech and statistical learning at the Laboratoire d'Electronique at ESPCI ParisTech, Paris, France. Interested candidates should contact Prof. B. Denby by email at denby@ieee.org before 1 September 2009 at the latest (earlier application is strongly encouraged).
Working language: French or English
Thesis topic : Vocal Prosthesis Based on Machine Learning
The objective of the thesis is to design and implement a vocal prosthesis to restore the original voice of persons who have lost the ability to speak due to a partial or total laryngectomy or a neurological problem. Using a miniature ultrasound machine and a video camera to drive a speech synthesizer, the device is intended to restore the original voice of these patients with as much fidelity as possible, allowing speech handicapped individuals to interact with those around them in a more natural and familiar way. The thesis work will build upon promising results obtained in the Ouisper project (funded on contract number ANR-06-BLAN-0166, http://www.neurones.espci.fr/ouisper/index.htm, and also supported by the French Defense Department, DGA), which terminates at the end of 2009; however, final success will require addressing the following four key technological issues:
1) New data acquisition protocol: the current acquisition system requires the user's head to be immobilized during speech. The candidate will need to design and implement an innovative new system to overcome this constraint, which would be unacceptable for a real world application.
2) New dictionaries: results obtained thus far show that a truly open domain vocabulary may not be realistic. The candidate will create new dictionaries of vocabularies which are constrained, yet rich enough to be of genuine utility for verbal communication in the targeted speech handicapped community.
3) New synthesis methods: concatenative synthesis, though conceptually simple, is not sufficiently flexible when the initial recognition step contains errors. The candidate will devise new synthesis methods which better model the spectral qualities of the speaker's voice, perhaps using Bayesian networks. Innovative techniques of recovering an acceptable prosody for the synthesized speech will also need to be developed.
4) Real time execution: as the amount of calculation necessary to carry out the recognition and synthesis steps is significant, the candidate will need to pay particular attention to optimization of code and real time execution in the algorithms he or she develops.
The thesis will be carried out in partnership with the Laboratoire de Phonétique et de Phonologie of the Université de Paris III, specializing in speech production and pathologies, for which additional funding has been obtained from the Agence Nationale de la Recherche (Emergence-TEC 2009 call, REVOIX project).
Back to Top

6-15 . (2009-07-06) PhD in SPEECH RECOGNITION FOR UNDER-RESOURCED LANGUAGES (Grenoble France)


POSTDOC POSITION in SPEECH RECOGNITION FOR UNDER-RESOURCED LANGUAGES (18 months ; starting January 2010 or later) IN GRENOBLE (France)
=============================================================================

PI (ANR BLANC 2009-2012) is a cooperative project sponsored by the French National Research Agency, between the University of Grenoble (France), the University of Avignon (France), and the International Research Center MICA in Hanoï (Vietnam).

PI addresses spoken language processing (notably speech recognition) for under-resourced languages (or π-languages). From a scientific point of view, the interest and originality of this project consists in proposing viable innovative methods that go far beyond the simple retraining or adaptation of acoustic and linguistic models. From an operational point of view, this project aims at providing a free open source ASR development kit for π-languages. We plan to distribute and evaluate such a development kit by deploying ASR systems for new under-resourced languages with very poor resources from Asia (Khmer, Lao) and Africa (Bantu languages).

The POSTDOC position focuses on the development of ASR for two low-resourced languages from Asia and Africa. This includes supervising the resource collection (in relation with the language partners), proposing innovative methods to quickly develop ASR systems for these languages, evaluation, etc.

The salary of the POSTDOC position is roughly 2300€ net per month. Applicants should hold a PhD related to spoken language processing. The applicants should be fluent in English. Competence in French is optional, though applicants will be encouraged to acquire this skill during the postdoc.

For further information, please contact Laurent Besacier (Laurent.Besacier at imag.fr).
Back to Top

6-16 . (2009-07-08) Position at Deutsche Telekom R&D

Deutsche Telekom, one of the world’s leading telecommunications and information
technology service providers, is expanding its corporate research and development
activities at Deutsche Telekom Inc., R&D Lab USA, Los Altos, California. Having a close
collaboration with top-notch institutions, the laboratories offer an unprecedented
combination of academic and industrial research with opportunities to have a direct
impact on the company’s products and services.
There is a current opening for a highly qualified Senior Research Scientist in the
research field New Media for the area of Multimedia Communications and Systems.
We are looking for a self-driven and motivated individual who is passionate about
conducting leading-edge research. Applicants should have recently completed a
doctoral degree in computer science, electrical engineering, or other related disciplines
and have expertise in different facets of multimedia communications such as media
coding, streaming, and compression, with hands on system building experience and
know-how of standards. Experience in industrial R&D will be valued.
Application material should include, in a single pdf file, the following in stated order, (a)
cover letter, (b) one-page statement of research objectives, (c) curriculum vitae, (d) list
of publications, and (e) contact information of at least three individuals who may serve
as references. Short-listed candidates will be invited to give a talk and have interviews
with members of the recruiting committee.
Please submit your application by 22 July 2009.
Deutsche Telekom Inc. is an equal opportunity employer.
Applications should be submitted via email to:
Dr. Jatinder Pal Singh
Deutsche Telekom Inc., R&D Lab USA
Back to Top

6-18 . (2009-07-17) 2 PhD in Computational linguistics in Radboud University Nijmegen NL

Two PhD students for Second Language Acquisition/Computational Linguistics (1,0 fte)

Faculty of Arts, Radboud University, Nijmegen
Vacancy number: 23.24.09
Closing date: 1 September 2009

Job description
As a PhD student you will take part in the larger research project ‘Corrective feedback and the acquisition of syntax in oral proficiency’. The goal of this research project is to investigate the essential role of corrective feedback in L2 learning of syntax in oral proficiency. It will proceed from a granular level investigating the short-term effects of different types of feedback moves on different types of learners, to a global level by studying whether the granular, short-term effects also generalize to actual learning in the long term. Corrective feedback will be provided through a CALL system that makes use of automatic speech recognition. This will make it possible to assess the learner’s oral production online and to provide corrective feedback immediately under near-optimal conditions.
As a PhD student you will study which feedback moves lead to immediate uptake and acquisition in learners with a high level of education (PhD1) or learners with a low level of education (PhD2).
You are expected to start in November 2009. You will be part of an international and interdisciplinary team and will work in a motivating research environment.
For more information, see: http://lands.let.ru.nl/~strik/research/ASOP.html; http://www.ru.nl/cls/.

Requirements
You must have:
- a Master’s degree in (Applied) Linguistics, Computational Linguistics, Computer Science, Psycholinguistics, Artificial Intelligence, Cognitive Science or Education;
- programming skills (e.g. Matlab, Perl);
- an interest in second language acquisition;
- a working knowledge of Dutch and a good command of the English language.

Organization
The Faculty of Arts consists of eleven departments in the fields of language and culture, history, history of arts, linguistics and business communication, which together cater for about 2,800 students and collaborate closely in teaching and research. The project will be carried out at the Centre for Language Studies as part of the Linguistic Information Processing and Communicative Competences research programmes.
Website: http://www.ru.nl/cls/

Conditions of employment
Employment: 1,0 fte

Additional conditions of employment
The total duration of the contract is 3.5 years. The PhD students will receive an initial contract for 18 months with possible extension by 2 years.
The starting gross salary is €2,042 per month based on full-time employment.

The short-listed applicants will be interviewed in September 2009.


Other Information
Please include:
- a copy of your university degree (in English or Dutch)
- a list of all your university marks (in English or Dutch)
- a motivation letter with details of research interests/experience, programming skills and knowledge of Linguistics and Psycholinguistics.

Additional Information
Prof. Roeland van Hout (r.vanhout@let.ru.nl)
Dr. Catia Cucchiarini (c.cucchiarini@let.ru.nl)
Dr. Helmer Strik (h.strik@let.ru.nl)

Application
You can apply for the job (mention the vacancy number 23.24.09) before 1 September 2009 by sending your application, preferably by email, to:

RU Nijmegen, Faculty of Arts, Personnel Department
P.O. Box 9103, 6500 HD, Nijmegen, The Netherlands
E-mail: vacatures@let.ru.nl

 

Back to Top

6-19 . (2009-07-24) Acoustic signal detection engineer, Oregon State University

Acoustic Signal Detection Engineer

OSU’s Cooperative Institute for Marine Resources Studies offers one year of support, with the possibility of additional support, for a researcher on a project studying passive acoustic monitoring of large whales under the direction of Dr. David Mellinger. This is a full-time (1.0 FTE), 12-month-per-year, fixed-term Faculty Research Assistant position. Individuals with a Ph.D. may be appointed as a Research Associate (Postdoc). To see full details and to apply, please see http://oregonstate.edu/jobs, then search for posting #0004462 (the leading zeros are required). For questions, please email Jessica.Waddell@oregonstate.edu or David.Mellinger@oregonstate.edu. For full consideration, apply by September 21, 2009. OSU is an Affirmative Action/Office of Equal Opportunity employer. 

Back to Top

6-20 . (2009-08-06) Postgraduate research positions at MARCS, Australia

MARCS Auditory Laboratories currently has 3 Postgraduate Research Awards available, offering a competitive tax-free living allowance of $30,427 per annum and a funded place in the doctoral program.
 
Projects Available:
 
Thinking Head - Performance
Supervisor:  Dr Garth Paine (ga.paine@uws.edu.au)
 
Thinking Head and Head-User Evaluation
Supervisor: Assoc Prof Kate Stevens (kj.stevens@uws.edu.au)
 
Sonification of Real-Time Data: Computational and Cognitive Approaches
Supervisor: Professor Roger Dean (roger.dean@uws.edu.au)
 
Thinking Head—Human-Human and Human-Head Interaction
Supervisor: Professor Chris Davis (chris.davis@uws.edu.au)
 
Learning Complex Temporal and Rhythmic Relations
Supervisor: Assoc Prof Kate Stevens (kj.stevens@uws.edu.au)
 
Tuning in to Native Speech and Perceiving Spoken Words
Supervisor: Professor Catherine Best (c.best@uws.edu.au)
 
Applications close 21 August 2009.  For further information visit the scholarship website –  www.uws.edu.au/research/scholarships.
 
MARCS Website - http://marcs.uws.edu.au/
Thinking Head Website - http://thinkinghead.edu.au//
 
 
Back to Top

6-21 . (2009-08-06) PhD position is available at the Queensland University of Technology, Brisbane, Australia.

PHD OPPORTUNITY:

A full-time 3 year PhD position is available at the Queensland University of Technology, Brisbane, Australia.

The position is within the Speech and Audio Research Lab, part of the Smart Systems Theme of the Faculty of Built Environment and Engineering, and the Information Securities Institute. The lab conducts world-class research and postgraduate training in a variety of speech and audio processing areas (speaker recognition, diarisation, speech detection, speech enhancement, multi-microphone speech technology, automatic language identification, keyword spotting).

Project title: Speaker Diarisation
Starting date: November/December 2009
Research fields: Speech and audio processing, pattern recognition, Bayesian theory, machine learning, biometrics and security.
Project Description:  
Large volumes of spoken audio are being recorded on a daily basis and audio archives of these recordings around the world are expanding rapidly. It is becoming increasingly important to be able to efficiently and automatically search, index and access information from these audio information sources. Speaker diarisation is an important, fundamental task in this process which aims to annotate the audio stream with speaker identities for each temporal region—determining “who spoke when.”

Current diarisation systems are susceptible to a number of impediments including wide variability in the acoustic characteristics of recordings from different sources, differences in the number of speakers present in a recording, the dominance of speakers, and the style and structure of the speech.  All can affect the diarisation performance dramatically.

The aim of this research is to develop a framework and methods for better exploiting the sources of prior information that are generally available in many applications with the view to making portable, robust speaker diarisation systems a reality.  Examples of relevant prior information include identities of participating speakers, models describing the characteristics of speakers in general or of a specific known speaker, models of the effects that recording conditions and domains have on acoustic features and knowledge of the recording domain.

Information for Applicants:
Applicants should hold a strong university degree which would entitle them to embark on a doctorate (Masters/diploma or equivalent) in a relevant discipline (computer science, mathematics, computer systems engineering, etc.). International students are encouraged to apply. The project is part of an ARC Linkage between QUT, a commercial partner, and two partner speech/audio processing laboratories at European universities. Opportunities for exchange/internships at these partner institutions exist. The opportunity also exists for a cotutelle PhD with the French partner university.

Information on Brisbane and Queensland University of Technology can be found at www.qut.edu.au

The salary of the PhD position is provided as a Linkage APAI scholarship ($26,669 in 2009, indexed annually) plus a top-up scholarship (approx. $5,000 p.a.). The salary is tax-exempt.
Funding is also available for conference/internship travel.

Interested students are encouraged to contact both the project leader Prof. Sridha Sridharan  (s.sridharan@qut.edu.au), and Dr Brendan Baker (bj.baker@qut.edu.au).  

Applicants are asked to provide:
- cover letter describing your interest in the project
- curriculum vitae indicating degrees obtained, disciplines covered (list of courses), publications, and other relevant experience.  
- sample of written work (research papers in English) is also desirable.  
- references along with contact details

As the start date is later this year, potential applicants are encouraged to contact the project coordinators as soon as possible to register their interest.

Deadline for applications: 10 September 2009 (international applicants should apply as soon as possible).

Back to Top

6-22 . (2009-08-26) PhD Positions at the University of Bielefeld, Germany

PhD Positions

The Applied Informatics Group, Faculty of Technology, Bielefeld University is looking for PhD candidates for grants and project positions in the following areas:

* Dialog modeling for human-robot interaction

* Speech signal modelling and analysis for speech recognition and synthesis

* Modeling and combining bottom-up with top-down attentional processes

We invite applications from motivated young scientists with a background in computer science, linguistics, psychology, robotics, mathematics, cognitive science or similar areas who are willing to contribute to the cross-disciplinary research agenda of our research group. Research and development are directed towards understanding the processes and functional constituents of cognitive interaction, and establishing cognitive interfaces and robots that facilitate the use of complex technical systems. Bielefeld University provides a unique environment for research in cognitive and intelligent systems by bringing together researchers from all over the world in a variety of relevant disciplines under the roof of central institutions such as the Excellence Center of Cognitive Interaction Technology (CITEC) or the Research Institute for Cognition and Robotics (CoR-Lab).

Successful candidates should hold an academic degree (MSc/Diploma) in a related discipline and have a strong interest in research and social robotics.

All applications should include: a short cover letter indicating the motivation and research interests of the candidate, a CV including a list of publications, and relevant certificates of academic qualification.

Bielefeld University is an equal opportunity employer. Women are especially encouraged to apply and, in the case of comparable competence and qualifications, will be given preference. Bielefeld University explicitly encourages disabled people to apply. Bielefeld University offers a family-friendly environment and special arrangements for child care and dual-career opportunities.

Please send your application, with reference to one of the three research areas offered, no later than 15 September 2009 to Ms Susanne Hoeke (shoeke@techfak.uni-bielefeld.de).

Contact:

Susanne Hoeke

AG Applied Informatics

Faculty of Technology

Universitaetsstr. 21-23

33615 Bielefeld

Germany

Email: shoeke@techfak.uni-bielefeld.de

Back to Top

6-23 . (2009-09-03) Post-doc at the Laboratoire d'Informatique de Grenoble, France

The LIG laboratory is offering a research topic
    for a post-doctoral researcher
    12-month fixed-term contract
    Grenoble, campus
    academic year 2009-2010

Research topic
--------------
Parallel Learning for Semantic Multimedia Indexing
Keywords: Machine Learning, Parallelism, Multimedia Indexing.

Context
-------
The position is offered as part of the APIMS project (Apprentissage
Parallèle pour l'Indexation Multimédia Sémantique, i.e. Parallel
Learning for Semantic Multimedia Indexing), supported by the MSTIC
division (pôle MSTIC) of Université Joseph Fourier.

The number of digital image and video documents has been growing
exponentially for many years, and this trend should continue for a
long time thanks to technological progress in this field. Concept-based
indexing of image and video documents is necessary to manage the
corresponding masses of data efficiently. Indeed, the keywords needed
for content-based search are not explicitly present in them, as they
are in text documents. Search by example or by so-called "low-level"
features also has serious limitations: the required examples are
generally not available, and low-level features are not easy for a
user to manipulate and interpret. Moreover, similarity at the level of
these features does not necessarily correspond to similarity at the
semantic level. Concept-based indexing is a major challenge because of
the "semantic gap" separating the raw content of these documents
(pixels, audio samples) from the concepts that are meaningful to a
user.

Significant progress has been made in recent years, notably within
the TRECVID evaluation campaigns [1]. These annual campaigns, organized
by the US National Institute of Standards and Technology (NIST),
provide large amounts of data, well-defined tasks, "ground truth",
metrics and associated evaluation tools. They contribute greatly to
federating research on content-based indexing and retrieval of video
documents.
The best-performing methods at present are statistical methods based
on supervised learning from manually annotated examples. So-called
low-level features are extracted from the raw audio or image signal
(color histograms or Gabor transforms, for example) and are then
passed to classifiers trained on positive and negative examples of the
concepts to be recognized. To obtain good results, it is necessary to
use many different features and to combine them with appropriate
fusion techniques. A further gain is obtained by exploiting the
relations between concepts, such as statistical relations
(co-occurrences) or logical relations (generic-specific, for
example).

Since the general principles are the same, the approaches differ in
the choice of features, in the classification and/or fusion tools, and
in the way context is taken into account. The quality and quantity of
the positive and negative examples used also make an important
difference. The current state of the art is the joint extraction of
several hundred concepts defined in the LSCOM ontology [2]. However,
despite the very substantial efforts of a large number of teams (more
than 40 teams took part in the task of concept extraction from video
shots in the TRECVID 2008 campaign), the average precision of the best
systems does not exceed 20%.

The MRIM team at LIG has developed methods and tools for the automatic
extraction of concepts from video shots and obtained results slightly
above average in the TRECVID 2005 to 2007 campaigns [3]. The objective
of this project is to improve these methods substantially so that they
match, or even define, the state of the art in the field. This
requires, on the one hand, optimizing them by taking all the important
factors into account and, on the other hand, adding a number of
innovations such as the use of intermediate-level concepts, the
combination of generic and specific methods, and active learning to
improve the quantity and quality of the annotation used to train the
systems.

One of the limiting factors is the computing power required. The
systems must be trained and evaluated on several hundred concepts and
on several tens of thousands of images or video shots. This must also
be done while studying multiple combinations of low- and
intermediate-level features, classification methods and fusion
methods. For this purpose we plan to use the resources of the GRID 5000
project [4] in order to study the combined influence of these factors
on a large scale. In its simple version, the problem parallelizes
fairly easily (the training and evaluation of one concept can be run
on one processor), but when one wants to use context, that is, the
statistical or ontological relations between concepts, the different
processes have to cooperate and this becomes a real parallel
programming problem. The MESCAL team at LIG has extensive expertise in
this area and will take part in the design and implementation of the
parallel versions of the concept extraction methods.

Using the multimodality naturally present in video documents is also
essential for the performance of concept-based indexing systems. The
GETALP team at LIG has expertise in audio and speech signal processing
and will take part in the definition and optimization of low- and
intermediate-level features for indexing concepts from the audio
track. Likewise, the GPIG team at GIPSA-Lab has expertise in the
analysis and indexing of motion in video documents and will take part
in the definition and optimization of low- and intermediate-level
features for indexing concepts from motion in the image track.

References
[1] Smeaton, A. F., Over, P., and Kraaij, W. TRECVID: evaluating
    the effectiveness of information retrieval tasks on digital video.
    In Proceedings of the 12th Annual ACM International Conference on
    Multimedia, New York, NY, USA, October 10-16, 2004.
[2] M. Naphade, J.R. Smith, J. Tesic, S.-F. Chang, W. Hsu,
    L. Kennedy, A. Hauptmann and J. Curtis, Large-Scale Concept
    Ontology for Multimedia, IEEE Multimedia 13(3), pp. 86-91, 2006.
[3] Stéphane Ayache, Georges Quénot and Jérôme Gensel, CLIPS-LSR
    Experiments at TRECVID 2006, TRECVID’2006 Workshop, Gaithersburg,
    MD, USA, November 13-14, 2006.
[4] Bolze, R. et al., Grid'5000: a large scale and highly
    reconfigurable experimental Grid testbed, International Journal of
    High Performance Computing Applications, 20(4), pp. 481-494, 2006.

Job description
---------------
The first part of the work will consist of implementing parallel
versions of the classification methods developed in the MRIM team and
using these parallel versions to jointly optimize the various
components involved (feature sets, classification operators and fusion
operators). This optimization must be carried out as systematically as
possible. Given its highly combinatorial nature and its computational
cost (even on a parallel architecture), appropriate heuristic methods
will have to be studied and implemented in order to obtain the best
result within a given time.

In the second part, integrated approaches will have to be implemented
for the simultaneous recognition of several hundred concepts, taking
the correlations between them into account from the earliest stages of
learning.

This work will, as far as possible, be scheduled around the TRECVID
evaluations on concept detection in video shots. The experiments
generally take place during the summer (July-August), and the campaigns
run from February to November of the current year. The objective is to
be able to evaluate, in the 2009 and 2010 campaigns, what is to be
developed in the first and second parts described above.

Type of position and location
-----------------------------
12-month fixed-term contract at the LIG laboratory.

Integration into the MRIM team at LIG (research on multimedia
information retrieval and recommender systems) and collaboration with
the GETALP and MESCAL teams at LIG and the GPIG team of the GIPSA
laboratory.

Location: Grenoble, Saint Martin d’Hères campus.
Salary: approximately 2,000 euros net per month.

Required education and skills
-----------------------------
Required profile
o Substantial experience and recognized skills in software design and
  development.
o Knowledge of and significant experience with the C or C++ language.
o PhD in one of the following fields: parallel programming,
  information retrieval systems, machine learning, statistical data
  processing, image processing.

Additional skills of interest for the position
o Experience in optimizing the performance of algorithms.
o Experience of teamwork.

Application deadline
--------------------
Applications may be submitted until 30 September 2009.
The position is to be filled in early November, or December 2009 at the latest.
The position will be awarded as soon as an application has been selected.

Contact
-------
Georges Quénot – Laboratoire LIG – Equipe MRIM – http://mrim.imag.fr/georges.quenot
Address: Bâtiment B – 385 avenue de la Bibliothèque – 38400 Saint Martin d’Hères
E-mail: Georges.Quenot@imag.fr – Tel.: 04 76 63 58 55

Back to Top

6-24 . (2009-09-23) Position of Professor in Phonology - GIPSA, Grenoble, France

Profile of position PR0035 "Phonology - general and experimental phonetics", which will be opened for competitive recruitment in spring 2010.

*Phonology - general and experimental phonetics*
• *Teaching:*
• degree programmes concerned:
LMD (Licence-Master-Doctorat) track of the Language Sciences programme
• teaching objectives and supervision needs:
The professor recruited will teach the phonetics and phonology courses offered by the UFR des Sciences du Langage:
– in the Licence (Bachelor's) curriculum: speech development, articulatory and acoustic phonetics, prosody, linear and multilinear phonologies, experimental phonetics;
– in the Master's in Language Sciences, specialization "Linguistics, sociolinguistics and language acquisition", Research track, in particular the phonology courses.
The professor recruited will be expected to take charge of these courses, integrating the contributions of work on phonological approaches (feature geometry, element theory, laboratory phonology, prosodic phonology, tonology, cognitive phonology, optimality theory, etc.) and on the diversity of the sound realizations of the world's languages, in particular languages with an oral tradition.
The professor will be expected to take part in directing and supervising research Master's theses and in training young researchers in phonology, typology and field linguistics (data collection, corpora, etc.), experimental phonetics and general phonetics.
• *Research:*
Research will be carried out within the Speech and Cognition department (Département Parole et Cognition, DPC) of UMR 5216 GIPSA-lab. The DPC conducts multidisciplinary research on speech and language, drawing in particular on four areas of expertise: signal processing, physics, cognition and language sciences. In our approach to the analysis of natural languages, the phonetic, phonological, lexico-semantic and prosodic aspects, including the temporal (diachronic study) and spatial (geolinguistics) perspectives, are more specifically the object of study of the DPC's Systèmes Linguistiques et Dialectologie (Linguistic Systems and Dialectology) team.
The team's research themes concern the emergence of linguistic phenomena, particularly in language-contact situations, the description and documentation of linguistic systems with an oral tradition, and the study of prosody and communicative functions. This research is part of the laboratory's 2011-2014 four-year contract and has for several years been the subject of national and international collaborations (notably with Latin America) supported by, among others, the ANR and the European Community.
The research topics of the professor recruited should fit within these themes, exploring more specifically the variability and the temporal and spatial dynamics of linguistic systems at the segmental and suprasegmental levels. The professor will build on the achievements and experience of field linguistics, corpus linguistics, experimental phonetics and laboratory phonology, and may be called upon to propose elements of cognitive and computational modelling. The person recruited is expected to have proven experience in leading a team and to be actively involved in setting up national and international collaborative projects. He or she will of course be expected to work collaboratively across teams within GIPSA-lab.
• *Host laboratory:*
Laboratoire Grenoble, Image, Parole, Signal, Automatique (GIPSA-lab), UMR 5216, website: http://www.gipsa-lab.inpg.fr/
• *Contacts:*
– Teaching:
o Marinette.Matthey@u-grenoble3.fr for the Master's level
o Francoise.Boch@u-grenoble3.fr for the Licence (Bachelor's) level
– Research:
o Gerard.Bailly@gipsa-lab.grenoble-inp.fr

Back to Top

6-25 . (2009-09-28) Researcher in expressive speech synthesis DFKI, Kaiserslautern, Germany

Job offer: Researcher in expressive speech synthesis

The German Research Center for Artificial Intelligence (DFKI GmbH), with sites in Kaiserslautern, Saarbrücken, Bremen and Berlin, is the leading German research institute in the field of innovative software technology.

DFKI's Language Technology Lab is looking for a Researcher to work in either Saarbrücken or Berlin in the DFG-funded project PAVOQUE (PArametrisation of prosody and VOice QUality for concatenative speech synthesis in view of Emotion expression). The contract should start on 1 November 2009 or 1 December 2009, and is limited to the project duration of one year.

Main Tasks

  • Develop and extend speech synthesis technologies in the speech synthesis system MARY TTS, in view of the realisation of prosody and voice quality modifications in unit selection;

  • Develop and apply algorithms to annotate prosody and voice quality in expressive speech synthesis corpora;

  • Carry out a listener evaluation study of expressive synthetic speech.

Profile

The ideal candidate holds a PhD, or is close to finishing a PhD, in a relevant topic area such as speech signal processing or computer science. The candidate must have demonstrable experience with programming algorithms of unit selection synthesis, speech signal processing and/or voice conversion, and should have experience with Java programming. Knowledge in the area of planning, carrying out and evaluating perception tests would be a plus. Highly valued personal qualities include creativity, open-mindedness, team spirit, and a willingness to address novel challenges. Fluency in English as a working language is required.

For more information about MARY TTS, see http://mary.dfki.de.

Contact for Questions

Dr. Marc Schröder
Tel. +49-681-302-5303
http://dfki.de/~schroed
marc.schroeder@dfki.de

Closing date: 20 October 2009

Please send your electronic application with all usual documents to: lt-jobs@dfki.de

  
Back to Top

6-26 . (2009-10-01) PhD at the National Centre for Biometric Studies, Univ. Canberra, Australia

The Faculty of Information Sciences and Engineering of the University of Canberra is offering a top-up stipend of $7,000 per annum for a student undertaking a PhD thesis in the National Centre for Biometric Studies. The project is related to the Thinking Head Project (www.canberra.edu.au/faculties/ise/ncbs/thinkinghead) and will be in one of the following research areas:

* Automatic Speech Recognition (ASR) or Audio-Video Speech Recognition (AVSR)
* Speaker Recognition / Verification / Authentication
* Face Recognition or Facial Feature Tracking
* Speaker Characterisation or Facial Expression Recognition
* Affective Computing / Affective Sensing
* Multimodal Human-Computer Interaction (MM-HCI)
* Pattern Recognition / Multimodal Fusion Algorithms

The stipend is available for up to 3 years either to an Australian-resident student having gained an APA place or to an international student having gained a scholarship for international students in the Faculty of ISE commencing in 2010 (http://www.canberra.edu.au/research-students/scholarships/).

Further information from Prof. Michael Wagner (michael.wagner@canberra.edu.au) or Dr Roland Goecke (roland.goecke@canberra.edu.au).
Back to Top

6-27 . (2009-10-06) ASSISTANT PROFESSOR, Department of Linguistics, University of Washington.

ASSISTANT PROFESSOR, Department of Linguistics, University of Washington. A tenure-track appointment is intended in the area of computational linguistics beginning September 2010 associated with the professional MA program and PhD track in Computational Linguistics. University of Washington faculty engage in teaching, research and service; the successful applicant will teach graduate and undergraduate courses, supervise student research, and develop a high-impact research program. This position is full-time (100% FTE), with a 9-month service period.

Applicants should have a Ph.D. degree in Linguistics, Computer Science, or related field and be highly qualified for undergraduate and graduate teaching and independent research. All qualified candidates are encouraged to apply. However, we are particularly interested in scholars active in the areas of machine learning, speech technology, computational semantics and dialogue systems.

The ideal candidate will complement and build on existing strengths within the department, and will be eager to interact with students and faculty from the broader linguistics and language processing community at the University of Washington.

The University of Washington is an affirmative action, equal opportunity employer. The University is building a culturally diverse faculty and staff and strongly encourages applications from women, minorities, individuals with disabilities and covered veterans. The University of Washington, a recipient of the 2006 Alfred P. Sloan award for Faculty Career Flexibility, is committed to supporting the work-life balance of its faculty.

Applications, including a curriculum vitae, statement of research and teaching interests, and three letters of recommendation, should be sent to Prof. Emily M. Bender, Chair, Computational Linguistics Search Committee, Department of Linguistics, University of Washington, Box 354340, Seattle, WA 98195-4340.

Questions regarding the position can be directed to ebender -at- uw.edu. Priority will be given to applications received before November 20, 2009. Please include your email address.
Back to Top

6-28 . (2009-10-06) The Information Technology Support (ITS) Unit at the Directorate General for Translation at the European Parliament is offering a paid 5-month traineeship programme (Schuman Traineeships - general option) in the areas of Language Technology Research and Development and Communication.


The Information Technology Support (ITS) Unit at the Directorate General for Translation at the European Parliament is offering a paid 5-month traineeship programme (Schuman Traineeships - general option) in the areas of Language Technology Research and Development and Communication.
 
Entity and location: Information Technology Support (ITS) Unit of the Directorate General for Translation at the European Parliament in Luxembourg.
 
Requirements:
 
The ITS Unit is looking for candidates for a traineeship in 2 different teams:
 
1) Research and Development Team 
 
a) Computational Linguist or Research Engineer with an interest in NLP and translation technologies.
b) Translation or Communication Graduate with an interest in translation technologies.
 
2) Communication Team
 
a) Graduates with a degree in IT, Web publishing or equivalent
b) Graduates with a degree in Journalism, Communication or equivalent.
 
Detailed profile descriptions (4) are attached to this e-mail.
 
General requirements:
 
1) Be a national of a Member State of the European Union or of an applicant country (derogation possible).
2) Have a thorough knowledge of one of the official languages of the European Union and a good knowledge of a second (English or French).
3) Not have been awarded any other paid traineeship, or have been in paid employment for more than 4 consecutive weeks, with a European Institution or a Member or political group of the European Parliament.
4) Have obtained, before the deadline for applications, a university degree after a course of study of at least three years’ duration.
5) Submit a written reference from a university lecturer or from a professional person who is able to give an objective assessment of the applicant’s aptitudes.
6) Have produced a substantial written paper, as part of the requirements for a university degree or for a scientific journal.
 
You can find all rules governing traineeships on:   http://www.europarl.europa.eu/pdf/traineeships/general_rules_en.pdf
 
Application procedure:
 
Applications for traineeships starting on 1 March 2010 are accepted until 15 October 2009 (midnight).
 
More information, the conditions for admission and the online application form can be found on this page:
 
 
To indicate in the application form that you would be interested in a traineeship in ITS, please fill in:
 
a) in point 6 "Other" of the application form in subpoint "Aim of traineeship" in "other" that you are interested in a traineeship at the Information Technology Support Unit at Directorate General for Translation
b) in point 6 "Areas of interest" select appropriate areas of interest (e.g. information technology, engineering/technology, multimedia, communications)
c) in point 6 "Department - preference": Directorate General for Translation
 

Back to Top

6-29 . (2009-10-07) Post-Docs at HLT Center of Excellence, Johns Hopkins University

Johns Hopkins University
Human Language Technology Center of Excellence
Post-Docs, Research Staff, Sabbaticals


The National Human Language Technology Center of Excellence (COE) at Johns Hopkins University is seeking to hire a few outstanding junior and senior researchers in the field of speech and natural language processing. Positions include research staff, sabbaticals and post-docs. The COE, located by Johns Hopkins's main campus in Baltimore, Maryland, conducts long-term research on fundamental challenges that are critical for real-world problems.

Candidates should have a strong background in one of the following areas:

SPEECH PROCESSING:
Robust speech recognition and information extraction (multiple languages, genres, and channels, limited resources)

NATURAL LANGUAGE PROCESSING:
Information extraction, knowledge distillation, machine translation, etc.

MACHINE LEARNING:
Large-scale learning, transfer-learning, semi-supervised, data-mining, etc.

Applicants must have a Ph.D. in CS, ECE, or a related field. Directions for applications can be found at: http://www.hltcoe.org/opportunities.html

Applicants for postdoctoral and junior research scientist positions should apply by January 4, 2010 for full consideration.

Note: Although researchers are expected to publish in open peer-reviewed venues, the position requires a security clearance. Security clearances require U.S. citizenship; the COE will seek a clearance for researchers without an existing clearance.

 
Back to Top

6-30 . (2009-10-08) Post-doc position in speech recognition/modeling at TTI-Chicago

A post-doc position is available at TTI-Chicago. It includes opportunities for work on articulatory modeling, graphical models, discriminative learning, large-scale data analysis, and multi-modal (e.g. audio-visual) modeling.

The post-doc will be mainly working with Karen Livescu, and will interact with collaborators Jeff Bilmes (U. Washington), Eric Fosler-Lussier (Ohio State U.), and Mark Hasegawa-Johnson (U. Illinois at Urbana-Champaign). To apply, or for additional information, please contact Karen Livescu at klivescu@uchicago.edu.

There is also an opportunity for a shorter-term post-doc project on annotation of speech at the articulatory level. Please contact klivescu@uchicago.edu for more details.
Back to Top

7 . Journals

 
Back to Top

7-1 . IEEE Special Issue on Speech Processing for Natural Interaction with Intelligent Environments

Call for Papers
IEEE Signal Processing Society
IEEE Journal of Selected Topics in Signal Processing
Special Issue on Speech Processing for Natural Interaction with Intelligent Environments

With the advances in microelectronics, communication technologies and smart materials, our environments are transformed to be increasingly intelligent by the presence of robots, bio-implants, mobile devices, advanced in-car systems, smart house appliances and other professional systems. As these environments are integral parts of our daily work and life, there is a great interest in a natural interaction with them. Also, such interaction may further enhance the perception of intelligence. "Interaction between man and machine should be based on the very same concepts as that between humans, i.e. it should be intuitive, multi-modal and based on emotion," as envisioned by Reeves and Nass (1996) in their famous book "The Media Equation". Speech is the most natural means of interaction for human beings and it offers the unique advantage that it does not require carrying a device for using it since we have our "device" with us all the time.

Speech processing techniques are developed for intelligent environments to support either explicit interaction through message communications, or implicit interaction by providing valuable information about the physical ("who speaks when and where") as well as the emotional and social context of an interaction. Challenges presented by intelligent environments include the use of distant microphone(s), resource constraints and large variations in acoustic condition, speaker, content and context. The two central pieces of techniques to cope with them are high-performing "low-level" signal processing algorithms and sophisticated "high-level" pattern recognition methods.

We are soliciting original, previously unpublished manuscripts directly targeting/related to natural interaction with intelligent environments. The scope of this special issue includes, but is not limited to:

* Multi-microphone front-end processing for distant-talking interaction
* Speech recognition in adverse acoustic environments and joint optimization with array processing
* Speech recognition for low-resource and/or distributed computing infrastructure
* Speaker recognition and affective computing for interaction with intelligent environments
* Context-awareness of speech systems with regard to their applied environments
* Cross-modal analysis of speech, gesture and facial expressions for robots and smart spaces
* Applications of speech processing in intelligent systems, such as robots, bio-implants and advanced driver assistance systems.

Submission information is available at http://www.ece.byu.edu/jstsp. Prospective authors are required to follow the Author's Guide for manuscript preparation of the IEEE Transactions on Signal Processing at http://ewh.ieee.org/soc/sps/tsp. Manuscripts will be peer reviewed according to the standard IEEE process.

Manuscript submission due: Jul. 3, 2009
First review completed: Oct. 2, 2009
Revised manuscript due: Nov. 13, 2009
Second review completed: Jan. 29, 2010
Final manuscript due: Mar. 5, 2010

Lead guest editor:
Zheng-Hua Tan, Aalborg University, Denmark, zt@es.aau.dk

Guest editors:
Reinhold Haeb-Umbach, University of Paderborn, Germany, haeb@nt.uni-paderborn.de
Sadaoki Furui, Tokyo Institute of Technology, Japan, furui@cs.titech.ac.jp
James R. Glass, Massachusetts Institute of Technology, USA, glass@mit.edu
Maurizio Omologo, FBK-IRST, Italy, omologo@fbk.eu
Back to Top

7-2 . Special issue "Speech as a Human Biometric: I know who you are from your voice" Int. Jnl Biometrics

International Journal of Biometrics  (IJBM)
 
Call For papers
 
Special Edition on: "Speech as a Human Biometric: I Know Who You Are From Your Voice!"
 
Guest Editors: 
Dr. Waleed H. Abdulla, The University of Auckland, New Zealand
Professor Sadaoki Furui, Tokyo Institute of Technology, Japan
Professor Kuldip K. Paliwal, Griffith University, Australia
 
 
The 2001 MIT Technology Review indicated that biometrics is one of the emerging technologies that will change the world. Human biometrics is the automated recognition of a person using inherent distinctive physiological and/or involuntary behavioural features.
 
Human voice biometrics has gained significant attention in recent years. The ubiquity of cheap microphones, human identity information carried by voice, ease of deployment, natural use, telephony applications diffusion, and non-obtrusiveness have been significant motivations for developing biometrics based on speech signals. The robustness of speech biometrics is sufficiently good. However, there are significant challenges with respect to conditions that cannot be controlled easily. These issues include changes in acoustical environmental conditions, respiratory and vocal pathology, age, channel, etc. The goal of speech biometric research is to solve and/or mitigate these problems.
 
This special issue will bring together leading researchers and investigators in speech research for security applications to present their latest successes in this field. The presented work could be new techniques, review papers, challenges, tutorials or other relevant topics.
 
   Subject Coverage
 
Suggested topics include, but are not limited to:
 
Speech biometrics
Speaker recognition
Speech feature extraction for speech biometrics
Machine learning techniques for speech biometrics
Speech enhancement for speech biometrics
Speech recognition for speech biometrics
Speech changeability over age, health condition, emotional status, fatigue, and related factors
Accent, gender, age and ethnicity information extraction from speech signals
Speech watermarking
Speech database security management
Cancellable speech biometrics
Voice activity detection
Conversational speech biometrics
 
   Notes for Prospective Authors
 
Submitted papers should not have been previously published nor be currently under consideration for publication elsewhere.
 
All papers are refereed through a peer review process. A guide for authors, sample copies and other relevant information for submitting papers are available on the Author Guidelines page.
 
   Important Dates
 
Manuscript due: 15 June, 2009
 
Acceptance/rejection notification: 15 September, 2009
 
Final manuscript due: 15 October, 2009
 
For more information, please go to the Calls for Papers page (http://www.inderscience.com/callPapers.php) or the IJBM home page (http://www.inderscience.com/ijbm).
 
 
Back to Top

7-3 . Special Issue on Voice Transformation, IEEE Trans. ASLP


						
CALL FOR PAPERS
IEEE Signal Processing Society
IEEE Transactions on Audio, Speech and Language Processing
Special Issue on Voice Transformation
With the increasing demand for Voice Transformation in areas such as
speech synthesis for creating target or virtual voices, modeling various
effects (e.g., Lombard effect), synthesizing emotions, making more natural
dialog systems which use speech synthesis, as well as in areas like
entertainment, film and music industry, toys, chat rooms and games, dialog
systems, security and speaker individuality for interpreting telephony,
high-end hearing aids, vocal pathology and voice restoration, there is a
growing need for high-quality Voice Transformation algorithms and systems
processing synthetic or natural speech signals.
Voice Transformation aims at the control of non-linguistic information of
speech signals such as voice quality and voice individuality. A great deal
of interest and research in the area has been devoted to the design and
development of mapping functions and modifications for vocal tract
configuration and basic prosodic features.
However, high quality Voice Transformation systems that create effective
mapping functions for vocal tract, excitation signal, and speaking style
and whose modifications take into account the interaction of source and
filter during voice production, are still lacking.
We invite researchers to submit original papers describing new approaches
in all areas related to Voice Transformation including, but not limited to,
the following topics:
* Preprocessing for Voice Transformation
(alignment, speaker selection, etc.)
* Speech models for Voice Transformation
(vocal tract, excitation, speaking style)
* Mapping functions
* Evaluation of Transformed Voices
* Detection of Voice Transformation
* Cross-lingual Voice Transformation
* Real-time issues and embedded Voice Transformation Systems
* Applications
The call for papers is also available at:
Prospective authors are required to follow the Information for Authors for
manuscript preparation of the IEEE Transactions on Audio, Speech, and
Language Processing at
Manuscripts will be peer reviewed according to the standard IEEE process.
Schedule:
Submission deadline: May 10, 2009
Notification of acceptance: September 30, 2009
Final manuscript due: October 30, 2009
Publication date: January 2010
Lead Guest Editor:
Yannis Stylianou, University of Crete, Crete, Greece
Guest Editors:
Tomoki Toda, Nara Inst. of Science and Technology, Nara, Japan
Chung-Hsien Wu, National Cheng Kung University, Tainan, Taiwan
Alexander Kain, Oregon Health & Science University, Portland, Oregon, USA
Olivier Rosec, Orange-France Telecom R&D, Lannion, France


					
Back to Top

7-4 . Special Issue on Statistical Learning Methods for Speech and Language Processing

IEEE Signal Processing Society
IEEE Journal of Selected Topics in Signal Processing
Special Issue on Statistical Learning Methods for Speech and
Language Processing
In the last few years, significant progress has been made in both
research and commercial applications of speech and language
processing. Despite the superior empirical results, however, there
remain important theoretical issues to be addressed. Theoretical
advancement is expected to drive greater system performance
improvement, which in turn generates the new need of in-depth
studies of emerging novel learning and modeling methodologies. The
main goal of this special issue is to fill in the above need, with
the main focus on the fundamental issues of new emerging approaches
and empirical applications in speech and language processing.
Another focus of this special issue is on the unification of
learning approaches to speech and language processing problems. Many
problems in speech processing and in language processing share a
wide range of similarities (despite conspicuous differences), and
techniques in speech and language processing fields can be
successfully cross-fertilized. It is of great interest to study
unifying modeling and learning approaches across these two fields.
The goal of this special issue is to bring together a diverse but
complementary set of contributions on emerging learning methods for
speech processing, language processing, as well as unifying
approaches to problems across the speech and language processing
fields.
We invite original and unpublished research contributions in all
areas relevant to statistical learning, speech processing and
natural language processing. The topics of interest include, but are
not limited to:
• Discriminative learning methods and applications to speech and language processing
• Unsupervised/semi-supervised learning algorithms for speech and language processing
• Model adaptation to new/diverse conditions
• Multi-engine approaches for speech and language processing
• Unifying approaches to speech processing and/or language processing
• New modeling technologies for sequential pattern recognition
for information on paper submission. Manuscripts should be submitted
using the Manuscript Central system at http://mc.manuscriptcentral.com/jstsp-ieee.
Manuscripts will be peer reviewed according to the standard IEEE process.
Manuscript submission due: Aug. 7, 2009
First review completed: Oct. 30, 2009
Revised manuscript due: Dec. 11, 2009
Second review completed: Feb. 19, 2010
Final manuscript due: Mar. 26, 2010
Lead guest editor:
Xiaodong He, Microsoft Research, Redmond (WA), USA, xiaohe@microsoft.com
Guest editors:
Li Deng, Microsoft Research, Redmond (WA), USA, deng@microsoft.com
Roland Kuhn, National Research Council of Canada, Gatineau (QC), Canada, roland.kuhn@cnrc-nrc.gc.ca
Helen Meng, The Chinese University of Hong Kong, Hong Kong, hmmeng@se.cuhk.edu.hk
Samy Bengio, Google Inc., Mountain View (CA), USA, bengio@google.com 
Back to Top

7-5 . SPECIAL ISSUE OF SPEECH COMMUNICATION: Perceptual and Statistical Audition

Perceptual and Statistical Audition
 
To give authors a bit more time, we have extended the deadline for the call for papers for the special issue of Speech Communication to 27th July, 2009.
 
See the call for papers below for more details.
 
Aims and Scope
Current trends in audio analysis are strongly founded in statistical principles, or on approaches that are influenced by empirically derived, or perceptually motivated rules of auditory perception. These approaches are perceived as orthogonal, but new ideas that draw upon both perceptual and statistical principles can often result in superior performance. The relationship between these two approaches, however, has not been thoroughly explored and is still a developing field of research.
In this special issue we invite researchers to submit papers on original and previously unpublished work on both approaches, and especially on hybrid techniques that combine perceptual and statistical principles, as applied to speech, music and audio analysis.  Recent advances in neurosciences have emphasized the important role of spectro-temporal modulations in human perception. We encourage submission of original and previously unpublished work on techniques that exploit the information in spectro-temporal modulations, particularly within a statistical framework.
Papers describing relevant research and new concepts are solicited on, but not limited to, the following topics:
 
 - Analysis of audio including speech and music
 - Audio classification
 - Speech recognition
 - Signal separation
 - Multi-channel analysis
 - Computational Auditory Scene Analysis  (CASA)
 - Spectro-temporal modulation methods
 - Perceptual aspects of statistical algorithms, such as Independent Component Analysis and Non-negative Matrix Factorization.
 - Hybrid methods that use CASA-like cues in a statistical framework
 
Guest Editors
Martin Heckmann, Honda Research Institute Europe, 63073 Offenbach a. M., Germany
Bhiksha Raj, Carnegie Mellon University, Pittsburgh, PA 15217
Paris Smaragdis, Adobe Advanced Technology Labs, Newton, MA 02446
 
NEW DEADLINE
Papers due 27th July, 2009
 
Submission Guidelines
Authors should consult the "Guide for Authors", available online, at http://www.elsevier.com/locate/specom for information about the preparation of their manuscripts. Authors, please submit your paper via http://ees.elsevier.com/specom, choosing "Perceptual and Statistical Audition" as the Article Type. If you are a first time user of the system, please register yourself as an author. 
 
Back to Top

8 . Future Speech Science and Technology Events

8-1 . (2009-10-18) 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Call for Papers

2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Mohonk Mountain House
New Paltz, New York
October 18-21, 2009
http://www.waspaa2009.com

 

The 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'09) will be held at the Mohonk Mountain House in New Paltz, New York, and is sponsored by the Audio & Electroacoustics committee of the IEEE Signal Processing Society. The objective of this workshop is to provide an informal environment for the discussion of problems in audio and acoustics and the signal processing techniques leading to novel solutions. Technical sessions will be scheduled throughout the day. Afternoons will be left free for informal meetings among workshop participants.

 

Papers describing original research and new concepts are solicited for technical sessions on, but not limited to, the following topics:

 

* Acoustic Scenes
- Scene Analysis: Source Localization, Source Separation, Room Acoustics
- Signal Enhancement: Echo Cancellation, Dereverberation, Noise Reduction, Restoration
- Multichannel Signal Processing for Audio Acquisition and Reproduction
- Microphone Arrays
- Eigenbeamforming
- Virtual Acoustics via Loudspeakers

* Hearing and Perception
- Auditory Perception, Spatial Hearing, Quality Assessment
- Hearing Aids

* Audio Coding
- Waveform Coding and Parameter Coding
- Spatial Audio Coding
- Internet Audio
- Musical Signal Analysis: Segmentation, Classification, Transcription
- Digital Rights
- Mobile Devices

* Music
- Signal Analysis and Synthesis Tools
- Creation of Musical Sounds: Waveforms, Instrument Models, Singing
- MEMS Technologies for Signal Pick-up

 

 

Submission of four-page paper: April 15, 2009
Notification of acceptance: June 26, 2009
Early registration until: September 1, 2009

 

Workshop Committee

General Co-Chair:
Jacob Benesty
Université du Québec
INRS-EMT
Montréal, Québec, Canada
benesty@emt.inrs.ca

General Co-Chair:
Tomas Gaensler
mh acoustics
Summit, NJ, USA
tfg@mhacoustics.com

Technical Program Chair:
Yiteng (Arden) Huang
WeVoice Inc.
Bridgewater, NJ, USA
arden_huang@ieee.org

Technical Program Chair:
Jingdong Chen
Bell Labs
Alcatel-Lucent
Murray Hill, NJ, USA
jingdong@research.bell-labs.com

Finance Chair:
Michael Brandstein
Information Systems Technology Group
MIT Lincoln Lab
Lexington, MA, USA
msb@ll.mit.edu

Publications Chair:
Eric J. Diethorn
Multimedia Technologies
Avaya Labs Research
Basking Ridge, NJ, USA
ejd@avaya.com

Publicity Chair:
Sofiène Affes
Université du Québec
INRS-EMT
Montréal, Québec, Canada
affes@emt.inrs.ca

Local Arrangements Chair:
Heinz Teutsch
Multimedia Technologies
Avaya Labs Research
Basking Ridge, NJ, USA
teutsch@avaya.com

Far East Liaison:
Shoji Makino
NTT Communication Science Laboratories, Japan
maki@cslab.kecl.ntt.co.jp

Back to Top

8-2 . (2009-10-23) ACM Multimedia 2009 Workshop Searching Spontaneous Conversational Speech (SSCS 2009)

Call for Papers
----------------------------
ACM Multimedia 2009 Workshop
Searching Spontaneous Conversational Speech (SSCS 2009)
October 23, 2009
Beijing, China
----------------------------
http://ict.ewi.tudelft.nl/SSCS2009/

Multimedia content often contains spoken audio as a key component. Although speech is generally acknowledged as the quintessential carrier of semantic information, spoken audio remains underexploited by multimedia retrieval systems. In particular, the potential of speech technology to improve information access has not yet been successfully extended beyond multimedia content containing scripted speech, such as broadcast news. The SSCS 2009 workshop is dedicated to fostering search research based on speech technology as it expands into spoken content domains involving non-scripted, less-highly conventionalized, conversational speech characterized by wide variability of speaking styles and recording conditions. Such domains include podcasts, video diaries, lifelogs, meetings, call center recordings, social video networks, Web TV, conversational broadcast, lectures, discussions, debates, interviews and cultural heritage archives. This year we are setting a particular focus on the user and the use of speech techniques and technology in real-life multimedia access systems and have chosen the theme "Speech technology in the multimedia access framework."

The development of robust, scalable, affordable approaches for accessing multimedia collections with a spoken component requires the sustained collaboration of researchers in the areas of speech recognition, audio processing, multimedia analysis and information retrieval. Motivated by the aim of providing a forum where these disciplines can engage in productive interaction and exchange, Searching Spontaneous Conversational Speech (SSCS) workshops were held in conjunction with SIGIR 2007 in Amsterdam and with SIGIR 2008 in Singapore. The SSCS workshop series continues with SSCS 2009 held in conjunction with ACM Multimedia 2009 in Beijing. This year the workshop will focus on addressing the research challenges that were identified during SSCS 2008: Integration, Interface/Interaction, Scale/Scope, and Community.

We welcome contributions on a range of trans-disciplinary issues related to these research challenges, including:

***Integration***
-Information retrieval techniques based on speech analysis (e.g., applied to speech recognition lattices)
-Search effectiveness (e.g., evidence combination, query/document expansion)
-Self-improving systems (e.g., unsupervised adaptation, recursive metadata refinement)
-Exploitation of audio analysis (e.g., speaker emotional state, speaker characteristics, speaking style)
-Integration of higher-level semantics, including cross-modal concept detection
-Combination of indexing features from video, text and speech

***Interface/Interaction***
-Surrogates for representation or browsing of spoken content
-Intelligent playback: exploiting semantics in the media player
-Relevance intervals: determining the boundaries of query-related media segments
-Cross-media linking and link visualization deploying speech transcripts

***Scale/Scope***
-Large-scale speech indexing approaches (e.g., collection size, search speed)
-Dealing with collections containing multiple languages
-Affordable, light-weight solutions for small collections, i.e., for the long tail

***Community***
-Stakeholder participation in design and realization of real world applications
-Exploiting user contributions (e.g., tags, ratings, comments, corrections, usage information, community structure)

Contributions for oral presentations (8-10 pages), poster presentations (2 pages), demonstration descriptions (2 pages) and position papers for selection of panel members (2 pages) will be accepted. Further information including submission guidelines is available on the workshop website: http://ict.ewi.tudelft.nl/SSCS2009/

Important Dates:
Monday, June 1, 2009 Submission Deadline
Saturday, July 4, 2009 Author Notification
Friday, July 17, 2009 Camera Ready Deadline
Friday, October 23, 2009 Workshop in Beijing

For more information: m.a.larson@tudelft.nl
SSCS 2009 Website: http://ict.ewi.tudelft.nl/SSCS2009/
ACM Multimedia 2009 Website: http://www.acmmm09.org

On behalf of the SSCS2009 Organizing Committee:
Martha Larson, Delft University of Technology, The Netherlands
Franciska de Jong, University of Twente, The Netherlands
Joachim Kohler, Fraunhofer IAIS, Germany
Roeland Ordelman, Sound & Vision and University of Twente, The Netherlands
Wessel Kraaij, TNO and Radboud University, The Netherlands


						
Back to Top

8-3 . (2009-11-01) NLP Approaches for Unmet Information Needs in Health Care

NLP Approaches for Unmet Information Needs in Health Care
(http://www.uwm.edu/~hongyu/files/BIBM.workshop.html)

A workshop of IEEE International Conference on Bioinformatics and Biomedicine 2009, Washington DC

 

As the amount of literature and other information in the biomedical field continues to grow at a rapid rate, researchers in the health care community depend on computers to find the best answers for meeting their information needs. Traditionally, information needs have been simply represented as a set of queries. Recently, there have been growing research efforts addressing these needs with natural language approaches. Although great strides have been made in producing valuable biomedical databases, more work needs to be done to develop computational approaches that enable users to search multiple databases, which often comprise a variety of formats, including journal articles, clinical guidelines, and electronic health care records. Therefore, the task at hand is to develop natural language systems that can understand the queries or complex questions being asked, interpret the different resources that could be used to answer the question, extract relevant information, and summarize this information to meet user needs, and data mine the structured data for clinical decision support. This workshop will explore a broad range of traditional NLP approaches and emerging new methods, and the variety of challenges that need to be overcome with respect to these issues.

 

Some specific topics include:

 

   * Clinical information needs
   * Clinical terminology and coding clinical data
   * Annotation and machine learning
   * Healthcare domain-specific adaptation of open-domain NLP techniques
   * Information extraction from electronic health records
   * Data mining of electronic health records
   * NLP approaches that involve images and video
   * Automatic speech recognition for the healthcare domain
   * Spoken clinical question answering

 

 

Paper submission: http://kis-lab.com/cyberchair/bibm09/cbc_index.html

Timeline:
 August 10, 2009: Due date for full workshop papers submission
 September 10, 2009: Notification of paper acceptance to authors
 September 17, 2009: Camera-ready of accepted papers
 November 1-4, 2009: Workshops

 

Organizers:

Workshop co-chairs:
Hong Yu, PhD, University of Wisconsin-Milwaukee
Dilek Hakkani-Tür, PhD, International Computer Science Institute
John Ely, MD, University of Iowa
Lyle Ungar, PhD, University of Pennsylvania

Workshop PC members:
Eugene Agichtein, Emory University
Alan Aronson, NLM
James Cimino, NIH
Kevin Cohen, University of Colorado
Nigel Collier, National Institute of Informatics, Japan
Chris Chute, Mayo Clinic
Dina Demner Fushman, NLM
Bob Futrelle, Northeastern University
Henk Harkema, University of Pittsburgh
Lynette Hirschman, MITRE
Susan McRoy, University of Wisconsin
Serguei Pakhomov, University of Minnesota
Tim Patrick, University of Wisconsin
Thomas Rindflesch, NLM
Pete White, Children's Hospital of Philadelphia
John Wilbur, NLM
Pierre Zweigenbaum, LIMSI


8-4 . (2009-11-02) Eleventh International Conference on Multimodal Interfaces and Workshop on Machine Learning for Multi-modal Interaction

The Eleventh International Conference on Multimodal Interfaces and Workshop on Machine Learning for Multi-modal Interaction will jointly take place in the Boston area during November 2-6, 2009.

The main aim of ICMI-MLMI 2009 is to further scientific research within the broad field of multimodal interaction, methods and systems. The joint conference will focus on major trends and challenges in this area, and work to identify a roadmap for future research and commercial success. ICMI-MLMI 2009 will feature a single-track main conference with keynote speakers, panel discussions, technical paper presentations, poster sessions, and demonstrations of state-of-the-art multimodal systems and concepts. It will be followed by workshops.

The conference will take place at the MIT Media Lab, widely known for its innovative spirit. Held in Cambridge, Massachusetts, USA, ICMI-MLMI 2009 provides an excellent setting for brainstorming and sharing the latest advances in multimodal interaction, systems and methods, in a city known as one of the top historical, technological and scientific centers of the US.


Program committees:

James Crowley, INRIA
Yuri Ivanov, MERL
Christopher Wren, Google
Daniel Gatica-Perez, Idiap Research Institute
Michael Johnston, AT&T Research
Rainer Stiefelhagen, University of Karlsruhe
Janet McAndless, MERL
Hervé Bourlard, Idiap Research Institute
Rana el Kaliouby, MIT Media Lab
Matthew Berlin, MIT Media Lab
Clifton Forlines, MERL
Deb Roy, MIT Media Lab
Thanks to Cole Krumbholz, MITRE
Sonya Allin, University of Toronto
Yang Liu, University of Texas at Dallas
Louis-Philippe Morency, University of Southern California
Xilin Chen, JDL
Steve Renals, University of Edinburgh
Denis Lalanne, University of Fribourg
Enrique Vidal, Polytechnic University of Valencia
Kenji Mase, University of Nagoya

ICMI Advisory Board

Matthew Turk, Chair, UC Santa Barbara (USA)
Jim Crowley, INRIA-Rhone Alpes (France)
Trevor Darrell, MIT (USA)
Kenji Mase, University of Nagoya (Japan)
Eric Horvitz, Microsoft Research (USA)
Sharon Oviatt, Adapx (USA)
Fabio Pianesi, ITC-irst (Italy)
Wolfgang Wahlster, DFKI (Germany)
Jie Yang, Carnegie Mellon University (USA)

MLMI Advisory Board

Hervé Bourlard, Idiap Research Institute (Switzerland)
Steve Renals, University of Edinburgh (UK)
Sharon Oviatt, Adapx (USA)
Rainer Stiefelhagen, Universitaet Karlsruhe (Germany)
Jean Carletta, University of Edinburgh (UK)
Catherine Pelachaud, CNRS (France)
Sadaoki Furui, Tokyo Institute of Technology (Japan)
Samy Bengio, Google (USA)
Andrei Popescu-Belis, Idiap Research Institute (Switzerland)

See http://icmi2009.acm.org/ for more information.


The following is a list of co-located workshops.

2nd Workshop on Child, Computer and Interaction
Thursday, 5 November 2009 (Full Day)
More Information: http://wocci2009.fbk.eu/


Workshop on Use of Context in Vision Processing (UCVP)
Thursday, 5 November 2009 (Full Day)
More Information: http://hmi.ewi.utwente.nl/ucvp09


Affect-Aware Virtual Agents and Social Robots (AFFINE)
Friday, 6 November 2009 (Full Day)
More Information: http://homepages.feis.herts.ac.uk/~comqjm/affine/index.html


Multimodal Computing with Mobile Phones: Sensing, Modeling and Sharing
Friday, 6 November 2009 (Morning)


Workshop on Multimodal Sensor-Based Systems for Social Computing
Friday, 6 November 2009 (Afternoon)
More Information: http://web.media.mit.edu


8-5 . (2009-11-05) LRL WORKSHOP: Getting Less-Resourced Languages on-Board! Poznan, Poland

LRL WORKSHOP: Getting Less-Resourced Languages on-Board!

 

Name: Getting Less-Resourced Languages on-Board!

 

Date: 5.11.2009, half-day (13h30 – 18h00) + cocktail

 

Theme:

Language Technologies (LT) provide essential support for the challenge of Multilingualism. In order to develop them, it is necessary to have access to Language Resources (LR) and to assess LT performance. In this regard, the situation differs greatly from one language to another. Little or sparse data exist for languages in countries or regions where limited efforts have been devoted to such issues in the past, also known as Less-Resourced Languages (LRL). The workshop aims at reporting the needs, presenting achievements and proposing solutions for the future, both in terms of LR and of LT evaluation, especially in the European, Euro-Mediterranean and regional frameworks. This will make it possible to identify the factors that have an impact on a potential shared roadmap towards supplying LR and LT for all languages.

 

Topics:

-          Experience in the production, validation and distribution of LR for less-resourced languages

-          Experience in the evaluation of LT for less-resourced languages

-          Infrastructures for making available LR and LT in less-resourced languages

-          Alternative approaches (comparable corpora, pivot languages, language clustering…)

-          To be completed…

 

Co-Chairs: Joseph Mariani (LIMSI-CNRS & IMMI-CNRS), Khalid Choukri (ELRA & ELDA), Zygmunt Vetulani (Adam Mickiewicz University, Poznan)

 

 

 

Paper submission deadline: August 15.

 

Sponsors: FLaReNet, ELRA

 

Registrations: as for the general LTC (+ cc to workshop chairs)

 

Fees: inscription fees to the LTC + extra 40 Euros or 80 Euros for the Workshop-only attendees.

 

Paper submission: as for the general LTC (EasyChair) + to the workshop chairs

 

Presentation: publication in the LTC proceedings (paper + CD)

 

Reviewing: up to the workshop chairs + scientific committee

 

Program: The workshop will comprise presentations (including keynote talks) and a panel session, including an EC representative (tentative). In addition, selected speakers will be invited to present their papers to a larger audience at the main LTC conference.

 

 

 

E-mail: ltc@amu.edu.pl

 

WWW: http://www.ltc.amu.edu.pl/

 


8-6 . (2009-11-06) 4th LANGUAGE AND TECHNOLOGY CONFERENCE: Human Language Technologies as a challenge Poznan, Poland

LTC2009 FlaReNet-LRL2009 Workshop - 1 week reminder for LTC

 

 

 

Call for papers and participation

 

The 4th LANGUAGE AND TECHNOLOGY CONFERENCE: Human Language Technologies as a Challenge for Computer Science and Linguistics (LTC 2009), a meeting organized by the Faculty of Mathematics and Computer Science of Adam Mickiewicz University, Poznań, Poland in cooperation with the Adam Mickiewicz University Foundation (co-organizer), will take place on November 6-8, 2009.

 

Human Language Technologies (HLT) continue to be a challenge for computer science, linguistics and related fields as these areas become an ever more essential element of our everyday technological environment. Since the very beginning of the Computer and Information  Age these fields have influenced and stimulated each other. The European Union strongly supports HLT under the 7th Framework Program. These efforts as well as technological, social and cultural globalization have created a favorable climate for the intensive exchange of novel ideas, concepts and solutions across initially distant disciplines. We aim at further contributing to this exchange and invite you to join us at LTC in November 2009, as well as at the FlaReNet workshop (LRL 2009) on the theme "Getting Less-Resourced Languages on-Board!".

 

Zygmunt Vetulani

LTC 2009 Chair

vetulani@amu.edu.pl

 

 

CONFERENCE TOPICS

 

The conference topics include the following (the ordering is not significant):

   - electronic language resources and tools,

   - formalisation of natural languages,

   - parsing and other forms of NL processing,

   - computer modelling of language competence,

   - NL user modelling,

   - NL understanding by computers,

   - knowledge representation,

   - man-machine NL interfaces,

   - Logic Programming in Natural Language Processing,

   - speech processing,

   - NL applications in robotics,

   - text-based information retrieval and extraction,

   - question answering,

   - tools and methodologies for developing multilingual systems,

   - translation enhancement tools,

   - corpora-based methods in language engineering,

   - WordNet-like ontologies,

   - methodological issues in HLT,

   - language-specific computational challenges for HLTs (especially for languages other than English),

   - HLT standards,

   - HLTs as a support for foreign language teaching,

   - communicative intelligence,

   - legal issues connected with HLTs (problems and challenges),

   - contribution of HLTs to the Homeland Security problems (technology applications and legal aspects),

   - visionary papers in the field of HLT,

   - HLTs for the Less-Resourced Languages,

   - HLT related policies,

   - system prototype presentations.

 

This list is by no means closed and we are open to further proposals. Please do not hesitate to contact us with your suggestions and ideas on how to meet your expectations concerning the program. The Program Committee is also open to suggestions concerning accompanying events (workshops, exhibits, panels, etc.). Suggestions, ideas and observations may be addressed directly to the LTC Chair by email (vetulani@amu.edu.pl).

 

 

 

 

 

PAPER SUBMISSION

 

The conference accepts papers in English. Papers (5 formatted pages) are due by July 31, 2009 (midnight, any time zone) and should not identify the author(s) in any manner. To facilitate submission, we have reduced the formatting requirements as much as possible at this stage. Please, however, do observe the following:

 

1. Accepted fonts for texts are Times Roman, Times New Roman. Courier is recommended for program listings. Character size for the main text should be 10 points, with 11 points leading (line spacing).

 

2. Text should be presented in 2 columns, 8,42 cm each with 0,95 cm between columns (gutter).

 

3. The paper size is 5 pages formatted according to (1) and (2) above.

 

4. The use of PDF format is strongly recommended, although MS Word will also be accepted.

 

Detailed guidelines for the final submission of accepted papers will be published on the conference Web site by September 10, 2009 (acceptance notification date).

 

 

All submissions are to be made electronically via the LTC 2009 web submission system. Acceptance/rejection notification will be sent by September 10, 2009.

 

 

IMPORTANT DATES/DEADLINES

 

- Deadline for submission of papers for review:  July 31, 2009.

- Acceptance/Rejection notification: September 10, 2009.

- Deadline for submission of final versions of accepted papers: October 1, 2009.

- Conference: November 6-8, 2009.

 

REGISTRATION

 

Only electronic registration will be possible. Details will be provided later on www.ltc.amu.edu.pl.

 

CONFERENCE FEES

 

Non-student participants:

   - Regular registration (payment by October 4, 2009) 160 EURO

   - Late registration (payment after October 4, 2009) 190 EURO

 

Student participants:

   - Regular registration (payment before October 4, 2009)  100 EURO

   - Late registration (payment after October 4, 2009)  120 EURO

 

An extra 40 EURO will be charged for participation in the LRL Workshop (5.11.2009).

 

Student registrations must be accompanied by a proof of full-time student status valid on the payment date. Registrants are requested to scan and e-mail their proof of student status to ltc@amu.edu.pl. The e-mail subject field must have the following format:

   LTC-09-StudentStatus-< Name_of_participant > 

   (e.g. LTC-09-StudentStatus-VETULANI)

 

The conference fee covers:

   - Participation in the scientific programme.

   - Conference materials.

   - Proceedings on CD and paper.

   - Social events (banquet,...).

   - Coffee breaks.

 

PAYMENT

 

The payment methods will be detailed shortly.


8-7 . (2009-11-15) CIARP 2009

CIARP 2009 Third Call for Papers
Chairs:
Eduardo Bayro Corrochano, CINVESTAV, Mexico
Jan-Olof Eklundh, KTH, Sweden
November 15th-18th 2009, Guadalajara, México
Venue: Hotel Misión Carlton
CIARP-IAPR Award for best papers
Special Issue in the journal Pattern Recognition Letters
The 14th Iberoamerican Congress on Pattern Recognition (CIARP 2009) will be held in Guadalajara, Jalisco, México. CIARP 2009 is organized by CINVESTAV, Unidad Guadalajara, México, supported by IAPR and sponsored by the Mexican Association for Computer Vision, Neural Computing and Robotics (MACVNR) and five other Iberoamerican PR societies. CIARP 2009, like the thirteen previous conferences, will be a fruitful forum for the exchange of scientific results and experiences, the sharing of new knowledge, and increased co-operation between research groups in pattern recognition and related areas.
Topics of interest
• Artificial Intelligence Techniques in PR
• Bioinformatics
• Clustering
• Computer Vision
• Data Mining
• DB, Knowledge Bases and Linguistic PR-Tools
• Discrete Geometry
• Clifford Algebra Applications in Perception Action
• Document Processing and Recognition
• Fuzzy and Hybrid Techniques in PR
• Image Coding, Processing and Analysis
• Kernel Machines
• Logical Combinatorial Pattern Recognition
• Mathematical Morphology
• Mathematical Theory of Pattern Recognition
• Natural Language Processing and Recognition
• Neural Networks for Pattern Recognition
• Parallel and Distributed Pattern Recognition
• Pattern Recognition Principles
• Petri Nets
• Robotics and humanoids
• Remote Sensing Applications of PR
• Satellite Image processing and radar
• Cognitive Humanoid Vision
• Shape and Texture Analysis
• Signal Processing and Analysis
• Special Hardware Architectures
• Statistical Pattern Recognition
• Syntactical and Structural Pattern Recognition
• Voice and Speech Recognition
Invited Speakers: Prof. M. Petrou (Imperial College, UK), Prof. I. Kakadiaris (University of Houston, TX), Dr. P. Sturm (INRIA Grenoble, France), Prof. W. Kropatsch (TU Wien, Austria).
Paper Submission
Prospective authors are invited to contribute to the conference by electronically submitting a full paper in English of no more than 8 pages, including illustrations, results and references. Accepted papers must be presented at the conference in English. Papers should be submitted electronically before June 7th, 2009, through the CIARP 2009 webpage (http://www.gdl.cinvestav.mx/ciarp2009). Papers should be prepared following the instructions for the Springer LNCS series. At least one of the authors must have registered for the paper to be published.
Workshops/Tutorials: CASI’2009 Intellig. Remote Satellite Imagery & Humanoid Robotics, 4 Tutorials on Texture, CV, PR & Geometric Algebra Applications.
Important Dates
Submission of papers: before June 7th, 2009
Notification of acceptance: August 1st, 2009
Camera-ready: August 21st, 2009
Registration                 IAPR Members   Non-IAPR
Before August 21st, 2009     400 USD        450 USD
After August 21st, 2009      450 USD        500 USD
Extra Conference Dinner      50 USD
Registration fee includes: Proceedings, Ice-breaker Party, Coffee Breaks, Lunches, Conference Dinner, Tutorials and Cultural Program (1. tour of the colonial area by night, 2. Latin dance night, 3. folkloric dance show and traditional mariachi concert with banquet in a romantic colonial garden). Extra: organized tours to Puerto Vallarta, Tequila, archaeological sites, artisan markets, museums, and traditional colonial churches and towns.
Contact: ciarp09@gdl.cinvestav.mx

8-8 . (2009-11-15) Entertainment=Emotion (International workshop) Spain

Entertainment=Emotion (International workshop) – From November 15 to 21, 2009 - Centro de Ciencias de Benasque Pedro Pascual (Spanish Pyrenees)

What is the relationship between entertainment and emotions in the consumption of new forms of media?
How does said relationship affect the attitudes, behaviors and thoughts of audiences?
What new emotions are generated by the new forms of interactive entertainment?
How does interactivity affect the emotional experience of entertainment?
What is the importance of morality or aesthetic appreciation in the experience of emotions during the consumption of media entertainment?
What emotions do we consume through new interactive products?
How do the intensity and valence of emotions change our aesthetic perception of entertainment products?
How are we influenced both by the emotions experienced during the processes of media entertainment, and by the perception of entertainment we obtain from experiencing these emotions?
Where are we taken by the emotions that entertain us?
Are there other ways of entertaining ourselves that make us freer?
In what products, and with what characteristics, are emotions stimulated or presented nowadays?

What are the cultural, economic, ideological, sociological, or artistic consequences of experiencing media emotions nowadays?

Does entertaining ourselves essentially mean generating emotions?

These are the kinds of questions that will be addressed at Entertainment=Emotion (E=E), the first edition of a very special workshop that will be held at the Centro de Ciencias de Benasque Pedro Pascual (CCBPP) from November 15 to 21, co-managed by María Teresa Soto Sanfiel (Department of Audiovisual Communication and Advertising, Universitat Autònoma de Barcelona) and Peter Vorderer (Center for Advanced Media Research -CAMeRA-, Free University Amsterdam).

E=E is an international workshop to which researchers, professionals and students of media entertainment are invited. The event, which follows a very special format, far removed from the traditional meetings held in the area, seeks to create the right atmosphere for prominent international researchers, media professionals, creators of content and students to think together about the phenomenon of emotions in the audiovisual consumption of entertainment. Of special interest to E=E are the new forms of interactive entertainment that provoke new emotional experiences among audiences.

The organizers hope that such a meeting, in the magnificent setting of the CCBPP, surrounded by beautiful mountains and delightful scenery, will encourage relaxed exchange between researchers and professionals from different traditions and with differing levels of experience, and that this will produce new visions, new problems to investigate, and the inspiration to create content. They also hope to create a permanent meeting point for media entertainment researchers that sets a first-class international standard.

The organizers also hope to help generate aesthetic, philosophical, sociological and political discourse on what the different forms of emotional experience obtained during media entertainment processes contribute to cultural progress. In this sense, the ultimate aim of E=E is to generate knowledge that fosters positive, free and responsible attitudes to the use, and experience, of emotions in relation to the consumption of audiovisual entertainment.

Similarly, the organizers of E=E seek to promote the creation, at the CCBPP, of a high level academic space where professionals and researchers can exchange and experience in situ the emotions produced through exposure to products that can generate emotion in order to entertain.

Finally, the organizers aim to investigate new forms of interactive entertainment and the emotional experiences they generate, and we therefore invite any developments to be exhibited or presented.

The application period for communications, reports, presentations, exhibitions, audiovisual performances and poster sessions ends on September 7. Applications will be evaluated by a scientific committee, after which a list of those accepted will be published. For more information about the requirements for participation, please visit the website (http://www.benasque.org/2009emotion/) or write to Maria Teresa Soto (mariateresa.soto@uab.es).

The best full papers from Entertainment=Emotion (International workshop) will be considered for publication in a special issue of the International Journal of Arts and Technology (IJART), the leading journal in the area (http://www.inderscience.com/browse/index.php?journalCODE=ijart).

A limited number of people can attend the workshop, so we advise you to send in your application as early as possible.

The CCBPP is managed by Physics Professors José Ignacio Latorre (UB) and Manuel Asorey (UZAR) and is supported by the Spanish Ministry for Education and Science, the Benasque Town Council, the Government of Aragon, the University of Zaragoza and the BBV. The CCBPP is a centre of international renown used to hold high-level scientific meetings. In hosting this meeting, it seeks to stimulate significant advances both in the scientific study and in the professional creation of content associated with the experience of emotions in media entertainment.


8-9 . (2009-11-16) 8ème Rencontres Jeunes Chercheurs en Parole (french)

********************************************************************
                 Call for papers RJCP 2009:
            8ème Rencontres Jeunes Chercheurs en Parole
           (8th Meeting of Young Researchers in Speech)
********************************************************************

November 16-18, 2009, Avignon


PRESENTATION
____________________________________________________________________

This event, sponsored by the Association Francophone de la
Communication Parlée (AFCP), gives (future) PhD students and recent
PhDs the opportunity to meet, present their work and exchange ideas
on the various areas of speech research.

Young researchers from different disciplines will be invited to the
meeting and will speak about ongoing work in their respective
fields. Their advice and questions will help you take a fresh look
at your own research.

Poster sessions and oral sessions will be offered to participants
wishing to present their work. The meeting is of course open to
anyone who simply wishes to attend the presentations without
submitting a paper.


IMPORTANT DATES
____________________________________________________________________

Paper submission deadline: July 2, 2009
Notification to authors: September 27, 2009
Conference: November 16, 17 and 18, 2009

To help with the organization of the meeting, please register as
soon as possible; your paper can be sent later.


SUBMISSIONS
____________________________________________________________________

Proposals, in the form of a 4- to 6-page abstract, must be submitted
before July 2, 2009 on the conference website:
http://rjcp2009.univ-avignon.fr

A reviewing committee composed of scientists in the field will
examine the submitted papers and send any comments to each
participant.

Specific instructions and predefined style sheets are available on
the website. A collection of the papers will be published and
distributed at the end of the meeting.


CONFERENCE ORGANIZATION
____________________________________________________________________

The conference will be held over 3 days on the premises of the
Université d'Avignon et des Pays de Vaucluse. In addition to the
participants' presentations and the poster sessions, figures from
academia and industry will give plenary talks. A company forum will
also be organized, enabling researchers and industry representatives
to meet.

All practical information about the conference will be available on
the website.


CONTACT
____________________________________________________________________

For further information, you can send an e-mail to:


TOPICS
____________________________________________________________________

Topics covered (non-exhaustive list):

- Phonetics and phonology
- Natural spoken language processing
- Speech production/perception
- Speech pathologies
- Speech acoustics
- Speech recognition and understanding
- Speech and language acquisition
- Applications with a spoken component (dialogue, indexing, ...)
- Prosody
- Linguistic diversity
- Deafness
- Gesture

8-10 . (2009-11-20) Seminar FROM PERCEPTION TO COMPREHENSION OF A FOREIGN LANGUAGE (Strasbourg-France)

SEMINAR
FROM PERCEPTION TO COMPREHENSION
OF A FOREIGN LANGUAGE
UNIVERSITY OF STRASBOURG
UdS
CALL FOR PAPERS
Equipe d’Accueil 1339, Linguistique, Langues et Parole (LiLPa),
Composantes: Fonctionnement Discursif & Parole et Cognition
The proof that perception has been effective comes in the final phase of speech
reception, when the listener accesses meaning: this is what we call
comprehension.
The obstacles that affect this comprehension in foreign language learning are
multiple. Specialists highlight three points of view that one must correlate to explain
the phenomenon of speech reception: the articulatory and acoustic signals (physical
aspects), the phonological system (linguistic code) and processing of relevant
information by the listener (psycholinguistic aspect).
This seminar will be devoted to the role of perception in the comprehension of a
foreign language (with a particular focus on the comprehension of English), and to
the various dysfunctions related to data processing by the learner.
The transition from perception to comprehension involves a series of processing
stages:
- peripheral (auditory) and central processing, where sensory information makes
it possible for the listener to extract the acoustic and articulatory clues that are
considered relevant;
- categorical perception (phonemic units, invariance and variability);
- matching the learner’s phonetic information and phonological knowledge in the
native language (phonological sieve), or in a foreign language
- recognition of words, sentences, discourse, during various speech acts…
In the perception/comprehension process, difficulties may be related to several
factors:
- intrinsic characteristics of a language (for example the duration of English
vowels or nasality in French…);
- linguistic, situational or interactional contexts…;
- segmentation into erroneous perceptual units, coarticulation…
- speaker-specific variability (speech rate, accent, intonation…).
EXPECTED CONTRIBUTIONS
The topics covered by this seminar, in the field of perception and
comprehension of English or of any other foreign language, will be the
following:
• speech perception
• prosody
• phonology
• foreign language learning / acquisition
• psycholinguistics
• neurophonetics/neurolinguistics
• etc.
INVITED SPEAKER
 
Jean-Luc Schwartz (GIPSA-Lab, Grenoble) will speak on
 
PACT (Perception-for-Action-Theory), une théorie perceptuo-motrice de la
perception de la parole (a perceptuo-motor theory of speech perception)
 
ABSTRACT SUBMISSION
Please send your proposals (500 words MAXIMUM, in English or in French), in a
Word-compatible format, Times 12, to: soumission.perception09@unistra.fr
by 21st September 2009 at the latest. The seminar will be held at the University of
Strasbourg, France, on Friday the 20th of November, 2009.
For further information, please contact:
Ms Nuzha Moritz (PhD)
Université de Strasbourg (UdS)
Département des Langues Etrangères Appliquées
22, rue René Descartes
67084 Strasbourg cedex
France
 

8-11 . (2009-12-04) Troisièmes Journées de Phonétique Clinique Aix en Provence France (french)

JPC3

Troisièmes Journées de Phonétique Clinique (Third Clinical Phonetics Days)

Call for Papers
December 4-5, 2009, Aix-en-Provence, France

http://www.lpl-aix.fr/~jpc3/

Registration for the 3rd Clinical Phonetics Days is now open.

You can register at the standard rate until October 18.

To register:
http://aune.lpl.univ-aix.fr/~jpc3/inscription.php

The organizing committee
org.jpc3@lpl-aix.fr

This meeting follows on from the first and second clinical phonetics study days, held in Paris in 2005 and in Grenoble in 2007 respectively. Clinical phonetics brings together researchers, university lecturers, engineers, physicians and speech therapists: complementary professions that pursue the same goal, a better understanding of the processes of acquisition and dysfunction of speech and voice. This interdisciplinary approach aims to improve fundamental knowledge of spoken communication in healthy subjects and to better understand, assess, diagnose and remedy speech and voice disorders in pathological subjects.

Papers will address phonetic studies of pathological speech and voice, in adults and in children. The topics of the conference include, but are not limited to:

   Disorders of the oro-pharyngo-laryngeal system
   Disorders of the perceptual system
   Cognitive and motor disorders
   Instrumentation and resources in clinical phonetics
   Modelling of pathological speech and voice
   Assessment and treatment of speech and voice pathologies

Selected contributions will be presented in one of the following two forms:

   Long paper: 20 minutes, for presenting completed work
   Short paper: 8 minutes, for presenting clinical observations, preliminary work or emerging issues, so as to best encourage interdisciplinary exchange between phoneticians and clinicians.

Submission format:
Submissions to JPC take the form of abstracts written in French, of at most one A4 page, Times New Roman, 12pt, single spacing. Abstracts must be submitted in PDF format to the following address: soumission.jpc3@lpl-aix.fr

Submission deadline: May 15, 2009
Notification to authors: July 1, 2009

For further information, contact the organizers:
org.jpc3@lpl-aix.fr

Registration for JPC3 (July 1, 2009) will be open to all, whether or not they are presenting a paper.


8-12 . (2009-12-09) 1st EUROPE-ASIA SPOKEN DIALOGUE SYSTEMS TECHNOLOGY WORKSHOP

1st EUROPE-ASIA SPOKEN DIALOGUE SYSTEMS
TECHNOLOGY WORKSHOP
December 9 – 11, 2009
Kloster Irsee, Germany
Introduction
Dear Colleagues,
It is our pleasure to invite you to participate in this FIRST EUROPE-ASIA SPOKEN DIALOGUE
SYSTEMS TECHNOLOGY WORKSHOP, which will be held at the Kloster Irsee in southern
Germany from December 9 to December 11, 2009.
This annual workshop will bring together researchers from all over the world working in the field of
spoken dialogue systems. It will provide an international forum for the presentation of research and
applications and for lively discussions among researchers as well as industrialists. The workshops
will be held alternately in Europe and Asia.
A scientific focus of the Europe-Asia Spoken Dialogue Systems Technology Workshop is placed on
advanced speech-based human-computer interaction where to a larger extent contextual factors are
modelled and taken into account when users communicate with computers. Future interfaces will be
endowed with more human-like capabilities. For example, the emotional state of the user will be
analyzed so as to be able to automatically adapt the dialogue flow to user preferences, state of
knowledge and learning success. Complex knowledge bases and reasoning capabilities will control
ambient devices that automatically adapt to user requirements and communication styles and, in
doing so, help reduce the mental load of the user. Future interfaces will finally behave like real
partners or cognitive technical assistants to their users.
Topics of interest include mechanisms, architectures, design issues, applications, evaluation and
tools. Prototype and product demonstrations will be very welcome.
The workshop will be held as a Satellite Event of ASRU2009 - Automatic Speech Recognition and
Understanding Workshop; Merano (Italy), December 13-17, 2009.
We welcome you to the workshop.
Gary Geunbae Lee, POSTECH, Pohang (Korea)
Joseph Mariani, LIMSI-CNRS and IMMI, Orsay (France)
Wolfgang Minker, Ulm University (Germany)
Satoshi Nakamura, NICT-ATR, Kyoto (Japan)
Back to Top

8-13 . (2009-12-14) 7th International Conference on Natural Language Processing Hyderabad, India

 http://www.icon2009.in   7th International Conference on Natural Language Processing University of Hyderabad, Hyderabad, India December 14-17, 2009 
Back to Top

8-14 . (2009-12-15) All-India Conference of Linguists University of Hyderabad, India

http://www.aicl2009.in All-India Conference of Linguists University of Hyderabad, Hyderabad, India December 15-17, 2009

Back to Top

8-15 . (2010-02-10) International Conference on Socio-Cultural Approaches to Translation, Hyderabad, India

http://www.icscat2010.in/ International Conference on Socio-Cultural Approaches to Translation University of Hyderabad, Hyderabad, India Feb 10-12, 2010 

Back to Top

8-16 . (2010-03-13) CfP SpokenQuery Voice Search Workshop 2010 (SQ 2010)

SpokenQuery Voice Search Workshop 2010 (SQ 2010)

 

13 March 2010, Dallas, TX

http://www.spokenquery.org/

 

Aims

 

Papers are solicited for the SpokenQuery 2010 Workshop on Voice Search (SQ2010), to be held in Dallas, Texas as a satellite to ICASSP 2010.

 

Small devices with high computing power have become ubiquitous, via cellphone-like devices, car communication systems, etc. This new reality makes it feasible to utilize speech as the preferred input mode. Searching for information from a spoken query brings its own challenges that go beyond the inherent difficulties of speech recognition or information retrieval.

 

The first SQ workshop aims at bringing together researchers working in the area that overlaps speech processing and information retrieval. This workshop is intended to be an open forum that will allow different research communities to become better acquainted with each other and to share ideas.

 

This one-day workshop will include a limited number of oral presentations, chosen for breadth and stimulation, and an informal atmosphere to promote discussion. We hope this workshop will expose participants to a broad perspective of techniques, tools, best practices, and innovation, which will provide the impetus for new research and compelling variants on current approaches.

 

Papers describing relevant research and new concepts are solicited on, but not limited to, the following topics:

 

·        Spoken queries in various languages

·        Retrieval of spoken or text documents via spoken query

·        Handling of new vocabulary words

·        Tools and databases

·        Commercial applications

·        Automotive spoken query applications

·        Spoken query applications via a mobile phone

·        Server-based and client-based approaches to automatic speech recognition

·        Novel adaptation and noise robustness methods

·        Novel demonstrations

 

Manuscripts must be between 4 and 6 pages long, in standard ICASSP double-column format. Accepted papers will be published in the workshop proceedings.

 

Important Dates

 

Paper submission: 30 November 2009

Notification of paper acceptance: 15 January 2010

Workshop: 13 March 2010

 

Organizers

 

Bhiksha Raj, CMU, USA

Evandro Gouvêa, MERL, USA

Tony Ezzat, MERL, USA

 

Technical Committee

 

Michiel Bacchiani, Google, USA

Fabio Crestani, UNISI, Switzerland

Ute Ehrlich, DaimlerAG, Germany

Mazin Gilbert,  ATT, USA

Prasad Venkatesh, Ford, USA

Chao Wang, Vlingo, USA

Geoffrey Zweig, Microsoft, USA

 

Contact

 

To contact the organizers, please send email to organizers@spokenquery.org.

Back to Top

8-17 . (2010-03-15) IEEE ICASSP 2010 International Conference on Acoustics, Speech, and Signal Processing March 15 – 19, 2010 Sheraton Dallas Hotel * Dallas, Texas, U.S.A.

IEEE ICASSP 2010
International Conference on Acoustics, Speech, and Signal Processing
March 15 – 19, 2010
Sheraton Dallas Hotel * Dallas, Texas, U.S.A.
http://www.icassp2010.com/

The 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP) will be held at the Sheraton Dallas Hotel, March 15 – 19, 2010. The ICASSP meeting is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The conference will feature world-class speakers, tutorials, exhibits, and over 120 lecture and poster sessions on the following topics:

* Audio and electroacoustics
* Bio imaging and signal processing
* Design and implementation of signal processing systems
* Image and multidimensional signal processing
* Industry technology tracks
* Information forensics and security
* Machine learning for signal processing
* Multimedia signal processing
* Sensor array and multichannel systems
* Signal processing education
* Signal processing for communications
* Signal processing theory and methods
* Speech processing
* Spoken language processing

Welcome to Texas, Y’All! Dallas is known for living large and thinking big. As the nation’s ninth-largest city, Dallas is exciting, diverse and friendly, factors that contribute to its success as a leading leisure and convention destination. There’s a whole “new” vibrant Dallas to enjoy: new entertainment districts, dining, shopping, hotels, arts and cultural institutions, with more on the way. There’s never been a more exciting time to visit Dallas than now.

Submission of Papers: Prospective authors are invited to submit full-length, four-page papers, including figures and references, to the ICASSP Technical Committee. All ICASSP papers will be handled and reviewed electronically. The ICASSP 2010 website www.icassp2010.com will provide you with further details. Please note that all submission deadlines are strict.

Tutorial and Special Session Proposals: Tutorials will be held on March 14 and 15, 2010. Brief proposals should be submitted by July 31, 2009, through the ICASSP 2010 website and must include title, outline, contact information for the presenter, and a description of the tutorial and material to be distributed to participants. Special session proposals should be submitted by July 31, 2009, through the ICASSP 2010 website and must include a topical title, rationale, session outline, contact information, and a list of invited papers. Tutorial and special session authors are referred to the ICASSP website for additional information regarding submissions.

* Important Deadlines *

Submission of Camera-Ready Papers: September 14, 2009
Notification of Paper Acceptance: December 11, 2009
Revised Paper Upload Deadline: January 8, 2010
Author’s Registration Deadline: January 15, 2010

For more detailed information, please visit the ICASSP 2010 official website, http://www.icassp2010.com/.
Back to Top

8-18 . (2010-03-20) CfP CMU Sphinx Users and Developers Workshop 2010 (CMU-SPUD 2010)

CMU Sphinx Users and Developers Workshop 2010 (CMU-SPUD 2010)

 

20 March 2010, Dallas, TX

 

http://www.cs.cmu.edu/~sphinx/Sphinx2010

 

Papers are solicited for the CMU Sphinx Workshop for Users and Developers (CMU-SPUD 2010), to be held in Dallas, Texas as a satellite to ICASSP 2010.

 

CMU Sphinx is one of the most popular open source speech recognition systems. It is currently used by researchers and developers in many locations world-wide, including universities, research institutions and industry. CMU Sphinx's liberal license terms have made it a significant member of the open source community and have provided a low-cost way for companies to build businesses around speech recognition.

 

The first SPUD workshop aims at bringing together CMU Sphinx users, to report on applications, developments and experiments conducted using the system. This workshop is intended to be an open forum that will allow different user communities to become better acquainted with each other and to share ideas. It is also an opportunity for the community to help define the future evolution of CMU Sphinx.

 

We are planning a one-day workshop with a limited number of oral presentations, chosen for breadth and stimulation, held in an informal atmosphere that promotes discussion. We hope this workshop will expose participants to different perspectives and that this in turn will help foster new directions in research, suggest interesting variations on current approaches and lead to new applications.

 

Papers describing relevant research and new concepts are solicited on, but not limited to, the following topics. Papers must describe work performed with CMU Sphinx:

 

·        Decoders: PocketSphinx, Sphinx-2, Sphinx-3, Sphinx-4

·        Tools: SphinxTrain, CMU/Cambridge SLM toolkit

·        Innovations / additions / modifications of the system

·        Speech recognition in various languages

·        Innovative uses, not limited to speech recognition

·        Commercial applications

·        Open source projects that incorporate Sphinx

·        Novel demonstrations

 

Manuscripts must be between 4 and 6 pages long, in standard ICASSP double-column format. Accepted papers will be published in the workshop proceedings.

 

IMPORTANT DATES

 

Paper submission: 30 November 2009

Notification of paper acceptance: 15 January 2010

Workshop: 20 March 2010

 

Organizers

 

Bhiksha Raj, Carnegie Mellon University, USA

Evandro Gouvêa, Mitsubishi Electric Research Labs, USA

Richard Stern, Carnegie Mellon University, USA

Alex Rudnicky, Carnegie Mellon University, USA

Rita Singh, Carnegie Mellon University, USA

David Huggins-Daines, Carnegie Mellon University, USA

Nickolay Schmyrev, Nexiwave, Russian Federation

Yannick Estève, Laboratoire d'Informatique de l'Université du Maine, France

 

Contact

 

To contact the organizers, please send email to sphinx+workshop@cs.cmu.edu

Back to Top

8-19 . (2010-04-13) CfP Workshop: Positional phenomena in phonology and phonetics Wroclaw-



http://www.ifa.uni.wroc.pl/~glow33/phon.html

 Workshop: Positional phenomena in phonology and phonetics

(Organised by Zentrum für Allgemeine Sprachwissenschaft, Berlin)

*Date:* 13 April 2010
*Organisers:* Marzena Zygis, Stefanie Jannedy, Susanne Fuchs
*Deadline for abstract submission:* 1st November 2009
*Abstracts submitted to:* zygis@zas.gwz-berlin.de
*Invited speakers:*

  * Taehong Cho (Hanyang University, Seoul) confirmed
  * Grzegorz Dogil (University of Stuttgart) confirmed

*Venue:* /Instytut Filologii Angielskiej, ul. Kuźnicza 22, 50-138 Wrocław/

Positional effects found cross-linguistically at the edges of prosodic
constituents (e.g. final lengthening, final lowering, strengthening
effects, or final devoicing) have increasingly received attention in
phonetic-phonological research. Recent empirical investigations of such
positional effects and their variability pose, however, a great number
of questions challenging e.g. the idea of perceptual invariance. It has
been claimed that acoustic variability is a necessary prerequisite for
the perceptual system to parse segmental strings into words, phrases or
larger prosodic units.

This workshop will provide a forum for discussing controversies and
recent developments regarding positional phenomena. We invite abstracts
bearing on positional effects from various perspectives. Questions that
can be addressed include, but are not limited to:

 1. What kind of variability is found in the data, and how does such
    variability need to be accounted for? What positional effects are
    common cross-linguistically and how can they be attributed to
    perceptual, articulatory or aerodynamic principles?
 2. How does positional prominence (lexical stress; accent) interact
    with acoustic and articulatory realizations of prosodic
    boundaries? What are the positional (a)symmetries in the
    realizations of boundaries, and what are the mechanisms underlying
    them?
 3. How does left- and right-edge phrasal marking interact with the
    acoustic and articulatory realizations at these prosodic
    boundaries? How are these interpreted in phonetics and in phonology?
 4. What are the necessary prerequisites for the interpretation of
    prosodic constituents? Which auditory cues are essential for the
    perception of boundaries and positional effects? Are such cues
    language-specific?
 5. To what extent do lexical frequency, phonotactic probability, and
    neighbourhood density contribute to the production and recognition
    of prosodic boundaries in (fluent/spontaneous) speech?
 6. How are positional characteristics exploited during the process of
    language acquisition? How are they learned during the process of
    language acquisition? Are positional effects salient enough for L2
    learners?

Abstracts are invited for a 20-min. presentation (excluding discussion).
Abstracts should be sent in two copies: one with a name and one without
as attached files (the name(s) should also be clearly mentioned in the
e-mail) to: zygis@zas.gwz-berlin.de in .pdf format. Only electronic
submissions will be considered. Abstracts may not exceed two pages of
text with at least a one-inch margin on all four sides (measured on A4
paper) and must employ a font not smaller than 12 point. Each page may
include a maximum of 50 lines of text. An additional page with
references may be included.

Deadline for submissions: November 1, 2009.

Contact person: Marzena Zygis

-- 
*************************
Susanne Fuchs, PhD
ZAS/Phonetik
Schützenstrasse 18
10117 Berlin

phone: 030 20192 569
fax:   030 20192 402
webpage: http://susannefuchs.org
*************************


 
Back to Top

8-20 . (2010-05-10) CfP Workshop on Prosodic Prominence: Perceptual and Automatic Identification

Speech Prosody 2010 Satellite Workshop
May 10th, 2010, Chicago, Illinois

Description of the workshop:
Efficient tools for (semi-)automatic prosodic annotation are becoming more and more important for the speech community, as most systems of prosodic annotation rely on the identification of syllabic prominence in spoken corpora (whether they lead to a phonological interpretation or not). The use of automatic and semi-automatic annotation has also facilitated multilingual research; many experiments on prosodic prominence identification have been conducted for European and non-European languages, and protocols have been written in order to build large, prosodically annotated databases of spoken languages from around the world. The aim of this workshop is to bring together specialists of automatic prosodic annotation interested in the development of robust algorithms for prominence detection, and linguists who have developed methodologies for the identification of prosodic prominence in natural languages on perceptual bases. The workshop will include oral and poster sessions, and a final round table.

Scientific topics:
 1. Annotation of prominence
 2. Perceptual processing of prominences: gestalt theories’ background
 3. Acoustic correlates of prominence
 4. Prominence and its relations with prosodic structure
 5. Prominence and its relations with accent, stress, tone and boundary
 6. The use of syntactic/pragmatic information in prominence identification
 7. Perception of prominence by naive/expert listeners
 8. Statistical methods for prominence detection
 9. Number of relevant prominence degrees: categorical or continuous scale
 10. Prosodic prominence and visual perception

Submission of papers:
Anonymous four-page papers (including figures and references) must be written in English, and be uploaded as pdf files here: https://www.easychair.org/login.cgi?conf=prom2010. All papers will be reviewed by at least three members of the scientific committee. Accepted four-page papers will be included in the online proceedings of the workshop published on the workshop website. The publication of extended selected papers after the workshop in a special issue of a journal is being considered.

Organizing Committee:
Mathieu Avanzi (Université de Neuchâtel, CH)
Anne Lacheret-Dujour (Université de Paris Ouest Nanterre)
Anne-Catherine Simon (Université catholique de Louvain-la-Neuve)

Scientific committee: the names of the scientific committee will be announced in the second circular.

Venue:
The workshop will take place in The Doubletree Hotel Magnificent Mile, in Chicago. See the Speech Prosody 2010 website (http://www.speechprosody2010.illinois.edu/index.html) for further information.

Important deadlines:
Submission of four-page papers: November 15, 2009
Notification of acceptance: January 15, 2010
Author's Registration Deadline: March 2, 2010
Workshop: May 10, 2010

Website of the workshop: http://www2.unine.ch/speechprosody-prominence
Back to Top

8-21 . (2010-05-11) CfP Speech prosody 2010 Chicago IL USA

SPEECH PROSODY 2010   (New submission deadline)
===============================================================
Every Language, Every Style: Globalizing the Science of Prosody
===============================================================
Call For Papers
===============================================================

Prosody is, as far as we know, a universal characteristic of human speech, founded on the cognitive processes of speech production and perception.  Adequate modeling of prosody has been shown to improve human-computer interface, to aid clinical diagnosis, and to improve the quality of second language instruction, among many other applications.

Speech Prosody 2010, the fifth international conference on speech prosody, invites papers addressing any aspect of the science and technology of prosody.  Speech Prosody is the only recurring international conference focused on prosody as an organizing principle for the social, psychological, linguistic, and technological aspects of spoken language.  Speech Prosody 2010 seeks, in particular, to discuss the universality of prosody.  To what extent can the observed scientific and technological benefits of prosodic modeling be ported to new languages, and to new styles of spoken language?  Toward this end, Speech Prosody 2010 especially welcomes papers that create or adapt models of prosody to languages, dialects, sociolects, and/or communicative situations that are inadequately addressed by the current state of the art.

=======
TOPICS
=======

Speech Prosody 2010 will include keynote presentations, oral sessions, and poster sessions covering topics including:

* Prosody of under-resourced languages and dialects
* Communicative situation and speaking style
* Dynamics of prosody: structures that adapt to new situations
* Phonology and phonetics of prosody
* Rhythm and duration
* Syntax, semantics, and pragmatics
* Meta-linguistic and para-linguistic communication
* Signal processing
* Automatic speech synthesis, recognition and understanding
* Prosody of sign language
* Prosody in face-to-face interaction: audiovisual modeling and analysis
* Prosodic aspects of speech and language pathology
* Prosody in language contact and second language acquisition
* Prosody and psycholinguistics
* Prosody in computational linguistics
* Voice quality, phonation, and vocal dynamics

====================
SUBMISSION OF PAPERS
====================

Prospective authors are invited to submit full-length, four-page papers, including figures and references, at http://speechprosody2010.org. All Speech Prosody papers will be handled and reviewed electronically.

===================
VENUE
===================

The Doubletree Hotel Magnificent Mile is located two blocks from North Michigan Avenue, and three blocks from Navy Pier, at the cultural center of Chicago.  The Windy City has been the center of American innovation since the mid nineteenth century, when a railway link connected Chicago to the west coast, civil engineers reversed the direction of the Chicago river, Chicago financiers invented commodity corn (maize), and the Great Chicago Fire destroyed almost every building in the city. The Magnificent Mile hosts scores of galleries and museums, and hundreds of world-class restaurants and boutiques.

===================
IMPORTANT DATES
===================

Submission of Papers (http://speechprosody2010.org): November 15, 2009
Notification of Acceptance:                                           December 15, 2009
Conference:                                                                    May 11-14, 2010

Back to Top

8-22 . (2010-05-24) CfP 4th INTERNATIONAL CONFERENCE ON LANGUAGE AND AUTOMATA THEORY AND APPLICATIONS (LATA 2010)

1st Call for Papers

4th INTERNATIONAL CONFERENCE ON LANGUAGE AND AUTOMATA THEORY AND APPLICATIONS (LATA 2010)

Trier, Germany, May 24-28, 2010

http://grammars.grlmc.com/LATA2010/

*********************************************************************

AIMS:

LATA is a yearly conference in theoretical computer science and its applications. As linked to the International PhD School in Formal Languages and Applications that was developed at Rovira i Virgili University (the host of the previous three editions and co-organizer of this one) in the period 2002-2006, LATA 2010 will reserve significant room for young scholars at the beginning of their career. It will aim at attracting contributions from both classical theory fields and application areas (bioinformatics, systems biology, language technology, artificial intelligence, etc.).

SCOPE:

Topics of either theoretical or applied interest include, but are not limited to:

- algebraic language theory
- algorithms on automata and words
- automata and logic
- automata for system analysis and programme verification
- automata, concurrency and Petri nets
- cellular automata
- combinatorics on words
- computability
- computational complexity
- computer linguistics
- data and image compression
- decidability questions on words and languages
- descriptional complexity
- DNA and other models of bio-inspired computing
- document engineering
- foundations of finite state technology
- fuzzy and rough languages
- grammars (Chomsky hierarchy, contextual, multidimensional, unification, categorial, etc.)
- grammars and automata architectures
- grammatical inference and algorithmic learning
- graphs and graph transformation
- language varieties and semigroups
- language-based cryptography
- language-theoretic foundations of artificial intelligence and artificial life
- neural networks
- parallel and regulated rewriting
- parsing
- pattern matching and pattern recognition
- patterns and codes
- power series
- quantum, chemical and optical computing
- semantics
- string and combinatorial issues in computational biology and bioinformatics
- symbolic dynamics
- term rewriting
- text algorithms
- text retrieval
- transducers
- trees, tree languages and tree machines
- weighted machines

STRUCTURE:

LATA 2010 will consist of:

- 3 invited talks
- 2 invited tutorials
- refereed contributions
- open sessions for discussion in specific subfields, on open problems, or on professional issues (if requested by the participants)

Invited speakers to be announced.

PROGRAMME COMMITTEE:

Alberto Apostolico (Atlanta)
Thomas Bäck (Leiden)
Stefania Bandini (Milano)
Wolfgang Banzhaf (St. John's)
Henning Bordihn (Potsdam)
Kwang-Moo Choe (Daejeon)
Andrea Corradini (Pisa)
Christophe Costa Florencio (Leuven)
Maxime Crochemore (Marne-la-Vallée)
W. Bruce Croft (Amherst)
Erzsébet Csuhaj-Varjú (Budapest)
Jürgen Dassow (Magdeburg)
Volker Diekert (Stuttgart)
Rodney G. Downey (Wellington)
Frank Drewes (Umea)
Henning Fernau (Trier, co-chair)
Rusins Freivalds (Riga)
Rudolf Freund (Wien)
Paul Gastin (Cachan)
Edwin Hancock (York, UK)
Markus Holzer (Giessen)
Helmut Jürgensen (London, Canada)
Juhani Karhumäki (Turku)
Efim Kinber (Fairfield)
Claude Kirchner (Bordeaux)
Carlos Martín-Vide (Brussels, co-chair)
Risto Miikkulainen (Austin)
Victor Mitrana (Bucharest)
Claudio Moraga (Mieres)
Sven Naumann (Trier)
Chrystopher Nehaniv (Hatfield)
Maurice Nivat (Paris)
Friedrich Otto (Kassel)
Daniel Reidenbach (Loughborough)
Klaus Reinhardt (Tübingen)
Antonio Restivo (Palermo)
Christophe Reutenauer (Montréal)
Kai Salomaa (Kingston, Canada)
Jeffrey Shallit (Waterloo)
Eljas Soisalon-Soininen (Helsinki)
Bernhard Steffen (Dortmund)
Frank Stephan (Singapore)
Wolfgang Thomas (Aachen)
Marc Tommasi (Lille)
Esko Ukkonen (Helsinki)
Todd Wareham (St. John's)
Osamu Watanabe (Tokyo)
Bruce Watson (Pretoria)
Thomas Wilke (Kiel)
Slawomir Zadrozny (Warsaw)
Binhai Zhu (Bozeman)

ORGANIZING COMMITTEE:

Adrian Horia Dediu (Tarragona)
Henning Fernau (Trier, co-chair)
Maria Gindorf (Trier)
Stefan Gulan (Trier)
Anna Kasprzik (Trier)
Carlos Martín-Vide (Brussels, co-chair)
Norbert Müller (Trier)
Bianca Truthe (Magdeburg)

SUBMISSIONS:

Authors are invited to submit papers presenting original and unpublished research. Papers should not exceed 12 single-spaced pages and should be formatted according to the standard format for Springer Verlag's LNCS series (see http://www.springer.com/computer/lncs/lncs+authors?SGWID=0-40209-0-0-0). Submissions have to be uploaded at:

http://www.easychair.org/conferences/?conf=lata2010

PUBLICATIONS:

A volume of proceedings published by Springer in the LNCS series will be available by the time of the conference.

At least one special issue of a major journal containing extended versions of the papers contributed to the conference will be published later.

Submissions to the post-conference publications will be by invitation only.

REGISTRATION:

Registration will be open from September 1, 2009 until May 24, 2010. The registration form can be found at the website of the conference: http://grammars.grlmc.com/LATA2010/

Early registration fees: 500 Euro
Early registration fees (PhD students): 400 Euro
Late registration fees: 530 Euro
Late registration fees (PhD students): 430 Euro
On-site registration fees: 550 Euro
On-site registration fees (PhD students): 450 Euro

At least one author per paper should register. Papers that do not have a registered author by February 15, 2010 will be excluded from the proceedings.

Fees comprise access to all sessions, one copy of the proceedings volume, coffee breaks, lunches, excursion, and conference dinner.

PAYMENT:

Early (resp. late) registration fees must be paid by bank transfer before February 15, 2010 (resp. May 14, 2010) to the conference series account at Open Bank (Plaza Manuel Gomez Moreno 2, 28020 Madrid, Spain): IBAN: ES1300730100510403506598 - Swift code: OPENESMMXXX (account holder: Carlos Martin-Vide & URV – LATA 2010).

Please write the participant’s name in the subject of the bank form. Transfers should not involve any expense for the conference.

On-site registration fees can be paid only in cash.

A receipt for the payment will be provided on site.

In addition to paying the registration fees, participants are required to fill in the registration form at the website of the conference.

BEST PAPER AWARDS:

An award will be presented to the authors of the two best papers accepted to the conference. Only papers fully authored by PhD students are eligible. The award is intended to cover their travel expenses.

IMPORTANT DATES:

Paper submission: December 3, 2009
Notification of paper acceptance or rejection: January 21, 2010
Final version of the paper for the LNCS proceedings: February 3, 2010
Early registration: February 15, 2010
Late registration: May 14, 2010
Starting of the conference: May 24, 2010
Submission to the post-conference publications: August 27, 2010

FURTHER INFORMATION:

gindorf-ti@informatik.uni-trier.de

CONTACT:

LATA 2010
Universität Trier
Fachbereich IV – Informatik
Campus II, Behringstraße
D-54286 Trier

Phone: +49-(0)651-201-2836
Fax: +49-(0)651-201-3954
Back to Top