Contents

1 . Editorial

Dear Members,

Summer holidays are coming soon, but the INTERSPEECH 2010 organizing committee is working hard: the reviewing process is now over, and authors will be notified soon. All members, authors or not, are invited to join the most important annual event of ISCA. Before registering for the conference, do not forget to renew your membership (possible via our website www.isca-speech.org) if it expires before September 30.

Prof. em. Chris Wellekens 

Institut Eurecom

Sophia Antipolis
France 

ISCApad@isca-speech.org

Back to Top

2 . ISCA News

If you are planning to attend INTERSPEECH 2010 and your ISCA membership expires before 30 September 2010, please renew your membership now at http://www.isca-speech.org/

Back to Top

3 . SIG's News: SIG CSLP Report

2010 SIG-CSLP Report

 

The flagship event of SIG-CSLP (Chinese Spoken Language Processing) is a biennial international conference on the same theme known as ISCSLP (International Symposium on Chinese Spoken Language Processing). ISCSLP 2008, the 6th conference, was held in Kunming, China on December 16-19, 2008, with a total of 137 full paper submissions from 12 countries and regions; the conference proceedings were indexed by the IEEE Xplore citation database. The 7th biennial conference of SIG-CSLP, ISCSLP 2010, has been under preparation by the organizers in Taiwan for over a year, working closely with officers and members of SIG-CSLP.

 

ISCSLP 2010 (http://conf.ncku.edu.tw/iscslp2010/) will be held from November 29 to December 3, 2010, in Tainan and at Sun Moon Lake, Taiwan, hosted by National Cheng Kung University. The conference covers a broad range of research topics, including:

 

  • Speech Production and Perception
  • Phonetics and Phonology
  • Speech Analysis
  • Speech Coding
  • Speech Enhancement
  • Speech Recognition
  • Speech Synthesis
  • Language Modeling and Spoken Language Understanding
  • Spoken Dialog Systems
  • Spoken Language Translation
  • Speaker and Language Recognition
  • Computer-Assisted Language Learning
  • Indexing, Retrieval and Authoring of Speech Signals
  • Multi-Modal Interface including Spoken Language Processing
  • Spoken Language Resources and Technology Evaluation
  • Applications of Spoken Language Processing Technology

 

SIG-CSLP has also been working closely with the regional organizations Oriental-COCOSDA, NCMMSC and ACLCLP by sharing information and helping to organize the following annual events:

 

  • O-COCOSDA 2009 – the 12th Oriental COCOSDA Workshop was held on August 10-12, 2009 in Beijing, China, hosted by the School of Information Science and Engineering of Xinjiang University, China and the Tsinghua University Research Institute of Information Technology, China.
  • NCMMSC 2009 - the 10th National Conference on Man-Machine Speech Communication was held on August 14-16, 2009 in Lanzhou, China, jointly organized by Chinese Information Processing Society of China, Phonetics Association of China, and the Acoustical Society of China.
  • ROCLING 2009 - the 21st Conference on Computational Linguistics and Speech Processing was held on September 1-2, 2009 at the National Chung Hsing University, Taichung, Taiwan, sponsored by Association for Computational Linguistics and Chinese Language Processing (ACLCLP).

 

SIG-CSLP also works closely with the language data consortia ChineseLDC (Chinese Language Data Consortium) and CCC (Chinese Corpus Consortium). Together they have collected more than 120 corpora, which have been widely shared among universities and companies. This year, 8 new corpora have been added to their data pools.

 

 

For more information on SIG-CSLP activities, please visit the site: http://www.SIG-CSLP.org/.

 

Back to Top

4 . Future ISCA Conferences and Workshops (ITRW)

For more information on workshop activities, please visit the site: http://www.isca-speech.org/iscaweb/index.php?option=com_content&view=article&id=127&Itemid=67 

Back to Top

4-1 . (2010-06-28) Odyssey 2010, Brno, Czech Republic

Odyssey 2010: The Speaker and Language Recognition Workshop will be hosted by Brno University of Technology in Brno, Czech Republic. Odyssey’10 is an ISCA Tutorial and Research Workshop held in cooperation with the ISCA Speaker and Language Characterization SIG. The need for fast, efficient, accurate, and robust means of recognizing people and languages is of growing importance for commercial, forensic, and government applications. The aim of this workshop is to continue to foster interactions among researchers in speaker and language recognition as the successor of previous successful events held in Martigny (1994), Avignon (1998), Crete (2001), Toledo (2004), San Juan (2006) and Stellenbosch (2008). 

http://www.speakerodyssey.com

 

Topics

Topics of interest include speaker and language recognition (verification, identification, segmentation, and clustering): text-dependent and -independent speaker recognition; multispeaker training and detection; speaker characterization and adaptation; features for speaker recognition; robustness in channels; robust classification and fusion; speaker recognition corpora and evaluation; use of extended training data; speaker recognition with speech recognition; forensics, multimodality, and multimedia speaker recognition; speaker and language confidence estimation; language, dialect, and accent recognition; speaker synthesis and transformation; biometrics; human recognition of speaker and language; and commercial applications.

Schedule

Draft papers due: 15 February 2010
Notification of acceptance: 16 April 2010
Final papers due: 30 April 2010
Preliminary program: 17 May 2010
Workshop: 28 June – 1 July 2010
Back to Top

4-2 . (2010-09-22) 7th ISCA Speech Synthesis Workshop (SSW7), Kyoto, Japan

CALL FOR PAPERS
7th ISCA Speech Synthesis Workshop (SSW7)
Kyoto, Japan - September 22-24, 2010
http://www.ssw7.org/

The Seventh ISCA Tutorial and Research Workshop (ITRW) on Speech
Synthesis will take place at ATR, Kyoto, Japan, September 22-24, 2010.
It is co-sponsored by the International Speech Communication
Association (ISCA), the ISCA Special Interest Group on Speech
Synthesis (SynSIG), the National Institute of Information and
Communications Technology (NICT), and the Effective Multilingual
Interaction in Mobile Environments (EMIME) project.  The workshop will
be held as a satellite workshop of Interspeech 2010 (Chiba, Japan,
September 26-30, 2010).  This workshop follows on from the previous
workshops held in Autrans (1990), Mohonk (1994), Jenolan Caves (1998),
Pitlochry (2001), Pittsburgh (2004) and Bonn (2007), a series that aims
to promote research and development in all aspects of speech synthesis.

Workshop topics: papers in all areas of speech synthesis technology
are welcome, with emphasis placed on:

* Spontaneous/expressive speech synthesis
* Speech synthesis in dialog systems
* Voice conversion/speaker adaptation
* Multilingual/crosslingual speech synthesis
* Automated methods for speech synthesis
* TTS for embedded devices
* Talking heads with animated conversational agents
* Applications of synthesis technologies to communication disorders
* Evaluation methods

Submissions for the technical program:

The workshop program will consist of invited lectures, oral and poster
presentations, and panel discussions.  Prospective authors are invited
to submit full-length, 4-6 page papers, including figures and
references.  All papers will be handled and reviewed electronically.
The SSW7 website http://www.ssw7.org/ will provide you with further
details.

Important dates:

* May 7, 2010: Paper submission deadline
* June 30, 2010: Acceptance/rejection notice
* June 30, 2010: Registration begins
* July 9, 2010: Revised paper due
* September 22-24, 2010: Workshop at ATR in Kyoto 

Back to Top

4-3 . (2010-09-26) INTERSPEECH 2010 Chiba Japan

     ========================================

     INTERSPEECH2010

 

     Makuhari,Japan / September 26-30, 2010

      http://www.interspeech2010.org

If you are planning to attend INTERSPEECH 2010 and your ISCA membership expires before 30 September 2010, please renew your membership now at http://www.isca-speech.org/.  

 ========================================

 

Dear Colleague,

 

INTERSPEECH is the world's largest and most comprehensive conference on issues surrounding the science and technology of spoken language processing (SLP), both in humans and in machines. It is our great pleasure to host INTERSPEECH 2010 in Japan, the birthplace of ICSLP, which has held two ICSLPs in the past, in Kobe and Yokohama.

The theme of INTERSPEECH 2010 is "Spoken Language Processing for All Ages, Health Conditions, Native Languages and Environments". INTERSPEECH 2010 emphasizes an interdisciplinary approach covering all aspects of speech science and technology, spanning basic theories to applications. Besides regular oral and poster sessions, plenary talks by internationally renowned experts, tutorials, exhibits, and special sessions are planned.

 

      "INTERSPEECH conferences are indexed in ISI"

 

We invite you to submit original papers in any related area, including but not limited to:

 

HUMAN SPEECH PRODUCTION, PERCEPTION AND COMMUNICATION

    * Human speech production
    * Human speech and sound perception
    * Linguistics, phonology and phonetics
    * Intersection of spoken and written languages
    * Discourse and dialogue
    * Prosody (e.g., production, perception, prosodic structure, modeling)
    * Paralinguistic and nonlinguistic cues (e.g., emotion and expression)
    * Physiology and pathology of spoken language
    * Spoken language acquisition, development and learning
    * Speech and other modalities (e.g., facial expression, gesture)

 

SPEECH AND LANGUAGE TECHNOLOGY

    * Speech analysis and representation
    * Speech segmentation
    * Audio segmentation and classification
    * Speaker turn detection
    * Speech enhancement
    * Speech coding and transmission
    * Voice conversion
    * Speech synthesis and spoken language generation
    * Automatic speech recognition
    * Spoken language understanding
    * Language and dialect identification
    * Cross-lingual and multi-lingual speech processing
    * Multimodal/multimedia signal processing (including sign languages)
    * Speaker characterization and recognition
    * Signal processing for music and song
    * Spoken language technology for prosthesis, rehabilitation, wellness and welfare
    * Computational linguistics for SLP
    * Written language processing for SLP

 

SPOKEN LANGUAGE SYSTEMS AND APPLICATIONS

    * Spoken dialogue systems
    * SLP systems for information extraction/retrieval
    * Systems for spoken language translation
    * Applications for aged and handicapped persons
    * Applications for learning and education
    * Other applications

 

RESOURCES, STANDARDIZATION AND EVALUATION

    * Spoken language resources and annotation
    * Evaluation and standardization of spoken language systems

 

Special Sessions

    * Open Vocabulary Spoken Document Retrieval
    * Compressive Sensing for Speech and Language Processing
    * Social Signals in Speech
    * The Voice - a Special Treat for the Social Brain?
    * Quality of Experiencing Speech Services
    * Speech Intelligibility Enhancement for All Ages, Health Conditions, and Environments
    * INTERSPEECH 2010 Paralinguistic Challenge - Age, Gender, and Affect
    * The Speech Models - Searching for Better Representations of Speech
    * Fact and Replica of Speech Production

 

Paper Submission

Papers for the INTERSPEECH 2010 proceedings should be up to four pages in length and conform to the format given in the paper preparation guidelines and author kits, which are now available on the INTERSPEECH 2010 website along with the Final Call for Papers. Optionally, authors may submit additional files, such as multimedia files, to be included on the Proceedings CD-ROM. Authors shall also declare that their contributions are original and not being submitted for publication elsewhere (e.g., another conference, workshop, or journal). Papers must be submitted via the on-line paper submission system. The deadline for submitting a paper is 30 April 2010. This date will not be extended. Inquiries regarding paper submissions should be directed via email to submission@interspeech2010.org.

 

Important dates

  Paper submission deadline: 30 April 2010
  Notification of acceptance or rejection: 2 July 2010
  Camera-ready paper due: 9 July 2010
  Authors' registration deadline: 12 July 2010
  Early registration deadline: 28 July 2010
  Conference dates: 26-30 September 2010

 

Please visit our website at http://www.interspeech2010.org/

 

General Chair: Keikichi Hirose
General Vice Chair: Yoshinori Sagisaka

Back to Top

4-4 . (2011-08-27) INTERSPEECH 2011 Florence Italy

Interspeech 2011

Palazzo dei Congressi, Florence, Italy, August 27-31, 2011.

Organizing committee

Piero Cosi (General Chair),

Renato De Mori (General Co-Chair),

Claudia Manfredi (Local Chair),

Roberto Pieraccini (Technical Program Chair),

Maurizio Omologo (Tutorials),

Giuseppe Riccardi (Plenary Sessions).

More information: www.interspeech2011.org

Back to Top

4-5 . (2012-09-09) INTERSPEECH 2012, Portland, Oregon USA

INTERSPEECH 2012

Portland, Oregon, U.S., 09-13 September 2012
Chairs: Jan van Santen, Richard Sproat
13th INTERSPEECH event

Back to Top

5 . Industry Notes

Carnegie Speech produces systems that teach people how to speak another language understandably. Its products include NativeAccent, SpeakIraqi, SpeakRussian, and ClimbLevel4. You can find out more at www.carnegiespeech.com. You can also read about Forbes.com naming it a Best Breakout Idea of 2009 at:

http://www.forbes.com/2009/12/21/best-breakout-ideas-2009-entrepreneurs-technology-breakout_slide_11.html

Back to Top

6 . Workshops and conferences supported (but not organized) by ISCA

 

Back to Top

6-1 . (2010-12-12) IEEE Workshop on Spoken Language Technology SLT 2010

IEEE Workshop on Spoken Language Technology

SLT 2010

December 12-15, 2010

Berkeley, CA

www.slt2010.org

 

Call for Papers

 

The Third IEEE Spoken Language Technology (SLT) Workshop will be held December 12-15, 2010 in Berkeley, CA. The goal of this workshop is to allow the spoken language processing community to share and present recent advances in various areas of spoken language technology. The workshop is endorsed/sponsored by ISCA and ACL as well. The Spoken Dialog Challenge 2010 (http://www.dialrc.org/sdc) will be organized as a special session.

 

Important Dates:

• Paper Submission: July 16, 2010
• Notification: September 1, 2010
• Workshop: December 12-15, 2010

 

Workshop Topics:

• Spoken language understanding
• Spoken document summarization
• Machine translation for speech
• Spoken language based systems
• Spoken language generation
• Question answering from speech
• Human/computer interaction
• Educational/healthcare applications
• Speech data mining
• Information extraction
• Spoken document retrieval
• Multimodal processing
• Spoken dialog systems
• Spoken language systems
• Spoken language databases
• Assistive technologies

 

Organizing Chairs:
• Dilek Hakkani-Tür, ICSI
• Mari Ostendorf, U. Washington

Technical Chairs:
• Isabel Trancoso, INESC-ID, Portugal
• Tim Paek, Microsoft Research

Area Chairs:
• Julia Hirschberg, Columbia U.
• Hermann Ney, RWTH Aachen
• Andreas Stolcke, SRI/ICSI
• Ye-Yi Wang, Microsoft Research

Finance Chair:
• Gokhan Tur, SRI International

Advisory Board:
• Mazin Gilbert, AT&T Labs
• Srinivas Bangalore, AT&T Labs
• Giuseppe Riccardi, U. Trento

Demo Chairs:
• Alex Potamianos, Tech. U. of Crete
• Mikko Kurimo, Helsinki U. of Tech.

Publicity Chairs:
• Bhuvana Ramabhadran, IBM
• Benoit Favre, U. Le Mans

Panel Chairs:
• Sadaoki Furui, Tokyo Inst. of Tech.
• Eric Fosler-Lussier, Ohio State U.

Publication Chair:
• Yang Liu, U. Texas, Dallas

Local Organizers:
• Dimitra Vergyri, SRI International
• Murat Akbacak, SRI International
• Sibel Yaman, ICSI
• Arindam Mandal, SRI International

Europe Liaisons:
• Frederic Bechet, U. Avignon
• Philipp Koehn, U. Edinburgh

Asia Liaisons:
• Helen Meng, C. U. Hong Kong
• Gary Geunbae Lee, POSTECH

Keynote Speakers:
• Michael Jordan, U. California, Berkeley
• Chris Manning, Stanford U.
• James W. Pennebaker, U. Texas, Austin

Back to Top

7 . Books, databases and software

 

Back to Top

7-1 . Books

 

This section lists recent books whose titles have been communicated by the authors or editors.

Some advertisements for recent books in speech are also included.

These book presentations are written by the authors, not by the newsletter editor or any volunteer reviewer.

 

Back to Top

7-1-1 . Digital Speech Transmission

Digital Speech Transmission
Authors: Peter Vary and Rainer Martin
Publisher: Wiley & Sons
Year: 2006
Back to Top

7-1-2 . Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods

Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods
Joseph Keshet and Samy Bengio, Editors
John Wiley & Sons
March, 2009
Website:  Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods
 
About the book:
This is the first book dedicated to uniting research related to speech and speaker recognition based on the recent advances in large margin and kernel methods. The first part of the book presents theoretical and practical foundations of large margin and kernel methods, from support vector machines to large margin methods for structured learning. The second part of the book is dedicated to acoustic modeling of continuous speech recognizers, where the grounds for practical large margin sequence learning are set. The third part introduces large margin methods for discriminative language modeling. The last part of the book is dedicated to the applications of keyword spotting, speaker verification and spectral clustering.
Contributors: Yasemin Altun, Francis Bach, Samy Bengio, Dan Chazan, Koby Crammer, Mark Gales, Yves Grandvalet, David Grangier, Michael I. Jordan, Joseph Keshet, Johnny Mariéthoz, Lawrence Saul, Brian Roark, Fei Sha, Shai Shalev-Shwartz, Yoram Singer, and Nathan Srebro.
 
 
 
Back to Top

7-1-3 . Some aspects of Speech and the Brain.

Some aspects of Speech and the Brain
Susanne Fuchs, Hélène Loevenbruck, Daniel Pape, Pascal Perrier
Editions Peter Lang, January 2009

What happens in the brain when humans are producing speech, or when they are listening to it? This is the main focus of the book, which includes a collection of 13 articles written by researchers at some of the foremost European laboratories in the fields of linguistics, phonetics, psychology, cognitive sciences and neurosciences.
Back to Top

7-1-4 . Spoken Language Processing,

Spoken Language Processing, edited by Joseph Mariani (IMMI and
LIMSI-CNRS, France). ISBN: 9781848210318. January 2009. Hardback 504 pp

Publisher ISTE-Wiley

Speech processing addresses various scientific and technological areas. It includes speech analysis and variable rate coding, in order to store or transmit speech. It also covers speech synthesis, especially from text, speech recognition, including speaker and language identification, and spoken language understanding. This book covers the following topics: how to realize speech production and perception systems, how to synthesize and understand speech using state-of-the-art methods in signal processing, pattern recognition, stochastic modeling, computational linguistics and human factor studies. 


More on its content can be found at
http://www.iste.co.uk/index.php?f=a&ACTION=View&id=150

Back to Top

7-1-5 . L'imagerie medicale pour l'etude de la parole

L'imagerie medicale pour l'etude de la parole (Medical imaging for the study of speech)

Alain Marchal, Christian Cave

Eds Hermes Lavoisier

99 euros • 304 pages • 16 x 24 • 2009 • ISBN: 978-2-7462-2235-9

From the laryngeal mirror to today's videofibroscopy, from static impression taking to dynamic palatography, and from the beginnings of radiography to magnetic resonance imaging and magnetoencephalography, this book reviews the different imaging techniques used to study speech, from the standpoint of production as well as perception. The advantages, drawbacks and limits of each technique are examined, along with the main results obtained with each of them and their prospects for development. Written by specialists who aim to remain accessible to a broad readership, this book is addressed to all those who study or deal with speech in their professional activities, such as phoniatricians, ENT specialists, speech therapists and, of course, phoneticians and linguists.


Back to Top

7-1-6 . Korpusbasierte Sprachverarbeitung

Author: Christoph Draxler
Title: Korpusbasierte Sprachverarbeitung
Publisher: Narr Francke Attempto Verlag Tübingen
Year: 2008
Link: http://www.narr.de/details.php?catp=&p_id=16394

Summary: Spoken language is a major area of linguistic research and speech technology development. This handbook presents an introduction to the technical foundations and shows how speech data is collected, annotated, analysed, and made accessible in the form of speech databases. The book focuses on web-based procedures for the recording and processing of high quality speech data, and it is intended as a desktop reference for practical recording and annotation work. A chapter is devoted to the Ph@ttSessionz database, the first large-scale speech data collection (860+ speakers, 40 locations in Germany) performed via the Internet. The companion web site (http://www.narr-studienbuecher.de/Draxler/index.html) contains audio examples, software tools, solutions to the exercises, important links, and checklists. 

Back to Top

7-2 . Database providers

 

Back to Top

7-2-1 . ELRA Language Resources Catalogue Update

    *****************************************************************
ELRA - Language Resources Catalogue - Update
*****************************************************************

In the framework of our ongoing campaign for updating and reducing the prices of the language resources distributed in the ELRA catalogue, ELRA is happy to announce that the prices for the following resources have been substantially reduced:

ELRA-S0074 British English SpeechDat(II) MDB-1000
This speech database contains the recordings of 1,000 British speakers recorded over the British mobile telephone network. Each speaker uttered around 40 read and spontaneous items.
For more information, see: http://catalog.elra.info/product_info.php?products_id=723

ELRA-S0075 Welsh SpeechDat(II) FDB-2000
This speech database contains the recordings of 2,000 Welsh speakers recorded over the British fixed telephone network. Each speaker uttered around 40 read and spontaneous items.
For more information, see: http://catalog.elra.info/product_info.php?products_id=557

ELRA-S0101 Spanish SpeechDat(II) FDB-1000
This speech database contains the recordings of 1,000 Castilian Spanish speakers recorded over the Spanish fixed telephone network. Each speaker uttered around 40 read and spontaneous items.
This database is a subset of the Spanish SpeechDat(II) FDB-4000 (ref. ELRA-S0102).
For more information, see: http://catalog.elra.info/product_info.php?products_id=726

ELRA-S0102 Spanish SpeechDat(II) FDB-4000
This speech database contains the recordings of 4,000 Castilian Spanish speakers recorded over the Spanish fixed telephone network. Each speaker uttered around 40 read and spontaneous items.
This database includes the Spanish SpeechDat(II) FDB-1000 (ref. ELRA-S0101).
For more information, see: http://catalog.elra.info/product_info.php?products_id=727

ELRA-S0140 Spanish SpeechDat-Car database
The Spanish SpeechDat-Car database contains the recordings in a car of 306 speakers, who uttered around 120 read and spontaneous items. Recordings have been made through 5 different channels, of which 4 were in-car microphones (1 close-talk microphone, 3 far-talk microphones) and 1 channel over the GSM network.
For more information, see: http://catalog.elra.info/product_info.php?products_id=690

ELRA-S0141 SALA Spanish Venezuelan Database
This speech database contains the recordings of 1,000 Venezuelan speakers recorded over the Venezuelan fixed telephone network. Each speaker uttered around 50 read and spontaneous items.
For more information, see: http://catalog.elra.info/product_info.php?products_id=736

ELRA-S0297 Hungarian Speecon database
The Hungarian Speecon database comprises the recordings of 555 adult Hungarian speakers and 50 child Hungarian speakers who uttered respectively over 290 items and 210 items (read and spontaneous).
For more information, see: http://catalog.elra.info/product_info.php?products_id=1094

ELRA-S0298 Czech Speecon database
The Czech Speecon database comprises the recordings of 550 adult Czech speakers and 50 child Czech speakers who uttered respectively over 290 items and 210 items (read and spontaneous).
For more information, see: http://catalog.elra.info/product_info.php?products_id=1095


For more information on the catalogue, please contact Valérie Mapelli: mapelli@elda.org

Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates: http://www.elra.info/LRs-Announcements.html  

Back to Top

7-2-2 . LDC News

 

In this newsletter:

New publications:
LDC2010S03
LDC2010T09
LDC2010T10

 


 

Coming Soon: LDC Data Scholarship Program!

We are pleased to announce that the LDC Data Scholarship program is in the works! This program will provide university students with access to LDC data at no cost. Each year LDC distributes thousands of dollars worth of data at no or reduced cost to students who demonstrate a need for data yet cannot secure funding. LDC will formalize this practice through the newly created LDC Data Scholarship program.

Data scholarships will be offered each semester beginning with the fall 2010 semester (September - December 2010). Students will need to complete an application, which will include a data use proposal and letter of support from their faculty adviser.  We anticipate that the selection process will be highly competitive.

Stay tuned for further announcements in our newsletter and on our home page!



New Publications

 

(1) 2003 NIST Speaker Recognition Evaluation was developed by researchers at NIST (National Institute of Standards and Technology). It consists of just over 120 hours of English conversational telephone speech used as training data and test data in the 2003 Speaker Recognition Evaluation (SRE), along with evaluation metadata and test set answer keys.

2003 NIST Speaker Recognition Evaluation is part of an ongoing series of yearly evaluations conducted by NIST. These evaluations provide an important contribution to the direction of research efforts and the calibration of technical capabilities. They are intended to be of interest to all researchers working on the general problem of text independent speaker recognition. To this end the evaluation was designed to be simple, to focus on core technology issues, to be fully supported, and to be accessible to those wishing to participate.

This speaker recognition evaluation focused on the task of 1-speaker and 2-speaker detection, in the context of conversational telephone speech.  The original evaluation consisted of three parts: 1-speaker detection "limited data", 2-speaker detection "limited data", and 1-speaker detection "extended data". This corpus contains training and test data and supporting metadata (including answer keys) for only the 1-speaker "limited data" and 2-speaker "limited data" components of the original evaluation. The 1-speaker "extended data" component of the original evaluation (not included in this corpus) provided metadata only, to be used in conjunction with data from Switchboard-2 Phase II (LDC99S79) and Switchboard-2 Phase III Audio (LDC2002S06). The metadata (resources and answer keys) for the 1-speaker "extended data" component of the original 2003 SRE evaluation are available from the NIST Speech Group website for the 2003 Speaker Recognition Evaluation.

The data in this corpus is a 120-hour subset of data first made available to the public as Switchboard Cellular Part 2 Audio (LDC2004S07), reorganized specifically for use in the 2003 NIST SRE.

2003 NIST Speaker Recognition Evaluation is distributed on one DVD.

2010 Subscription Members will automatically receive two copies of this corpus.  2010 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$1000.


(2)  ACE 2005 Mandarin SpatialML Annotations was developed by researchers at The MITRE Corporation (MITRE). ACE 2005 Mandarin SpatialML Annotations applies SpatialML tags to a subset of the source Mandarin training data in ACE 2005 Multilingual Training Corpus (LDC2006T06).

SpatialML is a mark-up language for representing spatial expressions in natural language documents. SpatialML focuses on geography and culturally-relevant landmarks, rather than biology, cosmology, geology, or other regions of the spatial language domain. The goal is to allow for better integration of text collections with resources such as databases that provide spatial information about a domain, including gazetteers, physical feature databases and mapping services.

The SpatialML annotation scheme is intended to emulate earlier progress on time expressions such as TIMEX2, TimeML, and the 2005 ACE guidelines. The main SpatialML tag is the PLACE tag which encodes information about location. The central goal of SpatialML is to map location information in text to data from gazetteers and other databases to the extent possible by defining attributes in the PLACE tag. Therefore, semantic attributes such as country abbreviations, country subdivision and dependent area abbreviations (e.g., US states), and geo-coordinates are used to help establish such a mapping. The SpatialML guidelines are compatible with existing guidelines for spatial annotation and existing corpora within the ACE research program.
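
For concreteness, a PLACE annotation might look something like the following sketch, parsed here with Python's standard library. The PLACE tag name comes from the description above, but the specific attribute names and values are illustrative assumptions, not quotations from the version 2.3 guidelines shipped with the corpus:

    # Hypothetical SpatialML-style PLACE annotation (attribute names
    # assumed for illustration; see the v2.3 guidelines for the real
    # inventory), parsed with the Python standard library.
    import xml.etree.ElementTree as ET

    fragment = ('<PLACE id="p1" type="PPL" country="US" '
                'latLong="37.87 -122.27">Berkeley</PLACE>')
    place = ET.fromstring(fragment)
    # The attributes are what link the text mention to gazetteer data.
    print(place.text, place.attrib["country"], place.attrib["latLong"])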

This corpus consists of a 298-document subset of broadcast material from the ACE 2005 Multilingual Training Corpus (LDC2006T06) that has been tagged by a native Mandarin speaker according to version 2.3 of the SpatialML annotation guidelines, which are included in the documentation for this release.

ACE 2005 Mandarin SpatialML Annotations  is distributed via web download.

2010 Subscription Members will automatically receive two copies of this corpus on disc.  2010 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$500.


(3)  NIST 2002 Open Machine Translation (OpenMT) Evaluation is a package containing source data, reference translations, and scoring software used in the NIST 2002 OpenMT evaluation. It is designed to help evaluate the effectiveness of machine translation systems. The package was compiled and scoring software was developed by researchers at NIST, making use of newswire source data and reference translations collected and developed by LDC.

The objective of the NIST OpenMT evaluation series is to support research in, and help advance the state of the art of, machine translation (MT) technologies -- technologies that translate text between human languages. Input may include all forms of text. The goal is for the output to be an adequate and fluent translation of the original. Additional information about these evaluations may be found at the NIST Open Machine Translation (OpenMT) Evaluation web site.

This evaluation kit includes a single perl script that may be used to produce a translation quality score for one (or more) MT systems. The script works by comparing the system output translation with a set of (expert) reference translations of the same source text. Comparison is based on finding sequences of words in the reference translations that match word sequences in the system output translation.
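
As a rough illustration of that kind of word-sequence matching, the following minimal Python sketch computes a clipped n-gram precision of a system translation against multiple references. This is a simplified stand-in for the idea only, not the NIST perl script, which implements full BLEU-style scoring with multiple n-gram orders and a brevity penalty:

    # Simplified n-gram matching against reference translations;
    # NOT the actual NIST scoring script.
    from collections import Counter

    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    def clipped_precision(system, references, n=2):
        # Clip each system n-gram count to the maximum number of
        # times that n-gram occurs in any single reference.
        sys_counts = ngrams(system.split(), n)
        if not sys_counts:
            return 0.0
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref.split(), n).items():
                max_ref[gram] = max(max_ref[gram], count)
        matched = sum(min(c, max_ref[g]) for g, c in sys_counts.items())
        return matched / sum(sys_counts.values())

    refs = ["the cat sat on the mat", "there is a cat on the mat"]
    print(clipped_precision("the cat is on the mat", refs))  # 0.6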

The Chinese-language source text included in this corpus is a reorganization of data that was initially released to the public as Multiple-Translation Chinese (MTC) Part 2 (LDC2003T17). The Chinese-language reference translations are a reorganized subset of data from the same MTC corpus. The Arabic-language data (source text and reference translations) is a reorganized subset of data that was initially released to the public as Multiple-Translation Arabic (MTA) Part 1 (LDC2003T18). All source data for this corpus is newswire text.

For each language, the test set consists of two files, a source and a reference file. Each reference file contains four independent translations of the data set. The evaluation year, source language, test set, version of the data, and source vs. reference file are reflected in the file name.

NIST 2002 Open Machine Translation (OpenMT) Evaluation is distributed via web download. 

2010 Subscription Members will automatically receive two copies of this corpus on disc.  2010 Standard Members may request a copy as part of their 16 free membership corpora. Non-members may license this data for US$150.

 

Back to Top

7-2-3 . French corpus available



We are pleased to announce that C-Prom, a French speech corpus, is now freely available online at: http://sites.google.com/site/corpusprom

C-Prom is a transcribed, phonetically aligned and annotated corpus, initially developed for the study of syllabic prominences. It comprises 24 recordings covering 7 speech genres (or styles), produced by French-speaking speakers from Belgium, France and Switzerland, for a total duration of 70 minutes.

The corpus is freely distributed to the scientific community under a Creative Commons licence. We hope it will give rise to a variety of studies, allowing researchers to compare their analyses and test their methodologies on shared material. It is open to extensions of recordings and annotations, which will be integrated as contributions arrive.

We look forward to your visits and comments, with cordial greetings,

Anne Catherine Simon (UCLouvain), Jean-Philippe Goldman (UniGe),
Mathieu Avanzi (UniNe, Paris 10, UCLouvain), Antoine Auchlin (UniGe)
Back to Top

8 . Jobs openings

We invite all laboratories and industrial companies that have job offers to send them to the ISCApad editor: they will appear in the newsletter and on our website for free (also have a look at http://www.isca-speech.org/jobs.html as well as the ELSNET Jobs pages at http://www.elsnet.org/).

Ads will be automatically removed from ISCApad after 6 months. Informing the ISCApad editor when a position is filled will avoid irrelevant mail between applicants and proposers.


Back to Top

8-1 . (2010-01-16) Post-doctoral position in France: Signal processing and Experimental Technique for a Silent Speech Interface.

Postdoctoral position in Paris, France: Signal Processing and Experimental Technique for a Silent Speech Interface

Deadline: 30/04/2010
Contact: denby@ieee.org
http://www.neurones.espci.fr/Revoix/

The REVOIX project in Paris, France is seeking an excellent candidate for a 12-month postdoctoral position, starting as soon as possible. REVOIX (ANR-09-ETEC-005), a partnership between the Laboratoire d'Electronique ESPCI ParisTech and the Laboratoire de Phonetique et Phonologie, will design and implement a vocal prosthesis that uses a miniature ultrasound machine and a video camera to restore the original voice of persons who have lost the ability to speak due to laryngectomy or a neurological problem. The technologies developed in the project will have an additional field of application in telecommunications, in the context of a silent telephone allowing its user to communicate orally but in complete silence (see the special issue of Speech Communication entitled Silent Speech Interfaces, appearing March 2010, http://dx.doi.org/10.1016/j.specom.2009.08.002).

The project will build upon promising results obtained in the Ouisper project (ANR-06-BLAN-0166), which was completed at the end of 2009. The interdisciplinary REVOIX team includes junior and senior university and medical research staff with skills in signal processing, machine learning, speech processing, phonetics, and phonology. The ideal candidate will have solid skills in signal processing, preferably with speech experience, but also in experimental techniques for man-machine interfaces, coupled with a strong motivation for working in an interdisciplinary environment to produce a working, portable silent speech interface system for use in medical and telecommunication applications. Salary is competitive for European research positions.

Contact: Professor Bruce DENBY, denby@ieee.org
Back to Top

8-2 . (2010-01-22) Modelling human speech perception, Univ. of Plymouth, UK

Modelling human speech perception

Internal advisors: Dr Susan Denham, Dr Jeremy Goslin and Dr Caroline Floccia (School of Psychology, University of Plymouth)
External advisor: Dr Steven Greenberg (Silicon Speech, USA)

Applications are invited for a University-funded studentship to start in April 2010.

Although artificial speech recognition systems have improved considerably over the years, their performance still falls far short of human abilities, and their robustness in the face of changing conditions is limited. In contrast, humans and other animals are able to adapt, seemingly effortlessly, to different listening environments, and are able to communicate effectively with one another in many different circumstances. In this project we aim to investigate a novel theoretical model of human speech perception based on cortical oscillators. We take as our starting point the observation that natural communication sounds contain temporal patterns or regularities evident at many different time scales (Winkler, Denham et al. 2009). The proposal is that the speech message can be extracted through adaptation of a hierarchically organised system of neural oscillators to the characteristic multi-scale temporal patterns present in the speech of the target speaker, and that by doing so extraneous interfering sounds can be simultaneously rejected. This proposal will be tested using electrophysiological measurements of listeners attending to speech in different background sounds, analyzing activity at various pre-lexical and lexical processing levels (e.g. Goslin, Grainger et al. 2006), for application in the development of a biologically inspired computational model of human speech perception.

We are looking for a highly qualified and motivated student with a strong interest in auditory perception, sounds and speech perception. You will join a well-established research environment and work alongside the brain-technology team, which is currently funded by a multi-centre European project, SCANDLE (http://www.scandle.eu), and a new joint British/French ESRC/ANR project, RECONVO (investigating multi-lingual speech development).

Requirements: knowledge of experimental methods and/or programming experience with a high-level language. Desirable: knowledge of signal processing techniques, models of auditory perception, electrophysiological techniques. Candidates should have a first or upper second class honours degree in an area related to Cognitive Neuroscience (Computer Science, Maths, Physics, Electrical Engineering, Neuroscience, or Psychology). Applicants with a relevant MSc or MRes are particularly welcome. The studentship will provide a fully funded full-time PhD post for three years, with a stipend of approximately £13,290 per annum. The position is open to UK citizens and EU citizens with appropriate qualifications who have been resident or studied in the UK for three years.

For informal queries please contact: Dr Susan Denham (sdenham@plymouth.ac.uk).

For an application form and full details on how to apply, please visit www.plymouth.ac.uk/pghowtoapply. Applicants should send a completed application form along with the following documentation to The University of Plymouth, Postgraduate Admissions Office, Hepworth House, Drake Circus, Plymouth, PL4 8AA, United Kingdom:

• Two references in envelopes signed across their seals
• Copies of transcripts and certificates
• If English is not your first language, evidence that you meet our English Language requirements (www.plymouth.ac.uk/elr)
• CV
• Ethnic and Disability Monitoring Form

Closing Date: 5PM, Monday 15 February 2010. Interviews will be held at the end of February 2010, with a proposed start date of 1 April 2010.

References
Goslin, J., J. Grainger, et al. (2006). "Syllable frequency effects in French visual word recognition: an ERP study." Brain Res 1115(1): 121-34.
Winkler, I., S. L. Denham, et al. (2009). "Modeling the auditory scene: predictive regularity representations and perceptual objects." Trends Cogn Sci 13(12): 532-40.
Back to Top

8-3 . (2010-01-25) Postdoc at Aalto University (Espoo, Finland)

Aalto University Postdoc (Espoo, Finland)

The Department of Signal Processing and Acoustics will have a postdoctoral research position for the time period of 1 August 2010 - 31 December 2012 related to one of the following fields:

Digital signal processing in wireless communications, sensor array signal processing, speech processing, audio signal processing, spatial sound, or optical radiation measurements

Successful applicants are expected to strengthen and extend the department's current research and teaching in their field of expertise. Applicants are expected to have earned their doctoral degree between 1 January 2005 and 31 May 2010.

This recruitment is a result of the department's success in the recent Research Assessment Exercise, and we are looking for strong candidates from all over the world.

The postdoc will be expected to participate in the department's teaching. The annual salary starts from 39 500 euros depending on experience.

Applications should include:

  • Research Plan
  • CV
  • List of publications
  • Names and contact information of 1-3 referees
  • Optional 1-2 letters of recommendation

Please send your applications by email to aalto-postdoc@signal.tkk.fi. Each application
should be in the form of a single pdf file. Name your file as: "surname_application.pdf". Applications are due 15 March 2010.

See also: http://www.aalto.fi/en/current/jobs/postdoc/

 

Back to Top

8-4 . (2010-01-30) PhD position at ACLC/NKI-AVL 2010, Amsterdam

One PhD position at ACLC/NKI-AVL 2010

The Amsterdam Centre for Language and Communication (ACLC) focuses on the description of, and explanations for, variation in languages and language use. The ACLC includes both functional and formal approaches to language description and encourages dialogue between these approaches. Studies cover all aspects of speech and language: phonetics, phonology, morphology, syntax, semantics and pragmatics, in a search for the Language Blueprint. Language typology, including that of Creole and signed languages, plays an important part in the ACLC programme. Language variation in terms of time, space and context is also a specialization, as is the study of variation in the different types of language user, from the child learning her first language to the adult second language learner, including different types of language pathology. Questions of speech and language loss and (re-)acquisition are a focus of the ACLC; the course of speech rehabilitation after serious pathologies of the head and neck area is an example of such loss and re-acquisition.

The Department of Head and Neck Oncology and Surgery at The Netherlands Cancer Institute/Antoni van Leeuwenhoek Hospital (NKI-AVL), in collaboration with the Academic Medical Center (AMC), is involved in patient care, education and scientific research in the field of head and neck cancer. The department has a long history of quality of life research, focusing on the functional side effects of head and neck cancer and its treatment. The most common tumours include mouth and tongue, throat, and larynx (voice box) cancer. Voice and speech disorders related to head and neck cancer treatment, and the rehabilitation thereof, are extensively studied in close collaboration with the ACLC.

The PhD project

Title: Automatic evaluation of voice and speech rehabilitation following treatment of head and neck cancers.

Abstract: The research project will study the use of existing Automatic Speech Recognition (ASR) applications to evaluate pathologic speech after treatment of head and neck cancers in a clinical setting. The aim is to obtain therapeutically meaningful measures of the speech quality of individual patients over the course of their treatment. Basic and applied research into the properties and pathologies of Tracheo-Esophageal (TE) speech following laryngectomy has a long history at the ACLC. The current project also includes the effects of other treatments, e.g. radio- and chemotherapy. The project could also contribute to a practical end goal where ASR systems could in the future be used to obtain objective information on speech quality, real-time during treatment and rehabilitation sessions. Such objective information is needed for evidence-based medical treatment and is currently lacking. Emphasis will be given to studying the relation between medical history, speech and voice acoustics, and specific ASR results for individual patients. Of special interest are word recognition errors that can be traced to specific phrasing, prosodic, and phoneme errors known to affect TE speakers. The candidate will study how pre-recorded patient materials can be evaluated using existing ASR applications and process the results. The candidate will collaborate with laboratories in Belgium and Germany.

Application and Procedure

You have to apply as a candidate. Please follow the Guidelines for applying for an internal PhD position 2010 (see below under Information).

Tasks

The PhD student needs to carry out the research and write a dissertation within the duration of the project (4 years at 80%, or 3.3 years full time).

Requirements

Educational background: logopedics, linguistics, or phonetics, with an affinity to speech pathology.
Experience: experience with speech technology and perception experiments is welcome.

Information

The following documents give precise information about the application procedure:

Project description "Automatic evaluation of voice and speech rehabilitation following treatment of head and neck cancers": http://www.hum.uva.nl/template/downloadAsset.cfm?objectid=6A975BC9-1321-B0BE-A4AEADEE9606E295

ACLC guidelines for application 2010: http://www.hum.uva.nl/template/downloadAsset.cfm?objectid=6A992ABD-1321-B0BE-A486D0A4B8D373EB

NB: incomplete applications will be automatically rejected, so please read the guidelines carefully.

Further information can be obtained from the intended supervisors of this project, Prof. Dr. Frans Hilgers, phone +31.20.512.2550, e-mail: f.hilgers@nki.nl, or Dr. Rob van Son, e-mail: R.J.J.H.vanSon@uva.nl, or from the managing director of the ACLC, Dr. Els Verheugd, phone +31.20.525.2543, e-mail: E.A.B.M.Verheugd-Daatzelaar@uva.nl. The original position can be found at the ACLC web site: http://www.hum.uva.nl/aclc/object.cfm/DBFA7FA8-14CF-4213-9DC6A7A4E11E9878/6A8B1555-1321-B0BE-A446C998FB2AC9E6

Position

The PhD student will be appointed for a period of 4 years (80%) or 3.3 years (full time) at the Faculty of Humanities of the University of Amsterdam, under the terms of employment currently valid for the Faculty. A contract will be given in the first instance for one year, with an extension for the following years on the basis of an evaluation of, amongst other things, a written piece of work. The salary (on a full-time basis) will be €2,042 gross per month during the first year, reaching €2,612 during the fourth year, in accordance with the CAO for Dutch universities.

Submissions

Applications should be sent before 22 February 2010 to aclc-fgw@uva.nl (or, in the case of a paper version, to the director of the ACLC, Prof. Dr P.C. Hengeveld, Spuistraat 210, 1012 VT Amsterdam). Applications received after this date, or incomplete applications, will not be taken into consideration.
Back to Top

8-5 . (2010-02-01) DGA opens a position in its language processing team (France)

The DGA (the French defence procurement agency) is opening a position in its language processing team.

* Position and missions:

In liaison with the scientific and industrial actors of automatic spoken and written language processing, and in order to meet the short- and long-term needs of defence operations, you will be in charge of designing, specifying, supervising and evaluating technological projects in the field.

To carry out these projects effectively, you will also perform active technology watch, coordination actions at national and international level, and study and software development work.

* Profile:

Experience in the field of automatic language processing, combined with project management skills, is sought. Command of English and experience of international relations are a plus.

An engineering degree from a grande école or an equivalent five-year university degree is required.

* Reference

http://cadres.apec.fr/offres-emploi-cadres/0_0_5_21430248W________offre-d-emploi-expert-traitement-du-langage-h-f.html

Applications may be sent either via APEC or directly.

Back to Top

8-6 . (2010-02-03) ASSISTANT/ASSOCIATE PROFESSOR POSITION IN MULTIMEDIA AT EURECOM

ASSISTANT/ASSOCIATE PROFESSOR POSITION IN MULTIMEDIA AT EURECOM

The Multimedia Communications Department of EURECOM invites applications for a faculty position at the Assistant/Associate Professor level. The new faculty member is expected to participate in teaching in our Master program and to develop a new research activity in Ambient Multimedia. We are especially interested in research directions which may extend our existing activities in audio and video analysis towards pioneering new approaches to interaction between people and their environment, in everyday life or professional situations, for better productivity, security, healthcare or entertainment.

Candidates must have a Ph.D. in computer science or electrical
engineering and between 5 and 10 years of research experience after PhD.
The ideal candidate will have an established research track record at
the international level, and a proven record of successful collaboration
with academic and industrial partners in national and European programs
or equivalent. A strong commitment to excellence in research is
mandatory. Exceptional candidates may be considered at the senior level.

Screening of applications will begin in January, 2010, and the search
will continue until the position is filled. Applicants should send, by
email, a letter of motivation, a resume including a list of their
publications, the names of 3 referees and a copy of their three most
important publications, to:
           mm_position@eurecom.fr
with the subject: ASSISTANT PROFESSOR POSITION IN MULTIMEDIA

EURECOM (http://www.eurecom.fr/) is a graduate school in communication
systems founded in 1992 by EPFL (Swiss Federal Institute of Technology,
Lausanne, http://www.epfl.ch/) and Telecom Paris Tech
(http://www.enst.fr/), together with several academic and industrial
partners. EURECOM's activity includes research and graduate teaching in
corporate, multimedia and mobile communications. EURECOM currently has a
faculty of 20 professors, 200 Master students and 60 PhD students.
EURECOM is involved in many European research projects and joint
collaborations with industry. EURECOM is located in Sophia-Antipolis, a
major European technology park for telecommunications research and
development in the French Riviera.

Back to Top

8-7 . (2010-02-08) Ircam recruits two Researchers W/M under an 18-month full-time limited-term contract, Paris

Ircam recruits two Researchers (W/M) under a full-time, 18-month limited-term contract, starting April 1st, 2010.

 
Introduction to IRCAM

IRCAM is a leading non-profit organization associated with the Centre Pompidou, dedicated to music production, R&D and education in acoustics and music. It hosts composers, researchers and students from many countries cooperating in contemporary music production and in scientific and applied research. The main topics addressed in its R&D department include acoustics, audio signal processing, computer music, interaction technologies and musicology. Ircam is located in the centre of Paris near the Centre Pompidou, at 1 Place Igor Stravinsky, 75004 Paris.

 
Introduction to Quaero project

Quaero is a 200 M€ collaborative research and development program focusing on the areas of automatic extraction of information, analysis, classification and usage of digital multimedia content for professionals and consumers. The research work shall concentrate on managing virtually unlimited quantities of multimedia and multilingual information, including text, speech, music, image and video. Five main application areas have been identified by the partners:

1. multimedia internet search
2. enhanced access services to audiovisual content on portals
3. personalized video selection and distribution
4. professional audiovisual asset management
5. digitalization and enrichment of library content, audiovisual cultural heritage and scientific information.

 The Quaero consortium was created to meet new multimedia content analysis requirements for consumers and professionals, faced with the explosion of accessible digital information and the proliferation of access means (PC,  TV, handheld devices). More information can be found at www.quaero.org/.

 Role of Ircam in Quaero Project

In the Quaero project, Ircam is in charge of the coordination of audio/music indexing research and of development of music-audio indexing technology: music content-description (tempo, rhythm, key, chord, singing-voice, and instrumentation description), automatic indexing (music genre/style, mood), music similarity, music audio summary, chorus detection and audio identification. A specificity of the project is the creation of a large-music-audio corpus in order to train and validate all the algorithms developed during the project.

 Position description

The researchers will be in charge of the development of the technologies related to:

· music-audio content description/extraction: tempo and beat/measure position estimation, key/mode, chord progression, instrument/drum identification, singing voice location, voice description
· automatic music indexing into music genre and music mood
· music similarity, especially on large-scale databases
· music structure discovery, automatic music audio summary generation, chorus location

The researchers will also collaborate with the evaluation team, which evaluates algorithm performance, and with the developer team.

Required profile

· Very high skills in audio signal processing (spectral analysis, audio-feature extraction, parameter estimation)
· High skills in audio indexing and data mining (statistical modelling, automatic feature selection algorithms, ...)
· High skills in large-database search algorithms
· Good knowledge of Linux, Windows and Mac OS environments
· High skills in Matlab programming, skills in C/C++ programming
· High productivity, methodical work, excellent programming style.

 
Salary

According to background and experience.

Applications

Please send an application letter together with your resume and any suitable information addressing the above issues, preferably by email, to: peeters_a_t_ircam dot fr, with cc to vinet_a_t_ircam dot fr, rod_a_t_ircam_dot_fr, roebel_at_ircam_dot_fr

 

 

 

Back to Top

8-8 . (2010-02-08) Professor positions at IFSIC, France

"Computer science problems involving randomness" (27 PR 1214)

Teaching profile:

The appointed professor will join the teaching staff of IFSIC and will teach at both Bachelor (Licence) and Master levels. The successful candidate will have a background in computer science and will be able to illustrate the contribution of probabilistic and statistical methods to several areas of the field.

Research profile:

Many computer science problems studied in the laboratory require probabilistic approaches (or mixed deterministic/probabilistic ones) and/or involve statistical aspects. We are looking for a professor whose research addresses such computer science questions involving randomness.

Research areas include probabilistic modeling, infrastructures (networks, quality of service) and the processing of digitized and numerical data (data mining, machine learning). Applications include, for example, image and sound.

"Computer science for home automation" (27 MCF 1069 - this position will be assigned to the Ecole Supérieure d'Ingénieurs de Rennes (ESIR))

Teaching profile:

The lecturer recruited for this position will be assigned to the computer science and telecommunications engineering curriculum of Rennes 1, teaching in particular in the Home Automation and Computer Science tracks. Depending on his/her profile, he/she will teach either the techniques used in home-automation computing infrastructures (e.g., networks, embedded systems, software architecture) or the techniques used in home-automation services (e.g., home care, voice control, security, energy management).

Research profile:

The lecturer recruited for this position may be assigned to an IRISA (UMR 6074) team specializing in the techniques used in home-automation computing infrastructures (see above), or to a team specializing in the techniques used in home-automation services (e.g., speech processing, human-computer interaction, data processing). He/she will be expected to collaborate with the home-automation specialists of IETR (UMR 6164).

Proven experience in teaching home automation or in applying computer science techniques to home automation is desired.

Back to Top

8-9 . (2010-02-09) Professor position (Computer Science, Dialogue, Speech, Text, Machine Learning), LIA, Université d'Avignon, France

Professor position in Computer Science, no. 232, at LIA (Université d'Avignon)
Title: Computer Science, Dialogue, Speech, Text, Machine Learning

Short description: the research profile of this position lies, ideally, at the confluence of three disciplines: Automatic Speech Recognition (ASR), Natural Language Processing (NLP) and Machine Learning (ML). Preference will be given to candidates whose research addresses high-level linguistic processing of spoken language, in particular in the context of spoken language understanding and machine translation applications. The application contexts envisaged are human-machine dialogue interfaces and the processing of large audio archives (broadcast data and call-centre archives).

The detailed position profile can be read at:
http://lia.univ-avignon.fr/fileadmin/documents/Users/Intranet/dossiers/Profil_Poste_Pr_232_2010_LIA_UAPV.pdf

NOTE: this is a rolling recruitment, not part of the synchronized national campaign. Applications must be submitted before 4 March 2010.

Back to Top

8-10 . (2010-02-11) The Faculty of Engineering at the University of Sheffield, UK, is recruiting for 5 'Prize Lectureships'

The Faculty of Engineering at the University of Sheffield, UK, is recruiting for 5 'Prize Lectureships': see http://www.jobs.ac.uk/job/AAP217/prize-lectureships-5-positions/

These posts may be at any grade from Lecturer (junior Faculty) to Reader (close to Professor), and they come with an attractive funding package for studentships and  research start-up.

If you are interested in applying for a prize lectureship to join SPandH, the Speech and Hearing group in Computer Science (http://www.dcs.shef.ac.uk/spandh/), please feel free to contact the SPandH academics:  contact details on the web site.

Phil Green, Roger Moore, Guy Brown, Jon Barker, Thomas Hain, Yoshi Gotoh. 

Back to Top

8-11 . (2010-03-04) Post doctoral Position in Speech Recognition (Grenoble, France)

Post doctoral Position in Speech Recognition (Grenoble, France)
Title: Application and Optimization of Speech Detection and Recognition Algorithms in Smart Homes
URLs: http://getalp.imag.fr/ http://sweet-home.imag.fr/
Start Date: October 2010
Duration and salary: 12 months, 1900 euros
Keywords: Speech Recognition, Home Automation, Smart Homes

Description: The GETALP Team of the Laboratory of Informatics of Grenoble invites applications for a full-time post-doctoral researcher to work on the SWEET-HOME ("Système Domotique d'Assistance au Domicile") national French project funded by the ANR ("Agence Nationale de la Recherche"). This project aims to deliver support for independent living to people who need it, such as elderly or disabled persons (e.g., Alzheimer's disease, cognitive deficiency, etc.). This support is usually provided through sensing technology (e.g., microphones, infra-red presence sensors, door contacts, etc.) which detects critical situations in order to trigger the appropriate action to assist the inhabitant (call to an emergency service, call to relatives, etc.). A few microphones are set up in an experimental apartment in order to recognize sounds and speech in real time. The recognition is challenging given that the speaker may be far from the microphone, and because of additive noise and reverberation; the position therefore requires significant experience in speech recognition. The project consortium is composed of the LIG (Joseph Fourier University), ESIGETEL, and the companies Theoris, Technosens and Camera-Contact. The experimental apartment DOMUS of the Carnot Institute of Grenoble will be used by the consortium during this project.
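As a toy illustration of the kind of acoustic front-end such a project relies on, the sketch below marks active sound/speech regions by frame log-energy thresholding (Python/NumPy). It is not part of the SWEET-HOME system; the function name, the threshold and all parameters are invented for the example, and a real far-field system would need noise adaptation and multi-microphone processing.

import numpy as np

def detect_activity(x, sr, frame_ms=25, hop_ms=10, threshold_db=-35.0):
    """Toy sound/speech activity detector: frame log-energy compared to a
    fixed threshold relative to the loudest frame."""
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n = 1 + (len(x) - frame) // hop
    energy = np.array([np.sum(x[i * hop:i * hop + frame] ** 2) for i in range(n)])
    log_e = 10.0 * np.log10(energy + 1e-12)
    active = log_e > (log_e.max() + threshold_db)
    # Collapse the frame decisions into (start, end) segments in seconds
    segments, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i * hop / sr
        elif not a and start is not None:
            segments.append((start, i * hop / sr))
            start = None
    if start is not None:
        segments.append((start, n * hop / sr))
    return segments

# Quick check: one second of faint noise with a louder burst in the middle
rng = np.random.default_rng(0)
x = 0.005 * rng.normal(size=16000)
x[6000:9000] += rng.normal(size=3000)
print(detect_activity(x, 16000))   # roughly [(0.36, 0.57)]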
Requirements: The successful candidate will have been awarded a PhD degree in computer science or signal processing, involving automatic speech recognition. Expertise in environmental robustness or independent component analysis (ICA) is a bonus, as is any other experience relevant to signal processing. The candidate will have a strong research track record with significant publications at leading international conferences or in journals. She/he will be highly motivated to undertake challenging applied research. A moderate level of French is required, as the project language will be French.
Applications: Please send to the address below (i) a one-page statement of your research interests and motivation, (ii) your CV and (iii) references, before 1 July 2010.

Back to Top

8-12 . (2010-03-05) Post-doctoral position: Acoustic to articulatory mapping of fricative sounds, Nancy, France

Acoustic to articulatory mapping of fricative sounds

Post-doctoral position

Nancy (France)

Environment

This subject deals with acoustic-to-articulatory mapping [Maeda et al. 2006], i.e. the recovery of the vocal tract shape from the speech signal, possibly supplemented by images of the speaker's face. This is one of the great challenges in the domain of automatic speech processing, and it has not yet received a satisfactory answer. The development of efficient algorithms would open new directions of research in the domains of second language learning, language acquisition and automatic speech recognition.

The objective is to develop inversion algorithms for fricative sounds. Numerical simulation models now exist for fricatives; their acoustics and dynamics are better known than those of stops, and they will be the first category of sounds to be inverted after vowels, for which the Speech group has already developed efficient algorithms.

The production of fricatives differs from that of vowels in two respects:

·       The vocal tract is not excited by the vibration of the vocal folds at the larynx but by a noise source. This noise originates in the turbulent airflow downstream of the constriction formed by the tongue and the palate.

·       Only the cavity downstream of the constriction is excited by the source.

The proposed approach is analysis-by-synthesis. This means that the signal, or the speech spectrum, is compared to a signal or a spectrum synthesized by means of a speech production model which incorporates two components: an articulatory model intended to approximate the geometry of the vocal tract, and an acoustical simulation intended to generate a spectrum or a signal from the vocal tract geometry and the noise source. The articulatory model is geometrically adapted to a speaker from MRI images and is used to build a table made up of pairs associating an articulatory vector with the corresponding acoustic image vector. During inversion, all the articulatory shapes whose acoustic parameters are close to those observed in the speech signal are recovered. Inversion is thus an advanced table lookup method, which we have used successfully for vowels [Ouni & Laprie 2005] [Potard et al. 2008].
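The table lookup step described above can be illustrated with a few lines of Python/NumPy. The codebook below is randomly generated and the articulatory-to-acoustic mapping is a made-up placeholder; only the retrieval logic reflects the method, everything else is a toy assumption.

import numpy as np

def invert_frame(acoustic_obs, codebook_acoustic, codebook_artic, eps=0.3):
    """Return every articulatory vector whose acoustic image lies within
    distance eps of the observed acoustic vector (toy table lookup)."""
    d = np.linalg.norm(codebook_acoustic - acoustic_obs, axis=1)
    return codebook_artic[d < eps]

# Toy codebook: random articulatory vectors and a made-up acoustic mapping
rng = np.random.default_rng(0)
artic = rng.uniform(-3.0, 3.0, size=(10000, 7))   # e.g. 7 model parameters
acoustic = np.tanh(artic[:, :3])                  # placeholder acoustic "image"
obs = np.array([0.2, -0.5, 0.8])
candidates = invert_frame(obs, acoustic, artic)
print("%d candidate vocal tract shapes" % len(candidates))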

Objectives

The success of an analysis-by-synthesis method relies on the implicit assumption that synthesis can correctly approximate the speech production process of the speaker whose speech is inverted. Fairly realistic acoustic simulations of fricative sounds exist, but they strongly depend on the precision of the geometrical approximation of the vocal tract used as input. Articulatory models of the vocal tract also exist which yield very good results for vowels. On the other hand, these models are inadequate for consonants, which often require very accurate articulation at the front part of the vocal tract. The first part of the work will concern the elaboration of articulatory models that are adapted to the production of both consonants and vowels. The validation will consist of driving the acoustic simulation from the geometry and assessing the quality of the synthetic speech signal with respect to the natural one. This work will be carried out on X-ray films for which the acoustic signal recorded during acquisition is of sufficiently good quality.

The second part of the work will address several aspects of the inversion strategy. Firstly, it is now accepted that spectral parameters implying a fairly marked smoothing and frequency integration have to be used, which is the case of MFCC (Mel Frequency Cepstral Coefficient) vectors. However, the spectral distance best suited to comparing natural and synthetic spectra remains to be investigated. Another solution consists in modeling the source so as to limit its impact on the computation of the spectral distance.
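As an illustration of one candidate distance, the following sketch computes MFCC vectors from two windowed frames with a hand-rolled mel filterbank and DCT, then returns their Euclidean distance (Python/NumPy). All sizes and frequency bounds are arbitrary choices for the example, and the default fmax assumes a sampling rate of at least 16 kHz.

import numpy as np

def mel_filterbank(n_filters, n_fft, sr, fmin=80.0, fmax=7600.0):
    """Triangular mel filterbank; rows are filters, columns rfft bins.
    Assumes fmax is below the Nyquist frequency sr/2."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    imel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    edges = imel(np.linspace(mel(fmin), mel(fmax), n_filters + 2))
    bins = np.floor((n_fft + 1) * edges / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fb

def mfcc(frame, sr, n_fft=512, n_filters=24, n_ceps=13):
    """MFCC vector of one frame: power spectrum -> log mel energies -> DCT-II."""
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n_fft)) ** 2
    logmel = np.log(mel_filterbank(n_filters, n_fft, sr) @ spec + 1e-10)
    k = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * k + 1) / (2 * n_filters))
    return dct @ logmel

def spectral_distance(frame_nat, frame_syn, sr):
    """One candidate distance: Euclidean distance between MFCC vectors."""
    return np.linalg.norm(mfcc(frame_nat, sr) - mfcc(frame_syn, sr))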

The second point concerns the construction of the articulatory table, which has to be revisited for two reasons: (i) only the cavity downstream of the constriction plays an acoustic role, and (ii) the location of the noise source is an additional parameter, but it depends on the other articulatory parameters. The third point concerns the way the phonetic context is taken into account. Indeed, the context is likely to provide important information about the vocal tract deformations before and after the fricative sound, and thus constraints for inversion.

A very complete software environment already exists in the Speech group for acoustic-to-articulatory inversion, which can be exploited by the post-doctoral student.

References

[Ouni & Laprie 2005] S. Ouni and Y. Laprie, "Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion," Journal of the Acoustical Society of America, Vol. 118, pp. 444-460, 2005.

[Potard et al. 2008] B. Potard, Y. Laprie and S. Ouni, "Incorporation of phonetic constraints in acoustic-to-articulatory inversion," Journal of the Acoustical Society of America, 123(4), pp. 2310-2323, 2008.

[Maeda et al. 2006] Technology inventory of audiovisual-to-articulatory inversion, http://aspi.loria.fr/Save/survey-1.pdf

 

Skills and profile

Knowledge of speech processing and articulatory modeling.

Supervision and contact:

Yves Laprie (Yves.Laprie@loria.fr)

Duration:

1 year (possibly extendable)

Additional information

The PhD should have been defended no more than a year before the recruitment date.

Back to Top

8-13 . (2010-03-12) Invitation to join the undergraduate team at the CLSP (Johns Hopkins U.) for the summer workshop


Undergraduate Team Members

The Center for Language and Speech Processing at the Johns Hopkins University is seeking outstanding members of the current junior class to participate in a summer workshop on language engineering from June 7th to July 30th, 2010.

No limitation is placed on the undergraduate major. Only enthusiasm for research, relevant skills, past academic and employment record, and the strength of letters of recommendation will be considered. Students of Biomedical Engineering, Computer Science, Cognitive Science, Electrical Engineering, Linguistics, Mathematics, Physics, Psychology, etc. may apply. Women and minorities are encouraged to apply. The workshop is open to both US and international students.

Participants will receive:
  • An opportunity to explore an exciting new area of research.
  • A two-week tutorial on speech and language technology.
  • Mentoring by an experienced researcher.
  • Use of a computer workstation throughout the workshop.
  • A $5000 stipend and $2520 towards per diem expenses.
  • Private furnished accommodation for the duration of the workshop.
  • Travel expenses to and from the workshop venue.
  • Participation in project planning activities.

The eight-week workshop provides a vigorously stimulating and enriching intellectual environment and we hope it will encourage students to eventually pursue graduate study in the field of human language technologies.

Click Here to Apply!

 

Selection Criteria

 

Four to eight undergraduate students will be selected for next summer's workshop. It is expected that they will be members of the current junior class. Applicants must be proficient in computer usage, including programming in C, C++, Perl or Python, and have exposure to basic probability or statistics. Knowledge of the following will be considered, but is not a prerequisite: Linguistics, Speech Communication, Natural Language Processing, Cognitive Science, Machine Learning, Digital Signal Processing, Signals and Systems, Linear Algebra, Data Structures, Foreign Languages, or MatLab or similar software.


 


 

Equal Opportunity Policy

The Johns Hopkins University admits students of any race, color, sex, religion, national or ethnic origin, age, disability or veteran status to all of the rights, privileges, programs, benefits and activities generally accorded or made available to students at the University. It does not discriminate on the basis of race, color, sex, religion, sexual orientation, national or ethnic origin, age, disability or veteran status in any student program or activity, including the administration of its educational policies, admission policies, scholarship and loan programs, and athletic and other University-administered programs or in employment. Accordingly, the University does not take into consideration personal factors that are irrelevant to the program involved.

Questions regarding access to programs following Title VI, Title IX, and Section 504 should be referred to the Office of Institutional Equity, 205 Garland Hall, (410) 516-8075.

 

Policy on the Reserve Officer Training Corps.

Present Department of Defense policy governing participation in university-based ROTC programs discriminates on the basis of sexual orientation. Such discrimination is inconsistent with the Johns Hopkins University non-discrimination policy. Because ROTC is a valuable component of the University that provides an opportunity for many students to afford a Hopkins education, to train for a career and to become positive forces in the military, the University, after careful study, has decided to continue the ROTC program and to encourage a change in federal policy that brings it into conformity with the University's policy.



 

Back to Top

8-14 . (2010-03-11) Post-doctoral position in speech coding for speech synthesis (Orange Labs, Lannion, France / University of Crete)

Post-doctoral position in speech coding for speech synthesis
A post-doctoral research position in the field of speech synthesis is open at France Telecom-Orange Labs in Lannion, France. This study will involve the design and implementation of new speech coding methods particularly suited for speech synthesis. The objective of this work is twofold: to propose new algorithms for compressing acoustic inventories in concatenative synthesis; to implement the building blocks for speech coding/decoding in the context of parametric synthesis (HMM-based).
This one-year post-doctoral contract is part of a collaboration between Orange Labs (France) and the University of Crete (Greece). Travel between the two sites should thus be expected, since the work will be carried out at both.
Required Skills:
Excellent knowledge of signal processing and speech coding;
Extensive experience with C, C++ programming;
Good familiarity with Linux and Windows development environments.
Knowledge of sinusoidal speech modelling and coding will be considered an advantage.
Salary: around 2300 € net per month depending on experience.
Closing date for applications: May 30th 2010.
Starting date: June/September 2010
Please send applications (CV+ 2 ref letters) or questions to:
Olivier Rosec
Tel: +33 2 96 05 20 67
olivier.rosec@orange-ftgroup.com
Yannis Stylianou
Tel: +30 2810 391713
styliano@ics.forth.gr
Back to Top

8-15 . (2010-03-11) Post-doctoral position in speech synthesis at Orange Labs, Lannion, France

Post-doctoral position in speech synthesis
A post-doctoral research position in the field of speech synthesis is open at Orange Labs in Lannion, France. This study will involve the design and implementation of a new hybrid speech synthesis system combining HMM-based synthesis and unit selection. The successful candidate will: first, develop a toolkit for training HMM models from the acoustic data available at Orange Labs; second, implement the acoustic parameter generation in the Orange Labs speech synthesizer; third, propose, design and implement a hybrid speech synthesis system combining selected and HMM-generated units.
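For readers unfamiliar with the unit-selection half of such a hybrid system, here is a minimal Viterbi-style selection sketch in Python/NumPy: it minimises a summed target cost (distance of each candidate unit to a target vector, which in a hybrid system could come from HMM-generated parameters) plus a concatenation cost at the joins. The function, features and costs are all invented for the illustration, not Orange Labs' system.

import numpy as np

def select_units(targets, candidates, wt=1.0, wc=1.0):
    """Viterbi unit selection: minimise summed target cost plus
    concatenation cost (feature mismatch at the joins)."""
    cost = [wt * np.linalg.norm(candidates[0] - targets[0], axis=1)]
    back = []
    for i in range(1, len(targets)):
        tc = wt * np.linalg.norm(candidates[i] - targets[i], axis=1)
        cc = wc * np.linalg.norm(candidates[i - 1][:, None, :] -
                                 candidates[i][None, :, :], axis=2)
        total = cost[-1][:, None] + cc        # shape: (units at i-1, units at i)
        back.append(np.argmin(total, axis=0))
        cost.append(np.min(total, axis=0) + tc)
    path = [int(np.argmin(cost[-1]))]         # backtrack the cheapest path
    for b in reversed(back):
        path.append(int(b[path[-1]]))
    return path[::-1]                         # one unit index per position

# Tiny demo: 4 target frames, 5 candidate units per position, 2-D features
rng = np.random.default_rng(0)
targets = [rng.normal(size=2) for _ in range(4)]
candidates = [rng.normal(size=(5, 2)) for _ in range(4)]
print(select_units(targets, candidates))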
Required Skills:
- PhD in computer science or electrical engineering
- Strong knowledge of machine learning (including HMMs)
- Extensive experience with C/C++ programming
- Knowledge of HTK/HTS is a plus
Salary: around 2300 € per month depending on experience.
Closing date: April 30th 2010.
Contact:
Cedric BOIDIN
Tel: +33 2 96 05 33 53
cedric.boidin@orange-ftgroup.com
Back to Top

8-16 . (2010-03-11) PhD opportunity in speech transformation (Orange Labs, Lannion, France / University of Crete)

PhD Opportunity in Speech Transformation
A full-time 3 year PhD position is available at France Telecom – Orange Labs in Lannion, France.
The position is within Orange Labs speech synthesis team and under academic supervision by Prof. Stylianou from Multimedia Informatics Laboratory at the University of Crete in Heraklion, Greece. Both labs conduct world class research in speech processing in areas like speech synthesis, speech transformation, voice conversion and speech coding.
Starting date: September 2010/January 2011
Application dates: March 30th 2010/October 30th 2010
Research fields: Speech processing, speech synthesis, pattern recognition, statistical signal processing, machine learning.
Project Description:
Speech transformation refers to the various modifications one may apply to the sound produced by a person speaking or singing. It covers a wide area of research, from speech production modeling and understanding to the perception of speech, and from natural language processing and the modeling and control of speaking style to pattern recognition and statistical signal processing. Speech transformation has many potential applications in areas like entertainment, the film and music industries, toys, chat rooms and games, dialogue systems, security and speaker individuality for interpreting telephony, high-end hearing aids, vocal pathology and voice restoration.
In speech transformation, the majority of work is dedicated to pitch modification as well as to timbre transformation. Many techniques have been suggested in the literature, among them methods based on PSOLA, sinusoidal modeling, the Harmonic plus Noise Model, the phase vocoder and STRAIGHT. These methods yield high quality for moderate pitch modifications and for well-mastered spectral envelope modifications. For more sophisticated transformations, the output speech cannot be considered natural.
During this thesis, we will focus on the re-definition of pitch and timbre modification in order to develop a high-quality speech modification system. This will be designed and developed in the context of a quasi-harmonic speech representation which was recently proposed for high-quality speech analysis and synthesis.
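To make concrete why naive pitch modification degrades quality, the sketch below shifts pitch by time-stretching with plain overlap-add and then resampling (Python/NumPy). Unlike PSOLA or the quasi-harmonic approach mentioned above, it transposes the whole spectral envelope, formants included, which is exactly the kind of artefact this thesis aims to overcome; all parameter values are arbitrary.

import numpy as np

def ola_stretch(x, factor, frame=1024, hop=256):
    """Naive overlap-add time stretch (factor > 1 lengthens the signal)."""
    win = np.hanning(frame)
    n_out = int(len(x) * factor)
    y = np.zeros(n_out + frame)
    norm = np.zeros(n_out + frame)
    pos = 0
    while int(pos / factor) + frame < len(x):
        i = int(pos / factor)                  # read position in the input
        y[pos:pos + frame] += win * x[i:i + frame]
        norm[pos:pos + frame] += win
        pos += hop
    return y[:n_out] / np.maximum(norm[:n_out], 1e-6)

def naive_pitch_shift(x, semitones):
    """Pitch shift = time-stretch, then resample back to the original length.
    The whole spectral envelope (formants included) is transposed, which is
    precisely the artefact PSOLA-like methods are designed to avoid."""
    ratio = 2.0 ** (semitones / 12.0)
    stretched = ola_stretch(x, ratio)
    idx = np.arange(len(x)) * ratio            # fractional read positions
    idx = idx[idx < len(stretched) - 1]
    lo = np.floor(idx).astype(int)
    frac = idx - lo
    return (1.0 - frac) * stretched[lo] + frac * stretched[lo + 1]

# Quick check: shift a 220 Hz tone up four semitones (~277 Hz)
sr = 16000
t = np.arange(sr) / sr
y = naive_pitch_shift(np.sin(2 * np.pi * 220.0 * t), 4)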
Salary: around 1700 € net per month.
Please send applications (CV+ 2 ref letters) or questions to:
Yannis Stylianou
Tel: +30 2810 391713
styliano@ics.forth.gr
Olivier Rosec
Tel: +33 2 96 05 20 67
olivier.rosec@orange-ftgroup.com
Back to Top

8-17 . (2010-03-19) Post-doctoral position in speech recognition at INRIA Nancy Grand Est / LORIA

Title : Bayesian networks for modeling and handling variability sources in speech recognition

 

- Location: INRIA Nancy Grand Est research center --- LORIA Laboratory, NANCY, France

 

- Project-team: PAROLE

Contact: Denis Jouvet  (denis.jouvet@loria.fr)

 

                In state-of-the-art speech recognition systems, Hidden Markov Models (HMM) are used to model the acoustic realization of the sounds. The decoding process compares the unknown speech signal to sequences of these acoustic models to find the best matching sequence, which determines the recognized words. Lexical and grammatical constraints are taken into account during the decoding process; they limit the number of model sequences considered in the comparisons, which nevertheless remains very large. Hence, precise acoustic models are necessary for achieving good speech recognition performance. To obtain reliable parameters, the HMM-based acoustic models are trained on very large speech corpora. However, speech recognition performance depends heavily on the acoustic environment: good performance is achieved when the acoustic environment matches that of the training data, and performance degrades as the acoustic environment diverges from it.

                The acoustic environment depends on many variability sources which affect the acoustic signal. These include the speaker's gender (male/female), individual speaker characteristics, speech loudness, speaking rate, the microphone, the transmission channel and, of course, noise, to name only a few [Benzeghiba et al, 2007]. Using a training corpus which exhibits too many different variability sources (for example, many different noise levels or too many different channel speech coding schemes) makes the acoustic models less discriminative, and thus lowers speech recognition performance. Conversely, having many sets of acoustic models, each dedicated to a specific environment condition, raises training problems. Indeed, because each training subset is restricted to a specific environment condition, its size gets much smaller, and consequently it might be impossible to reliably train some parameters of the acoustic models associated with this environment condition.

                In recent years, Dynamic Bayesian Networks (DBN) have been applied to speech recognition. In such an approach, certain model parameters are made dependent on auxiliary features, such as articulatory information [Stephenson et al., 2000], pitch and energy [Stephenson et al. 2004], speaking rate [Shinozaki & Furui, 2003] or some hidden factor related to a clustering of the training speech data [Korkmazsky et al., 2004]. The approach has also been investigated for multiband speech recognition and non-native speech recognition, as well as for taking estimates of speaker classes into account in continuous speech recognition [Cloarec & Jouvet, 2008]. Although the above experiments were conducted on limited-vocabulary tasks, they showed that Dynamic Bayesian Networks provide a way of handling some variability sources in the acoustic modeling.

 

The objective of the work is to further investigate the application of Dynamic Bayesian Networks (DBN) to large-vocabulary continuous speech recognition. The aim is to estimate the current acoustic environment condition dynamically, and to constrain the acoustic space used during decoding accordingly. The underlying idea is to be able to handle various ranges of acoustic space constraints during decoding. Hence, when the estimate of the acoustic environment condition is reliable, the corresponding condition-specific constraints can be used (leading, for example, to model parameters associated with a class of very similar speakers in a given environment). Conversely, when the estimate is less reliable, more tolerant constraints should be used (leading, for example, to model parameters associated with a broader class of speakers or with several environment conditions).

                Within the formalism of Dynamic Bayesian Networks, the work to be carried out is the following. The first aspect concerns the optimization of the classification of the training data, and associated methods for automatically estimating the classes that best match unknown test data. The second aspect involves the development of confidence measures associated with the classification of test sentences, and the integration of these confidence measures into the DBN modeling (in order to constrain the acoustic space for decoding more or less tightly, according to the reliability of the environment condition estimate).
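The back-off idea, choosing condition-specific models only when the environment estimate is confident, can be sketched outside the DBN formalism with a toy Gaussian classifier (Python/NumPy). The class names, the threshold and the single-Gaussian models are invented for the illustration; the actual work would integrate such confidence measures directly into the DBN.

import numpy as np

class ConditionSelector:
    """Toy back-off: use a condition-specific model set when the environment
    estimate is confident, a broader one otherwise. Each condition is modelled
    here by a single diagonal Gaussian over some acoustic feature vector."""

    def __init__(self, features_by_class):
        self.means = {c: f.mean(axis=0) for c, f in features_by_class.items()}
        self.vars_ = {c: f.var(axis=0) + 1e-6 for c, f in features_by_class.items()}

    def log_likelihood(self, x, c):
        m, v = self.means[c], self.vars_[c]
        return -0.5 * np.sum(np.log(2.0 * np.pi * v) + (x - m) ** 2 / v)

    def select(self, x, broad_class="all_conditions", threshold=0.7):
        lls = {c: self.log_likelihood(x, c) for c in self.means}
        mx = max(lls.values())
        post = {c: np.exp(ll - mx) for c, ll in lls.items()}  # unnormalised
        best = max(post, key=post.get)
        confidence = post[best] / sum(post.values())
        if confidence < threshold:             # unreliable estimate: back off
            return broad_class, confidence
        return best, confidence

# Demo with two synthetic "environment conditions"
rng = np.random.default_rng(0)
sel = ConditionSelector({"quiet_male": rng.normal(0, 1, (500, 10)),
                         "noisy_female": rng.normal(3, 2, (500, 10))})
print(sel.select(rng.normal(0, 1, 10)))        # confident: specific class
print(sel.select(rng.normal(1.5, 2, 10)))      # ambiguous: may back off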

 

 

Back to Top

8-18 . (2010-03-24) Post-docs in speech synthesis and speech coding at Orange Labs, Lannion, France

Post-doctoral position in speech synthesis

A post-doctoral research position in the field of speech synthesis is open at Orange Labs, Lannion, France. The study will concern the design and implementation of a hybrid synthesis system combining unit selection and HMM-based synthesis. The work will first consist in developing a set of tools for training HMM models from Orange Labs' acoustic databases; second, in implementing the acoustic parameter generation functions in the Orange Labs real-time synthesizer; and third, in designing and implementing a hybrid system combining selected units and units generated from HMM models.

Required skills:
- PhD in computer science
- Very good knowledge of machine learning (in particular HMMs)
- Very good command of C/C++ programming
- Knowledge of HTK/HTS is a plus

Salary: around 2300 € net per month, depending on experience.
Closing date: 30 April 2010.
Contact:
Cedric BOIDIN
Tel: +33 2 96 05 33 53
cedric.boidin@orange-ftgroup.com

Post-doctoral position in speech coding for speech synthesis

A post-doctoral research position in the field of speech synthesis is open at Orange Labs, Lannion, France. The study will concern the design and implementation of new speech coding methods suited to speech synthesis. The goals are, on the one hand, to propose new algorithms for compressing acoustic inventories for concatenative synthesis and, on the other hand, to implement speech coding/decoding building blocks dedicated to parametric (HMM-based) synthesis. This one-year post-doctoral contract is part of a close collaboration between Orange Labs and the University of Crete; travel between the two sites should therefore be expected.

Required skills:
- Excellent knowledge of signal processing and speech coding
- Very good command of C and C++ programming
- Good command of Linux and Windows development environments

Salary: around 2300 € net per month, depending on experience.
Closing date: 30 June 2010.
Contacts:
Olivier Rosec
Tel: +33 2 96 05 20 67
olivier.rosec@orange-ftgroup.com
Yannis Stylianou
Tel: +30 2810 391713
styliano@ics.forth.gr
Back to Top

8-19 . (2010-03-25) PhD grant at INRIA Loria Nancy France

PhD Thesis position at INRIA Nancy

Motivations


Through a collaboration with a company which sells documentary rushes, we are interested in indexing these rushes using automatic recognition of the dialogues they contain.

The Speech team has developed a system for automatic transcription of broadcast news: ANTS [2, 3].

Automatic transcription systems are now reliable for transcribing read or "prepared" speech such as broadcast news, but their performance decreases on spontaneously uttered speech [1, 4, 5]. Spontaneous speech is characterized by:
- speech disfluencies (filled pauses, repetitions, repairs, false starts and partial words),
- pronunciation variants such as word and syllable contractions (/want to/ > /wanna/),
- speaking rate variations (reduced articulation of some phonemes and lengthening of others),
- live environments (laughs, applause) and simultaneous speech.
In addition to disfluencies, spontaneous speech is characterized by ungrammatical sentences and a language register which is difficult to model because of the small amount of available transcribed data. Processing spontaneous speech is therefore one of the challenges of Automatic Speech Recognition (ASR).

Subject
The purpose of this thesis is to take into account the specific phenomena related to spontaneous speech, such as hesitations, pauses and false starts, in order to improve the recognition rate [4, 6, 7]. To do this, it will be necessary to model these specific phenomena.

We have a speech corpus  in which these events were
labeled. This corpus will be used to select parameters, estimate models and evaluate the results.
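One elementary way to exploit such a labeled corpus is to keep filled-pause labels as ordinary tokens when estimating the language model, so that the model learns where hesitations typically occur. The sketch below trains a toy add-one-smoothed bigram model in plain Python; the corpus, the '<euh>' token and the function are invented for the example and are far simpler than what the thesis would involve.

from collections import defaultdict

def train_bigram_lm(sentences, vocab):
    """Toy add-one-smoothed bigram LM in which filled pauses are kept as
    ordinary tokens (here '<euh>'), so the model learns where they occur."""
    counts = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for a, b in zip(tokens, tokens[1:]):
            counts[a][b] += 1
    size = len(vocab) + 2                      # plus the sentence markers
    def prob(a, b):
        return (counts[a][b] + 1) / (sum(counts[a].values()) + size)
    return prob

corpus = [["je", "<euh>", "voudrais", "un", "billet"],
          ["je", "voudrais", "<euh>", "deux", "billets"]]
vocab = {w for s in corpus for w in s}
p = train_bigram_lm(corpus, vocab)
print(p("je", "<euh>"), p("je", "voudrais"))   # hesitation is a modelled event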

Scope of Work
The work will be done within the Speech team of Inria-Loria.
The student will use the software ANTS for automatic speech recognition developed by the team.

Profile of candidate
The applicants for this PhD position should be fluent in English or in French. Competence in French is optional, though applicants will be encouraged to acquire this skill during training.
Strong software skills are required, especially Unix/linux, C, Java, and a scripting language such as Perl or Python.

Contact:
fohr@loria.fr or illina@loria.fr or mella@loria.fr

[1] S. Galliano, E. Geoffrois, D. Mostefa, K. Choukri, J.-F. Bonastre and G. Gravier, "The ESTER Phase II evaluation campaign for the rich transcription of French broadcast news," EUROSPEECH 2005.
[2] I. Illina, D. Fohr, O. Mella and C. Cerisara, "The Automatic News Transcription System: ANTS, some real-time experiments," ICSLP 2004.
[3] D. Fohr, O. Mella, I. Illina and C. Cerisara, "Experiments on the accuracy of phone models and liaison processing in a French broadcast news transcription system," ICSLP 2004.
[4] J.-L. Gauvain, G. Adda, L. Lamel, F. Lefevre and H. Schwenk, "Transcription de la parole conversationnelle," Revue TAL, vol. 45, no. 3.
[5] M. Garnier-Rizet, G. Adda, F. Cailliau, J.-L. Gauvain, S. Guillemin-Lanne, L. Lamel, S. Vanni and C. Waaste-Richard, "CallSurf: Automatic transcription, indexing and structuration of call center conversational speech for knowledge extraction and query by content," LREC 2008.
[6] J. Ogata and M. Goto, "The use of acoustically detected filled and silent pauses in spontaneous speech recognition," ICASSP 2009.
[7] F. Stouten, J. Duchateau, J.-P. Martens and P. Wambacq, "Coping with disfluencies in spontaneous speech recognition: Acoustic detection and linguistic context manipulation," Speech Communication, vol. 48, 2006.

Back to Top

8-20 . (2010-04-08) Post-Doctoral Position at EURECOM, Sophia Antipolis, France

Post-Doctoral Position at EURECOM, Sophia Antipolis, France

Title:      Adaptable speech activated robot-interface for the elderly
Department: Multimedia Communications
URL:        http://www.eurecom.fr/mm.en.htm

Start date: Early summer 2010
Duration:   18 months

Description: EURECOM's Multimedia Communications Department invites applications for a full-time, 18-month post-doctoral position related to a project recently awarded under the EU Ambient Assisted Living joint research and development funding programme. The Adaptable Ambient LIving ASsistant (ALIAS) project aims to develop a mobile robot system that interacts with elderly users, monitors and provides cognitive assistance in daily life, and promotes social inclusion by creating connections to people and events in the wider world. One of ALIAS's goals is the development of an adaptable speech interface, which is the focus of this research position. It requires the development of speaker diarization, localization and speech recognition systems in order to identify and track users, in addition to speech synthesis and recognition to communicate and recognize spoken commands. All of these technologies will be integrated into a dedicated dialogue manager.

Requirements: The successful candidate will have been awarded a PhD degree in a relevant field of speech processing prior to their joining Eurecom.  You will have a strong research track record with significant publications at leading international conferences and/or in journals. Experience of collaborative research and development projects at the European level is desirable.  You will be highly motivated to undertake challenging, applied research and have excellent English language speaking and writing skills.   French language skills are a bonus.

Applications: Please send to the address below (i) a one page statement of research interests and motivation, (ii) your CV and (iii) contact details for two referees (preferably one from your PhD or most recent research supervisor) before 31st May 2010.

Contact:        Dr Nicholas Evans
Postal address: 2229 Route des Crêtes BP 193,
                F-06904 Sophia Antipolis cedex, France
Email address:  nicholas.evans@eurecom.fr
Web address:    http://www.eurecom.fr/main/institute/job.en.htm
Phone:          +33/0 4 93 00 81 14
Fax:            +33/0 4 93 00 82 00

EURECOM is located in Sophia Antipolis, a vibrant science park on the French Riviera. It is in close proximity with a large number of research units of leading multi-national corporations in the telecommunications, semiconductor and biotechnology sectors, as well as other outstanding research and teaching institutions. A freethinking, multinational population and the unique geographic location provide a quality of life without equal.

Back to Top

8-21 . (2010-04-20) 2year post doc at Telecom Paris

2-year Post-Doc/Research position in Audio and Multimedia scene analysis using multiple sensors

Deadline for applications: 30/09/2010
More information: http://www.tsi.telecom-paristech.fr/en/open-positions-phd-thesis-internships/

Place: TELECOM ParisTech (ENST), Paris, France (http://www.telecom-paristech.fr/)
Duration: 2 years (1 year, renewable for a second year)
Start: any date from September 1st, 2010
Salary: according to background and experience

Position description

The position is supported by the European Network of Excellence project "3Dlife", which aims to integrate research conducted within Europe in the field of Media Internet. In this framework, the research conducted encompasses all aspects of the analysis/synthesis of 3D audiovisual content for 3D model animation and the creation of virtual humans and virtual environments. The role of the Post-Doc/researcher will consist, on the one hand, in participating in the network's collaborative integration activities, and on the other hand, in conducting forefront research in the domain of audio and multimedia scene analysis using multiple sensors. For one of the use cases envisaged (multimedia dance scene analysis), the signals are of different natures (music, video and potentially also electrical sensor output signals) and are captured by multiple sensors of potentially variable quality. A specific interest will be devoted to the development of innovative statistical fusion approaches capable of processing information on multiple semantic levels (from low-level features to high-level musical or video concepts). Machine learning methods such as Bayesian networks, support vector machines, Markov and semi-Markov models, or boosting are amongst the statistical frameworks of particular interest for this research.

Candidate profile

As minimum requirements, the candidate will have:
- a PhD in audio or multimedia signal processing, speech processing, statistics, machine learning, computer science, electrical engineering, or a related discipline;
- some knowledge of audio signal processing;
- programming skills, in particular in Matlab (knowledge of Python would be a plus).

The ideal candidate would also have:
- solid knowledge of machine learning techniques, in particular classification, temporal sequence segmentation and multi-sensor information fusion;
- the ability to take on research project management responsibilities and to work in a multi-partner, international collaborative environment;
- strong communication skills in English.

Contacts

Interested applicants may contact Gaël Richard or Slim Essid for more information, or directly email a candidacy letter including a Curriculum Vitae, a list of publications and a statement of research interests.
- Gaël Richard (firstname.lastname@telecom-paristech.fr); +33 1 45 81 73 65
- Slim Essid (firstname.lastname@telecom-paristech.fr)

More info on 3Dlife at: http://www.3dlife-noe.eu/

Back to Top

8-22 . (2010-04-21) Post doc at LIMSI Paris

Post-doctoral position at LIMSI – Audio & Acoustic Group
 
The Audio & Acoustic group at LIMSI (http://www.limsi.fr/Scientifique/aa/) is currently recruiting for a 1-year postdoctoral research position on a CNRS grant. Two research subjects are available for this single position. Candidates should send a CV, a letter of motivation for the selected topic, and at least 2 references to Christophe d'Alessandro: cda@limsi.fr. Letters of motivation should describe previous experience, relevance, and specific interests related to the details of the project. These documents should be received by the end of May. Notification should be made by the end of June. The selected candidate should be available to start between 1 July and 1 October.
 
 
 
Research Subject 1- A study on expressive prosody and voice quality using a gesturally driven voice synthesizer.
 
The modeling and analysis of expressive prosody raise many problems, both on the perception side and on the signal measurement side. The analysis of voice source quality changes in expressive speech, in particular, faces the limitations of inversion procedures. The Audio & Acoustic group at LIMSI has developed a real-time version of the CALM voice source synthesizer (Doval et al., 2003; d'Alessandro et al., 2006; Le Beux, 2009), mapped onto several gestural controllers (graphic tablet, joystick, cyber glove...). This device constitutes a powerful tool for the analysis of expressive voice quality in an analysis-by-synthesis paradigm.
 
Hand gestures have been proven adequate for controlling prosodic variations (d'Alessandro et al., 2007). By playing this speech synthesizer like a musical instrument through a gestural device, a user is able to generate language-specific interjections based on vocalic expressive non-words. Such non-words are meaningful in a given language and culture and convey strong cues to meaning during spoken interaction (see Wierzbicka, 1992, and also Contini, 1989, for Sardinian, or Campbell, 2007, for Japanese).
 
The proposed project aims at acquiring data from such gesturally driven speech production. The analysis of the synthesizer's parameters in the light of perception test results may then help to gain a better understanding of the use of voice quality variations in expressive speech. The different stages of the project (gestural production of expressive speech, subjective evaluation of the productions, modeling of the acoustic parameters of the voice) require different skills, and the successful candidate will be able to focus on parts of the project according to his/her own research interests. This project may also be extended towards the use and evaluation of an immersive 3D expressive speech synthesizer (see Research Subject 2).
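To fix ideas about gesturally driven voice synthesis, here is a deliberately crude Python/NumPy/SciPy sketch that maps two gesture axes to pitch and vocal effort, and excites fixed /a/-like formant resonators with a low-passed impulse train. It is a stand-in, not the CALM model: the mapping, the source shape and the formant values are all invented for the illustration.

import numpy as np
from scipy.signal import lfilter

def resonator(f, bw, sr):
    """Two-pole resonator coefficients: a crude formant filter."""
    r = np.exp(-np.pi * bw / sr)
    theta = 2.0 * np.pi * f / sr
    return [1.0 - r], [1.0, -2.0 * r * np.cos(theta), r * r]

def voice_from_gesture(gx, gy, dur=1.0, sr=16000):
    """Hypothetical gesture mapping: gx (0..1) -> pitch, gy (0..1) -> vocal
    effort, rendered here as the low-pass slope of an impulse-train source."""
    f0 = 100.0 * 2.0 ** (2.0 * gx)             # 100..400 Hz
    n = int(dur * sr)
    phase = np.cumsum(np.full(n, f0 / sr)) % 1.0
    pulses = (np.diff(phase, prepend=0.0) < 0.0).astype(float)
    alpha = 0.99 - 0.5 * gy                    # softer voice = darker source
    source = lfilter([1.0 - alpha], [1.0, -alpha], pulses)
    out = np.zeros(n)
    for f, bw in [(700, 130), (1220, 70), (2600, 160)]:   # /a/-like formants
        b, a = resonator(f, bw, sr)
        out += lfilter(b, a, source)
    return out / (np.abs(out).max() + 1e-9)

y = voice_from_gesture(0.5, 0.3)               # ~200 Hz, moderate effort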
 
 
 
The successful candidate will have a PhD in phonetics, language sciences, psycholinguistics or any related field (with a strong emphasis on speech prosody analysis), and/or a PhD in signal processing or natural language processing (with a good knowledge of acoustic voice analysis). Musical training/practice would be an advantage.
 
 
 
References:
 
d'Alessandro, C., Rilliard, A. & Le Beux, S. (2007). Computerized chironomy: evaluation of hand-controlled intonation reiteration. INTERSPEECH 2007, Antwerp, Belgium, 1270-1273.

d'Alessandro, N., d'Alessandro, C., Le Beux, S. & Doval, B. (2006). Real-time CALM synthesizer: new approaches in hands-controlled voice synthesis. New Interfaces for Musical Expression, Paris, France, 266-271.

Contini, M. (1989). L'interjection en Sarde. Une approche linguistique. In Espaces Romans. Études de dialectologie et de géolinguistique offertes à Gaston Tuaillon, Volume 2, ELLUG: Grenoble, 320-329.

Campbell, N. (2007). The role and use of speech gestures in discourse. Archives of Acoustics, 32(4), 803-814.

Doval, B., d'Alessandro, C. & Henrich, N. (2003). The voice source as a causal/anticausal linear filter. VOQUAL'03 - Voice Quality: functions, analysis and synthesis, Geneva, Switzerland.

Le Beux, S. (2009). Contrôle gestuel de la prosodie et de la qualité vocale. PhD thesis, Université Paris Sud/LIMSI, Orsay, France.

Wierzbicka, A. (1992). The semantics of interjection. Journal of Pragmatics, 18, 159-192.
 
 
 
 
 
Research Subject 2- Vocal directivity, real and virtual, study and practice
 
Directivity of the human voice has been the topic of recent research efforts. This 1-year post-doc position concerns the development and combination of recent efforts at LIMSI pertaining to the measurement and understanding of vocal directivity, and its integration into an immersive virtual environment. Preliminary studies have recently shown that vocal directivity patterns vary significantly between phonemes. Relying predominantly on several recently acquired databases of sung and spoken voice directivity, a detailed analysis shall be carried out. Two branches of research are then open. First, it is hoped that a numerical model could be employed (using Boundary Element Method modeling) to validate any conclusions on the physical basis of the directivity variations. A second direction of the proposed project concerns the incorporation of voice directivity patterns into an immersive virtual environment text-to-speech simulator. These variations have also been found to be perceptible, and there is interest in studying to what degree they matter to the perceived quality of immersive environments. As such, the work will include implementation of the effect, as well as subjective evaluations. This integration can take the form of a text-to-speech synthesizer or an expressive voice synthesizer (see Research Subject 1). The proportion of effort allocated to these two aspects of the project will depend on the skills and interests of the chosen candidate.
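A minimal sketch of how per-phoneme directivity might be applied in a renderer: interpolate a gain pattern at the listener's azimuth and scale the phoneme's signal accordingly (Python/NumPy). The tables below are fabricated placeholders for the example, not LIMSI measurements.

import numpy as np

# Fabricated per-phoneme directivity tables: gain in dB at measurement
# azimuths (degrees, 0 = in front of the talker). Real patterns would come
# from measured databases such as those mentioned above.
AZIMUTHS = np.arange(0, 361, 30)
PATTERNS = {
    "a": np.array([0, -1, -2, -4, -7, -10, -12, -10, -7, -4, -2, -1, 0]),
    "s": np.array([0, -2, -5, -9, -14, -18, -20, -18, -14, -9, -5, -2, 0]),
}

def directivity_gain(phoneme, azimuth_deg):
    """Linear gain for a listener at the given azimuth, by interpolating
    the phoneme's pattern around the circle."""
    db = np.interp(azimuth_deg % 360.0, AZIMUTHS, PATTERNS[phoneme])
    return 10.0 ** (db / 20.0)

print(directivity_gain("s", 95.0))             # fricative, nearly side-on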
 
 
 
A PhD in acoustics, audio signal processing, or a similar field is required, as is familiarity with measurement and analysis procedures. Familiarity with general software tools such as MatLab, with real-time processing software such as Max/MSP or PureData, and with BEM software is a benefit. Candidates should be highly motivated, and capable of working both independently and in a multi-disciplinary group environment.
 
 
 
References:
 
Katz, Brian F.G. & d'Alessandro, Christophe,  "Directivity Measurements of the Singing Voice." Proceedings of the 19th International Congress on Acoustics (ICA'2007), Madrid, 2-7 September 2007.
 
Brian F.G. Katz, Fabien Prezat, and Christophe d'Alessandro, "Human voice phoneme directivity pattern measurements." Fourth Joint Meeting: ASA and ASJ, Honolulu, November 2006, J. Acoust. Soc. Am., Vol. 120(5), Pt. 2, November 2006.
 
Martin, J.-C., d'Alessandro, C., Jacquemin, C., Katz, B., Max, A., Pointal, L. and Rilliard, A., "3D Audiovisual Rendering and Real-Time Interactive Control of Expressivity in a Talking Head." Proceedings of the 7th
 
International Conference on Intelligent Virtual Agents (IVA'2007), Paris, France September 17-19, 2007.
 
Martin, J.-C., Jacquemin, C., Pointal, L., Katz, B., d'Alessandro, C., Max, A. and Courgeon, M., "A 3D Audio-Visual Animated Agent for Expressive Conversational Question Answering." International Conference on Auditory-Visual Speech Processing (AVSP'2007). Eds. J. Vroomen, M. Swerts, E. Krahmer. Hilvarenbeek, The Netherlands August 31 - September 3, 2007.
 
Martin, J.-C.; D'Alessandro, C.; Jacquemin, C.; Katz, B.F.G.; Max, A.; Pointal, L.; Rilliard, A., "3D audiovisual rendering and real-time interactive control of expressivity in a Talking Head." IVA 2007. 7th International Conference on Intelligent Virtual Agents, Paris, 17-19 September 2007.
 
N. Misdariis, A. Lang, B. Katz and P. Susini, "Perceptual effects of radiation control with a multi-loudspeaker device." Proceedings of the 155th ASA, 5th Forum Acusticum, & 2nd ASA-EAA Joint Conference, Paris, 29 Jun - 6 Jul 2008.
 
Markus Noisternig, Brian FG Katz, Samuel Siltanen, and Lauri Savioja, "Framework for Real-Time Auralization in Architectural Acoustics." Journal of Acta Acustica united with Acoustica, Vol. 94 (2008), pp. 1000-1015, doi 10.3813/aaa.918116
 
M. Noisternig, L. Savioja and B. Katz, "Real-time auralization system based on beam-tracing and mixed-order Ambisonics." Proceedings of the 155th ASA, 5th Forum Acusticum, & 2nd ASA-EAA Joint Conference, Paris, 29 Jun - 6 Jul 2008.
 
M. Noisternig, B. Katz and C. d'Alessandro, "Spatial rendering of audio-visual synthetic speech use for immersive environments." Proceedings of the 155th ASA, 5th Forum Acusticum, & 2nd ASA-EAA Joint Conference, Paris, 29 Jun - 6 Jul 2008.
 
 
 
LIMSI is located approximately 30 minutes south of Paris by commuter train (RER B). The laboratory accommodates approximately 120 permanent personnel (researchers, professors and assistant professors, engineers, technicians) and about sixty PhD candidates. It undertakes multidisciplinary research in Mechanical and Chemical Engineering and in Sciences and Technologies for Information and Communication. The research fields cover a wide disciplinary spectrum from thermodynamics to cognition, encompassing fluid mechanics, energetics, acoustics and voice synthesis, spoken language and text processing, vision, virtual reality...
Back to Top

8-23 . (2010-04-21) Professor at the University of Amsterdam (The Netherlands)

Faculty of Humanities
The Faculty of Humanities provides education and conducts research with a strongly international profile in a large number of disciplines in the field of language and culture. Located in the heart of Amsterdam, the Faculty maintains close ties with many cultural institutes in the capital city. There are almost 1,000 employees affiliated with the Faculty, which has about 7,500 students.
The Department of Dutch Studies currently has a vacancy for a professor in
Speech Communication
1.0 FTE
Job description
The chair in Speech Communication is charged with teaching and research in the broad field of speech communication, which also includes argumentation and rhetoric in institutional contexts. The Faculty of Humanities consists of six departments: History, Archaeology and Area Studies; Art, Religion and Cultural Sciences; Media Studies; Dutch Studies; Language and Literature; and Philosophy. Each department is made up of sections comprising one or more professors and a number of other academic staff working in the relevant field.
The chair in Speech Communication is part of the Speech Communication, Argumentation Theory and Rhetoric section in the Department of Dutch Studies. The Department further comprises sections of the Dutch Literature and the Dutch Linguistics. At present, the section of Speech Communication, Argumentation Theory and Rhetoric has a staff of more than 12 full-time equivalent positions (FTE). Financial developments permitting, additional staff may be recruited during the coming years.
Tasks
The teaching tasks of the professor of Speech Communication focus mainly on the BA and MA programmes in Dutch Language and Culture, the BA programme in Language and Communication, the dual MA programme in Text and Communication, the Research MA programme in Rhetoric, Argumentation Theory and Philosophy (RAP) and the MA programme track in Discourse and Argumentation Studies (DASA), along with several relevant minors and electives (for the curriculum, please see the UvA’s digital course catalogue: www.studiegids.uva.nl ). The Faculty’s BA programmes are taught within the College of Humanities, while MA and doctorate programmes are administered within the Graduate School for Humanities.
Research activities are to cover the broad field of speech communication, including argumentation and rhetoric in institutional contexts. Depending on the interests and specialisation of the appointee, these research activities will be based at either the Amsterdam School for Cultural Analysis (ASCA), the Amsterdam Center for Language and Communication (ACLC) or the interfaculty Institute for Logic, Language and Computation (ILLC).
In defining its research programme for the period 2009-2012, the Faculty has identified three research priority areas: Cultural Heritage and Identities (Cultureel Erfgoed en Identiteit), Cultural Transformations and Globalisation (Culturele Transformaties en Globalisering) and the interfaculty area of Cognitive Modelling and Learning (Cognitieve modellen en leerbaarheid). For further information about the Faculty research programme, please see: www.hum.uva.nl/onderzoek.
Profile
The candidate must be able to demonstrate a thorough knowledge of the field, as evidenced by his/her academic qualifications, publications and teaching experience. S/he has completed doctoral work on a topic in this or a related discipline and, as the prospective chair, has a good understanding of the domain as a whole.
The new professor is expected to both implement and further develop the section’s existing ambitious research profile in speech communication, argumentation theory and rhetoric. Specifically, that profile must be expanded to include the study of language usage, aspects of speech acts and stylistic features of written and oral communication. A further, key part of this process will be the reassessment of the
discipline’s educational objectives, with special emphasis on teaching in the Research Master’s programme.
The successful candidate will have wide-ranging experience of university teaching and of supervising students at all academic levels. In addition, s/he must be able to demonstrate familiarity and affinity with ICT developments relevant to teaching and research. S/he can draw on an existing national and international network in the relevant field.
Where education is concerned, the new professor will be responsible for developing and maintaining a high-quality and appealing contribution to the aforementioned study programmes. This shall be done in consultation with other staff. S/he will ensure that teaching programmes respond to society’s demand for graduates capable of making academic knowledge more accessible to a broad audience. The candidate must show willingness to collaborate with various other educational units both within the University and at other higher education institutions. In view of the Faculty’s general policy that academic staff should be capable of flexible deployment, the new professor must be prepared to teach in an interdisciplinary context as well as outside his/her direct field of expertise.
The successful candidate should have experience of teaching at all levels of university education and in all forms employed at the UvA (seminars, lectures and supervision of dissertations/theses and work experience placements) and also of the methods of assessment associated with each of these. S/he must possess teaching and educational skills of a high order and an approachable personality and manner. The candidate must have a fluent command of both Dutch and English; any appointee lacking this level of linguistic competence will be expected to acquire it within two years of taking up the post.
The importance that the Faculty attaches to this chair is reflected in the standards set for candidates in terms of research experience. The candidate must hold a doctorate degree earned either within this field or in a related discipline. S/he must be able to demonstrate a thorough knowledge of the field by reference to major contributions to international discussions in the broad domain of speech communication and to past publications, including articles in international academic journals and anthologies, as well as to contributions to the wider public debate.
In addition, the successful candidate will be expected to undertake new research, including both independent work and larger-scale projects involving partners outside the Department of Dutch Studies and Faculty of Humanities. S/he must be capable of recruiting the necessary indirect government or private sector funding for this. Further duties include the supervision of doctorate students and postdocs, and candidates are expected to possess experience relevant to the exercise of these responsibilities. In addition, the new professor will be expected to maintain close contacts with the field, or to be in a position to establish such contacts.
The appointee will have administrative responsibility for his/her own field of activity. First and foremost, this will require inspiring and supportive team leadership. By encouraging staff and providing constructive criticism, the new professor will help to advance the quality and effectiveness of University teaching and research. Specific means of achieving this will include regular team meetings with staff and annual consultations and assessment interviews.
In addition, the new professor will be expected to undertake general administrative and organisational duties both within and outside the Faculty. Substantial evidence of practical experience in these areas is extremely desirable.
In keeping with University policy, candidates should hold a Master’s or doctoral degree and have at least three years subsequent work experience at a university or academic research institute other than the UvA, preferably abroad.
Further information
For further information, please contact the secretary of the selection committee, Mr H.A. Mulder, tel. 020-525 3066, email H.A.Mulder@uva.nl, or the committee chairman, Prof. F.P. Weerman, tel. 020-525 4737, email F.P.Weerman@uva.nl.
Appointments
The initial appointment will be on a temporary basis for a period of no more than two years. Subject to satisfactory performance, this will be followed by a permanent appointment. The gross salary will normally conform to professorial scale 2 (between €4904 and €7142 per month on a full-time basis in
accordance with the salary scale established in January 2009). In certain cases, however, different terms of employment may be offered.
Application procedure
Please submit a letter of application in Dutch or English by no later than 15 May 2010, accompanied by a CV and list of publications. The application should be addressed to the selection committee for the chair in Speech Communication, c/o Mr H.A. Mulder, Office of the Faculty of Humanities, Spuistraat 210, 1012 VT Amsterdam, the Netherlands, and should be sent in an envelop marked ‘strictly confidential’.
Applications will be reviewed by the selection committee, headed by the chair of the Department of Dutch Studies, Prof. F.P. Weerman. The selection procedure includes a formal assessment and a trial public lecture, on the basis of which the committee makes a recommendation to the Dean of the Faculty of Humanities. The committee will make an initial selection before the summer recess and invite candidates for interviews in September.
Back to Top

8-24 . (2010-05-12) PhD and Post-Doc Positions at Vrije Universiteit Brussel, Belgium

PhD position in Audio Visual Signal Processing

ETRO – AVSP – Vrije Universiteit Brussel

 

PhD position in audiovisual crossmodal attention and multisensory integration.

Keywords: audio visual signal processing, scene analysis, cognitive vision.

 

The Vrije Universiteit Brussel (Brussels, Belgium; http://www.vub.ac.be), department of Electronics and Informatics (ETRO), has a PhD position available in the area of audiovisual scene analysis, in particular in crossmodal attention and multisensory integration for the detection and tracking of spatio-temporal events in audiovisual streams.

 

The position is part of an ambitious European project, ALIZ-E (Adaptive Strategies for Sustainable Long-Term Social Interaction). The overall aim of the project is to develop the theory and practice behind embodied cognitive robots which are capable of maintaining believable multi-modal any-depth affective interactions with a young user over an extended and possibly discontinuous period of time.

 

Within this context, audiovisual attention plays an important role. Attention is the cognitive process of selectively concentrating on one aspect of the environment while ignoring others. The human selective attention mechanism enables us to concentrate on the most meaningful signals amongst all the information provided by our audiovisual senses. The human auditory system is able to separate acoustic mixtures in order to create a perceptual stream for each sound source. It is widely assumed that this auditory scene analysis interacts with attention mechanisms that select a stream for attentional focus. In computer vision, attention mechanisms are mainly used to reduce the amount of data passed on to complex computations: they determine important, salient units of attention and select them sequentially to be subjected to these computations. The most common visual attention model is the bottom-up approach, which uses basic features, conjunctions of features or even learned features as saliency information to guide visual attention. Attention can also be controlled by top-down or goal-driven information relevant to current behaviour. The deployment of attention is then determined by an interaction between bottom-up and top-down attention priming or setting.
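
As a rough illustration of the bottom-up model mentioned above, the following sketch (Python with numpy; the centre-surround intensity feature, the scales and the inhibition-of-return step are our own illustrative assumptions, not part of the project description) computes a simple saliency map and selects attention points in decreasing order of salience.

    import numpy as np

    def saliency_map(image, scales=(2, 4, 8)):
        # Crude centre-surround contrast: |pixel - local mean| summed over scales.
        # (A sketch only; real models combine several feature channels.)
        h, w = image.shape
        img = image.astype(float)
        sal = np.zeros((h, w))
        for s in scales:
            k = 2 * s + 1
            padded = np.pad(img, s, mode='edge')
            integ = np.zeros((h + k, w + k))
            integ[1:, 1:] = padded.cumsum(0).cumsum(1)   # integral image
            surround = (integ[k:, k:] - integ[:-k, k:]
                        - integ[k:, :-k] + integ[:-k, :-k]) / (k * k)
            sal += np.abs(img - surround)
        return sal

    def select_attention_points(sal, n=3, radius=5):
        # Sequential selection with inhibition of return.
        s = sal.copy()
        points = []
        for _ in range(n):
            i, j = np.unravel_index(np.argmax(s), s.shape)
            points.append((int(i), int(j)))
            s[max(0, i - radius):i + radius + 1,
              max(0, j - radius):j + radius + 1] = 0
        return points

    img = np.zeros((64, 64))
    img[20:24, 30:34] = 1.0          # a salient bright patch
    img[50, 10] = 2.0                # a smaller but brighter event
    print(select_attention_points(saliency_map(img)))

A real system would of course combine several feature channels and couple such a map with the top-down, goal-driven control discussed above.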

Motivated by these models, the present research project aims at developing a conceptual framework for audio-visual selective attention in which the formation of groups and streams is heavily influenced by conscious and subconscious attention.

 

 

The position will be within the ETRO research group (http://www.etro.vub.ac.be) under the supervision of Prof. Werner Verhelst and Prof. Hichem Sahli, and will involve close collaboration and regular interaction with the research groups participating in ALIZ-E.

The ideal candidate is a team worker with theoretical knowledge and practical experience in audio and image processing, machine learning and/or data mining, and is a good programmer (preferably Matlab or C++). He or she holds a two-year Master's degree in engineering science (electronics, informatics, artificial intelligence or another relevant discipline).

The position and research grant are available from June 2010. The position is for 4 years.

Applicants should send a letter explaining their research interests and experience, a complete curriculum vitae (with the relevant courses and grades), and an electronic copy of their master thesis (plus, optionally, reports of other relevant projects) to wverhels@etro.vub.ac.be

 

============================================================
Post-Doc Position in Audio-Visual Signal Processing & Machine Learning

ETRO – AVSP – Vrije Universiteit Brussel

 

Post Doctoral Position in audiovisual signal processing and machine learning.

Keywords: audio visual signal processing, scene analysis, machine learning, affective human-robot interaction.

 

The Vrije Universiteit Brussel (Brussels, Belgium; http://www.vub.ac.be), department of Electronics and Informatics (ETRO), has a post-doctoral position available in the area of audiovisual signal processing and multi-modal affective interaction.

 

The position is part of an ambitious European project, ALIZ-E (Adaptive Strategies for Sustainable Long-Term Social Interaction). The overall aim of the project is to develop the theory and practice behind embodied cognitive robots which are capable of maintaining believable multi-modal any-depth affective interactions with a young user over an extended and possibly discontinuous period of time.

 

Skills:

  • PhD in a relevant or closely related area, such as audiovisual speech processing, audiovisual scene analysis, human-machine interaction, affective computing or machine learning
  • A track record of significant publications
  • Ability to generate new ideas and apply them to human-machine applications
  • Good programming skills, especially in the implementation of complex algorithms
  • High motivation and willingness to coordinate the work of 2-3 PhD students
  • Proficiency in English, both written and spoken
  • Knowledge of Dutch is a plus but not a requirement

 

The position is available from June 2010 at a competitive salary. The position is guaranteed for 3 years and can be extended. In addition, candidates who qualify for an Odysseus grant from the Research Foundation Flanders (http://www.fwo.be/Odysseusprogramma.aspx) will be encouraged and supported to apply for one.

Applicants should send a letter explaining their research interests and experience, a complete curriculum vitae and recommendation letters to wverhels@etro.vub.ac.be

Back to Top

8-25 . (2010-05-12) Post-doc at Université de Bordeaux, Talence, France

Model selection for jump Markov systems (post-doc)
Deadline: 31/07/2010
Contacts: audrey.giremus@ims-bordeaux.fr, eric.grivel@ims-bordaux.fr
http://www.ims-bordeaux.fr/IMS/pages/accueilEquipe.php?guidPage=NGEwMTNmYWVhODg3OA==&groupe=RECH_EXT
Location: IMS Laboratory, Signal Group, Talence, Bordeaux.
Starting date: September 2010
Field: Signal processing
Please provide a CV and two letters of reference.

The proposed post-doc concerns approaches for selecting relevant models in the context of estimation by so-called multiple-model algorithms. These approaches place several models in competition to describe the evolution of the state of a system that one seeks to estimate. The first algorithms proposed [1] considered linear Gaussian models and were therefore based on estimating the state vector by Kalman filtering. With the development of particle filtering methods [2], the problem extends to so-called jump Markov systems, whose evolution can be described by different probability laws. In this framework, the post-doc will address the a priori choice of the models describing the evolution of the system state. If this choice is not dictated by physical considerations, several questions arise, such as:
- the optimal number of models to use,
- the validity of the selected models,
- the influence of the degree of overlap or similarity between these models.
In particular, it must be determined whether using a set of models that are very "different" from one another improves the estimation of the system state. The post-doc will therefore study and develop criteria for measuring the similarity between two models or, more generically, between two probability distributions, and will consider, among other tools, the Bayes factor and the Bayesian deviance [3].

[1] H. A. P. Blom, Y. Bar-Shalom, "The interacting multiple model algorithm for systems with Markovian switching coefficients", IEEE Trans. Autom. Control, 33(8), 1988, pp. 780-783.
[2] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking", IEEE Trans. Signal Processing, vol. 50, no. 2, pp. 174-188, 2002.
[3] C. P. Robert, Le Choix Bayésien, Springer, 2005.
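
To make the competing-models idea concrete, here is a minimal sketch (Python with numpy; the two scalar random-walk models, their noise levels and the synthetic data are illustrative assumptions, not part of the project description). Each candidate state-evolution model is scored by the marginal likelihood that a Kalman filter assigns to the observations; the Bayes factor mentioned above is the ratio of these marginal likelihoods.

    import numpy as np

    def log_marginal_likelihood(y, q, r, m0=0.0, p0=1.0):
        # Scalar Kalman filter for a random-walk state model x_k = x_{k-1} + w_k,
        # w_k ~ N(0, q), observed as y_k = x_k + v_k, v_k ~ N(0, r).
        # Returns log p(y_1..N | model), accumulated from innovation likelihoods.
        m, p, ll = m0, p0, 0.0
        for yk in y:
            p = p + q                              # predict
            s = p + r                              # innovation variance
            ll += -0.5 * (np.log(2 * np.pi * s) + (yk - m) ** 2 / s)
            k = p / s                              # Kalman gain
            m = m + k * (yk - m)                   # update
            p = (1 - k) * p
        return ll

    rng = np.random.default_rng(0)
    x = np.cumsum(rng.normal(0.0, 1.0, 100))       # true state: random walk, q = 1
    y = x + rng.normal(0.0, 0.5, 100)              # noisy observations, r = 0.25

    ll_slow = log_marginal_likelihood(y, q=0.01, r=0.25)   # candidate model 1
    ll_fast = log_marginal_likelihood(y, q=1.0, r=0.25)    # candidate model 2
    print("log Bayes factor (fast vs slow):", ll_fast - ll_slow)

In the jump-Markov setting targeted by this post-doc, such per-model evidences would typically be approximated by particle filters rather than computed by a closed-form Kalman recursion.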
Back to Top

8-26 . (2010-05-20) Associate Professor at Nanjing Normal University, China

Associate Professor or Lecturer positions in Phonetic Science and Speech Technology at Nanjing Normal University, China

 

The Department of Linguistic Science and Technology at Nanjing Normal University, China, invites applications for two positions at Associate Professor or Lecturer level in the area of Phonetic Sciences and Speech Technologies.

 

Nanjing Normal University (NNU) is situated in Nanjing, a city famous not only for its great history and culture but also for its excellence in education and academia. With Chinese-style buildings and a garden-like environment, the NNU campus is often called the “Most Beautiful Campus in the Orient.”

 

NNU is among the top 5 universities in China in the area of Linguistics. Placing a strong emphasis on interdisciplinary research, the Department of Linguistic Science and Technology at NNU is unique in that it bridges the studies of theoretical and applied linguistics, cognitive sciences, and information technologies. A new laboratory in phonetic sciences and speech technologies has recently been established to stimulate closer collaboration between linguists, phoneticians, psychologists, and computer/engineering scientists. The laboratory is very well equipped, with a sound-proof recording studio, professional audio facilities, physiological instruments (e.g., EGG, EMG, EPG, an airflow and pressure module, and a nasality sensor), EEG for ERP studies, and Linux/Windows workstations.

 

We welcome interested colleagues to join us. The research can cover any related areas in phonetic sciences and speech technologies, including but not limited to speech production, speech perception, prosodic modeling, speech synthesis, automatic speech recognition and understanding, spoken language acquisition, and computer-aided language learning. Outstanding research support will be offered. The position level will be determined based on qualifications and experience.

 

Requirements:

* A PhD degree in a related discipline (e.g., linguistics, psychology, physics, applied mathematics, computer science, or electronic engineering) is preferred, though an MS degree with distinguished experience in R&D of speech technologies at world-class institutes/companies is also acceptable

* 3+ years’ experience and strong publication/patent record in phonetic sciences or speech technologies

* Good oral and written communication skills in both Chinese and English

* Good programming skills

* Team work spirit in a multidisciplinary group

* Candidates working on any related topic are encouraged to apply, but those with backgrounds and research interests in both phonetic/linguistic sciences and speech technologies will be given preference

 

Interested candidates should submit a current CV, a detailed list of publications, copies of the best two or three publications, and the contact information of at least two references. The application and any further enquiries about the positions should be sent to Prof. Wentao GU by email (preferred) or regular mail at the following address:

 

Prof. Wentao GU

              Dept of Linguistic Science and Technology

              Nanjing Normal University

              122 Ning Hai Road, Nanjing

              Jiangsu 210097, China

              Phone:  +86-189-3687-2840

              Email:  wentaogu@gmail.com    wtgu@njnu.edu.cn

 

The positions will remain open until they are filled.

Back to Top

8-27 . (2010-05-21) Post-doc at LORIA, Nancy, France

Title: Bayesian networks for modeling and handling variability sources in speech recognition
Location: INRIA Nancy Grand Est research center --- LORIA Laboratory, NANCY, France
Project-team: PAROLE
Contact: Denis Jouvet (denis.jouvet@loria.fr)

In state-of-the-art speech recognition systems, Hidden Markov Models (HMMs) are used to model the acoustic realization of the sounds. The decoding process compares the unknown speech signal to sequences of these acoustic models to find the best matching sequence, which determines the recognized words. Lexical and grammatical constraints are taken into account during decoding; they limit the number of model sequences considered in the comparisons, which nevertheless remains very large. Hence precise acoustic models are necessary for achieving good speech recognition performance. To obtain reliable parameters, the HMM-based acoustic models are trained on very large speech corpora. However, speech recognition performance is very dependent on the acoustic environment: good performance is achieved when the acoustic environment matches that of the training data, and performance degrades as the acoustic environment diverges from it. The acoustic environment depends on many variability sources which impact the acoustic signal. These include the speaker's gender (male/female), individual speaker characteristics, speech loudness, speaking rate, the microphone, the transmission channel, and of course noise, to name only a few [Benzeghiba et al., 2007]. Using a training corpus which exhibits too many different variability sources (for example many different noise levels, or very different channel speech coding schemes) makes the acoustic models less discriminative, and thus lowers speech recognition performance. Conversely, having many sets of acoustic models, each dedicated to a specific environment condition, raises training problems: because each training subset is restricted to a specific environment condition, its size gets much smaller, and consequently it might be impossible to reliably train some parameters of the acoustic models associated with that condition.

In recent years, Dynamic Bayesian Networks (DBNs) have been applied in speech recognition. In such an approach, certain model parameters are made dependent on auxiliary features, such as articulatory information [Stephenson et al., 2000], pitch and energy [Stephenson et al., 2004], speaking rate [Shinozaki & Furui, 2003] or some hidden factor related to a clustering of the training speech data [Korkmazsky et al., 2004]. The approach has also been investigated for multiband speech recognition and non-native speech recognition, as well as for taking estimates of speaker classes into account in continuous speech recognition [Cloarec & Jouvet, 2008]. Although the above experiments were conducted on limited-vocabulary tasks, they showed that Dynamic Bayesian Networks provide a way of handling some variability sources in the acoustic modeling.

The objective of the work is to further investigate the application of Dynamic Bayesian Networks to continuous speech recognition with large vocabularies. The aim is to estimate the current acoustic environment condition dynamically, and to constrain the acoustic space used during decoding accordingly. The underlying idea is to be able to handle various ranges of acoustic space constraints during decoding. Hence, when the acoustic environment condition estimate is reliable, the corresponding specific condition constraints can be used (leading, for example, to model parameters associated with a class of very similar speakers in a given environment). Conversely, when the acoustic environment condition estimate is less reliable, more tolerant constraints should be used (leading, for example, to model parameters associated with a broader class of speakers or with several environment conditions).

Within the formalism of Dynamic Bayesian Networks, the work to be carried out is the following. The first aspect concerns the optimization of the classification of the training data, and associated methods for automatically estimating the classes that best match unknown test data. The second aspect involves the development of confidence measures associated with the classification of test sentences, and the integration of these confidence measures into the DBN modeling (in order to constrain the acoustic space for decoding more or less tightly, according to the reliability of the environment condition estimate).

References:
[Benzeghiba et al., 2007] M. Benzeghiba, R. de Mori, O. Deroo, S. Dupont, T. Erbes, D. Jouvet, L. Fissore, P. Laface, A. Mertins, C. Ris, R. Rose, V. Tyagi & C. Wellekens: "Automatic speech recognition and speech variability: A review"; Speech Communication, Vol. 49, 2007, pp. 763-786.
[Cloarec & Jouvet, 2008] G. Cloarec & D. Jouvet: "Modeling inter-speaker variability in speech recognition"; Proc. ICASSP'2008, IEEE International Conference on Acoustics, Speech, and Signal Processing, 30 March - 4 April 2008, Las Vegas, Nevada, USA, pp. 4529-4532.
[Korkmazsky et al., 2004] F. Korkmazsky, M. Deviren, D. Fohr & I. Illina: "Hidden factor dynamic Bayesian networks for speech recognition"; Proc. ICSLP'2004, International Conference on Spoken Language Processing, 4-8 October 2004, Jeju Island, Korea, pp. 1134-1137.
[Shinozaki & Furui, 2003] T. Shinozaki & S. Furui: "Hidden mode HMM using Bayesian network for modeling speaking rate fluctuation"; Proc. ASRU'2003, IEEE Workshop on Automatic Speech Recognition and Understanding, 30 November - 4 December 2003, US Virgin Islands, pp. 417-422.
[Stephenson et al., 2000] T.A. Stephenson, H. Bourlard, S. Bengio & A.C. Morris: "Automatic speech recognition using dynamic Bayesian networks with both acoustic and articulatory variables"; Proc. ICSLP'2000, International Conference on Spoken Language Processing, 2000, Beijing, China, vol. 2, pp. 951-954.
[Stephenson et al., 2004] T.A. Stephenson, M.M. Doss & H. Bourlard: "Speech recognition with auxiliary information"; IEEE Transactions on Speech and Audio Processing, SAP-12(3), 2004, pp. 189-203.
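
To illustrate the soft-constraint idea in the last paragraph, here is a minimal sketch (Python with numpy; the two environment conditions, the Gaussian "acoustic models" and the posterior values are invented for illustration). Frame likelihoods from condition-specific models are mixed with weights given by the estimated posterior over conditions: a sharp (reliable) posterior effectively selects one condition-specific model, while a flat (unreliable) posterior falls back to a broad mixture.

    import numpy as np

    # Toy condition-specific acoustic models: one isotropic Gaussian per condition.
    conditions = {
        "clean": {"mean": np.array([0.0, 0.0]), "var": 1.0},
        "noisy": {"mean": np.array([2.0, 1.0]), "var": 2.0},
    }

    def log_gauss(x, mean, var):
        d = x - mean
        return -0.5 * (len(x) * np.log(2 * np.pi * var) + d @ d / var)

    def frame_score(x, cond_log_posterior):
        # Mixture of condition-specific likelihoods, weighted by the
        # (possibly uncertain) posterior over environment conditions.
        scores = np.array([log_gauss(x, m["mean"], m["var"])
                           for m in conditions.values()])
        w = np.exp(cond_log_posterior - cond_log_posterior.max())
        w /= w.sum()
        return float(np.log((w * np.exp(scores - scores.max())).sum()) + scores.max())

    x = np.array([1.8, 0.9])                       # one acoustic frame
    reliable = np.log(np.array([0.05, 0.95]))      # confident condition estimate
    unreliable = np.log(np.array([0.5, 0.5]))      # flat, low-confidence estimate
    print(frame_score(x, reliable), frame_score(x, unreliable))

A full DBN would make the HMM emission parameters depend on such a condition variable inside the decoder, but the confidence-weighted mixing principle is the same.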

Back to Top

8-28 . (2010-05-26) Post-doc position in Speech Recognition, Adaptation, Retrieval and Translation - Aalto University, Finland

               Post-doc position in Speech Recognition, Adaptation, Retrieval and Translation

At the Aalto University School of Science and Technology (previously known as Helsinki University of Technology), Department of Computer and Information Science

http://ics.tkk.fi/en/current/news/view/postdoc_position_in_the_speech_recognition_group/

or directly: http://www.cis.hut.fi/projects/speech/jobs10.shtml

We are looking for a postdoc to join our research group working on machine learning and probabilistic modeling in speech recognition, adaptation, retrieval and translation. The speech recognition group (led by Mikko Kurimo) belongs to the Adaptive Informatics Research Centre (led by Prof. Oja, 2006-), which is the successor to the Neural Networks Research Centre (led by Prof. Kohonen, 1995-2005).

We are happy to consider outstanding candidates interested in any of our research themes, for example:

·      large-vocabulary speech recognition

·      acoustic and language model adaptation

·      speech recognition in noisy environments

·      spoken document retrieval

·      speech translation based on unsupervised morpheme models

·      speech recognition in multimodal and multilingual interfaces

Postdoc: 1 year, with extension possibilities. Starting date: around August 1, 2010. The position requires a relevant doctoral degree in CS or EE, the skills for doing excellent research in a group, and outstanding research experience in any of the research themes mentioned above. The candidate is expected to perform high-quality research and to assist in the supervision of our PhD students.

In Helsinki you will join the innovative international computational data analysis and ICT community. Among European cities, Helsinki is special in being clean, safe, liberal, Scandinavian, and close to nature; in short, it offers a high standard of living. English is spoken everywhere. See, e.g., Visit Finland.

Please attach a CV, including a list of publications, and the email addresses of 2-3 people willing to give more information. Include a brief description of your research interests and send the application by email to

Mikko Kurimo, Mikko.Kurimo@tkk.fi
Adaptive Informatics Research Centre, Department of Information and Computer Science, Aalto University School of Science and Technology

 

Back to Top

8-29 . (2010-05-26) PhD grant at University of Nantes, France

Fusion Strategies for Handwriting and Speech Modalities - Application to Mathematical Expression Recognition (PhD thesis)
Deadline: 15/07/2010
Contact: christian.viard-gaudin@univ-nantes.fr
http://www.projet-depart.org/index.php

Keywords: Handwriting recognition, Speech recognition, Data/decision fusion.

IRCCyN - UMR CNRS 6597 - NANTES, IVC team

Description of the PhD thesis: Handwriting and speech are the two most common modalities of interaction for human beings. Each of them has specific features related to usability and expressibility, and requires dedicated tools and techniques for digitization. The goal of this PhD is to study fusion strategies for a multi-modal input system combining on-line handwriting and speech, so that extended facilities or increased performance are achieved with respect to a single modality. Several fusion methods will be investigated in order to take advantage of a possible mutual disambiguation. They will range from early fusion to late fusion, in order to exploit as much as possible the redundancy and complementarity of the two streams. The joint analysis of handwritten documents and speech is a quite new area of research, and only a few works have emerged, concerning applications such as identity verification [1], whiteboard interaction [2], lecture note taking [3], and mathematical expression recognition [4]. Precisely, the focus of this thesis will be on mathematical expression recognition [4,5,6]. This is a very challenging domain where many difficulties have to be faced; specifically, the large number of symbols and the 2D layout of expressions have to be considered. Pattern recognition, machine learning and fusion techniques will play fundamental roles in this work. This PhD is part of the DEPART (Document Ecrit, Parole et Traduction) project funded by the Pays de la Loire Region.

Applications, including a cover letter, CV, and contact information for references, should be emailed to christian.viard-gaudin@univ-nantes.fr

Qualifications required:
• Master's degree in computer science or a related field such as electrical or telecommunications engineering, signal processing or machine learning
• Good programming skills in C++, Java, C, Unix/Linux
• High motivation for research and applications
• Good communication skills in English or French. Knowledge of French is welcome but not mandatory

Starting date: September or October 2010
Place: Nantes (France). The position is within the IRCCyN IVC team in Nantes (Christian Viard-Gaudin, H. Mouchère), in collaboration with the LIUM speech team in Le Mans (Simon Petitrenaud).

http://gdr-isis.org/rilk/gdr/Kiosque/poste.php?jobid=3802
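
As a toy illustration of the late-fusion end of this spectrum (Python with numpy; the symbol set, the posteriors and the weight are invented for illustration), the sketch below combines per-symbol posteriors from a handwriting recogniser and a speech recogniser with a weighted log-linear rule; note how the two ambiguous streams disambiguate each other.

    import numpy as np

    # Hypothetical per-symbol posteriors from the two recognisers:
    symbols = ["x", "+", "2"]
    p_handwriting = np.array([0.6, 0.3, 0.1])
    p_speech      = np.array([0.5, 0.1, 0.4])

    def late_fusion(p1, p2, alpha=0.5):
        # Log-linear (weighted product) combination of two modality
        # posteriors; alpha trades off trust between the two streams.
        fused = (p1 ** alpha) * (p2 ** (1 - alpha))
        return fused / fused.sum()

    p = late_fusion(p_handwriting, p_speech)
    print(dict(zip(symbols, np.round(p, 3))))   # mutual disambiguation: "x" wins

Early fusion, by contrast, would concatenate or jointly model the raw feature streams before any classification takes place.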
Back to Top

8-30 . (2010-06-08) Two Associate Professor positions in Speech Communication at KTH.

Two Associate Professor positions in Speech Communication at KTH.
The positions are placed in the School of Computer Science and
Communication, Department of Speech, Music and Hearing.
Further information is available on:
http://www.kth.se/om/work-at-kth/vacancies/associate-professor-in-speech-communication-1.61450?l=en_UK
and
http://www.kth.se/om/work-at-kth/vacancies/associate-professor-in-speech-communication-with-specialization-in-multimodal-embodied-systems-1.61437?l=en_UK
Deadline for applications is June 28, 2010 

Back to Top

9 . Journals

 

Back to Top

9-1 . SPECIAL ISSUE OF SPEECH COMMUNICATION on Sensing Emotion and Affect - Facing Realism in Speech Processin

Call for Papers

SPECIAL ISSUE OF SPEECH COMMUNICATION on

Sensing Emotion and Affect - Facing Realism in Speech Processing

 

http://www.elsevier.com/framework_products/promis_misc/specomsensingemotion.pdf

_______________________________________________________________________________

 

Human-machine and human-robot dialogues of the next generation will be dominated by natural speech which is fully spontaneous and thus driven by emotion. Systems will not only be expected to cope with affect throughout actual speech recognition, but at the same time to detect emotional and related patterns such as non-linguistic vocalizations, e.g. laughter, and further social signals for appropriate reaction. In most cases, this analysis clearly must be made independently of the speaker and for all speech that "comes in", rather than only for pre-selected and pre-segmented prototypical cases. In addition, as in any speech processing task, noise, coding, and blind speaker separation artefacts, together with transmission errors, need to be dealt with. To provide appropriate back-channelling and socially competent reactions fitting the speaker's emotional state in time, on-line and incremental processing will be among further concerns. Once affective speech processing is applied in real life, novel issues such as standards, confidences, distributed analysis, speaker adaptation, and emotional profiling come up, next to appropriate interaction and system design. In this respect, the Interspeech Emotion Challenge 2009, which was organized by the guest editors, provided the first forum for comparison of results obtained under exactly the same realistic conditions. In this special issue we will, on the one hand, summarise the findings from this challenge, and on the other hand, provide space for novel original contributions that further the analysis of natural, spontaneous, and thus emotional speech by late-breaking technological advancement, recent experience with realistic data, revealing of black holes for future research endeavours, or giving a broad overview. Original, previously unpublished submissions are encouraged within the following scope of topics:

 

    * Machine Analysis of Naturalistic Emotion in Speech and Text

    * Sensing Affect in Realistic Environments (Vocal Expression, Nonlinguistic Vocalization)

    * Social Interaction Analysis in Human Conversational Speech

    * Affective and Socially-aware Speech User Interfaces

    * Speaker Adaptation, Clustering, and Emotional Profiling

    * Recognition of Group Emotion and Coping with Blind Speaker Separation Artefacts

    * Novel Research Tools and Platforms for Emotion Recognition

    * Confidence Measures and Out-of-Vocabulary Events in Emotion Recognition

    * Noise, Echo, Coding, and Transmission Robustness in Emotion Recognition

    * Effects of Prototyping on Performance

    * On-line, Incremental, and Real-time Processing

    * Distributed Emotion Recognition and Standardization Issues

    * Corpora and Evaluation Tasks for Future Comparative Challenges

    * Applications (Spoken Dialog Systems, Emotion-tolerant ASR, Call-Centers, Education, Gaming, Human-Robot Communication, Surveillance, etc.)

 

 

Composition and Review Procedures

_______________________________________________________________________________

 

This Special Issue of Speech Communication on Sensing Emotion and Affect - Facing Realism in Speech Processing will consist of papers on data-based evaluations and papers on applications. The balance between these will be adjusted to maximize the issue's impact. Submissions will undergo the normal review process.

 

 

Guest Editors

_______________________________________________________________________________

 

Björn Schuller, Technische Universität München, Germany

Stefan Steidl, Friedrich-Alexander-University, Germany

Anton Batliner, Friedrich-Alexander-University, Germany

 

 

Important Dates

_______________________________________________________________________________

 

Submission Deadline April 1st, 2010

First Notification July 1st, 2010

Revisions Ready September 1st, 2010

Final Papers Ready November 1st, 2010

Tentative Publication Date December 1st, 2010

 

 

Submission Procedure

_______________________________________________________________________________

 

Prospective authors should follow the regular guidelines of the Speech Communication Journal for electronic submission (http://ees.elsevier.com/specom/default.asp). During submission, authors must select "Special Issue: Sensing Emotion" when they reach the "Article Type" step.

 

 __________________________________________

 

Dr. Björn Schuller

Senior Researcher and Lecturer

 

LIMSI-CNRS

BP133 91403 Orsay cedex

France

 

Technische Universität München

Institute for Human-Machine Communication

D-80333 München

 

schuller@IEEE.org

Back to Top

9-2 . CfP EURASIP Journal on Advances in Signal Processing Special Issue on Emotion and Mental State Recognition from Speech

EURASIP Journal on Advances in Signal Processing
Special Issue on Emotion and Mental State Recognition from Speech

Call for Papers

http://downloads.hindawi.com/journals/asp/si/emsr.pdf
http://www.hindawi.com/journals/asp/si/emsr.html

As research in speech processing has matured, attention has shifted from linguistic-related applications such as speech recognition towards paralinguistic speech processing problems, in particular the recognition of speaker identity, language, emotion, gender, and age. Determination of emotion or mental state is a particularly challenging problem, in view of the significant variability in its expression posed by linguistic, contextual, and speaker-specific characteristics within speech.

Some of the key research problems addressed to date include isolating emotion-specific information in the speech signal, extracting suitable features, forming reduced-dimension feature sets, developing machine learning methods applicable to the task, reducing feature variability due to speaker and linguistic content, comparing and evaluating diverse methods, robustness, and constructing suitable databases. Automatic detection of other types of mental state, which share some characteristics with emotion, is also now being explored, for example depression, cognitive load, and "cognitive epistemic" states such as interest or skepticism. Topics of interest in this special issue include, but are not limited to:

* Signal processing methods for acoustic feature extraction in emotion recognition
* Robustness issues in emotion classification, including speaker and speaker group normalization and reduction of mismatch due to coding, noise, channel, and transmission effects
* Applications of prosodic and temporal feature modeling in emotion recognition
* Novel pattern recognition techniques for emotion recognition
* Automatic detection of depression or psychiatric disorders from speech
* Methods for measuring stress, emotion-related indicators, or cognitive load from speech
* Studies relating speech production or perception to emotion and mental state recognition
* Recognition of nonprototypical spontaneous and naturalistic emotion in speech
* New methods for multimodal emotion recognition, where nonverbal speech content has a central role
* Emotional speech synthesis research with clear implications for emotion recognition
* Emerging research topics in recognition of emotion and mental state from speech
* Novel emotion recognition systems and applications
* Applications of emotion modeling to other related areas, for example, emotion-tolerant automatic speech recognition and recognition of nonlinguistic vocalizations

Before submission, authors should carefully read the journal's Author Guidelines at http://www.hindawi.com/journals/asp/guidelines.html. Prospective authors should submit an electronic copy of their complete manuscript through the journal Manuscript Tracking System at http://mts.hindawi.com/ according to the following timetable:

Manuscript Due: August 1, 2010
First Round of Reviews: November 1, 2010
Publication Date: February 1, 2011

Lead Guest Editor (for correspondence):
Julien Epps, The University of New South Wales, Australia; National ICT Australia, Australia

Guest Editors:
Roddy Cowie, Queen's University Belfast, UK
Shrikanth Narayanan, University of Southern California, USA
Björn Schuller, Technische Universitaet Muenchen, Germany
Jianhua Tao, Chinese Academy of Sciences, China
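
As a minimal illustration of the acoustic feature extraction named in the first topic above (Python with numpy; the frame sizes, the crude autocorrelation pitch tracker and the synthetic signal are illustrative assumptions), the sketch below computes frame-level log-energy and a rough fundamental-frequency estimate, two prosodic cues commonly fed to emotion classifiers.

    import numpy as np

    def prosodic_features(signal, sr=16000, frame=400, hop=160):
        # Frame-level log-energy and a crude autocorrelation pitch estimate;
        # a toy stand-in for real emotion-oriented feature extraction.
        feats = []
        for start in range(0, len(signal) - frame, hop):
            x = signal[start:start + frame] * np.hamming(frame)
            energy = np.log(np.sum(x ** 2) + 1e-10)
            ac = np.correlate(x, x, mode='full')[frame - 1:]
            lo, hi = sr // 400, sr // 60        # plausible pitch range 60-400 Hz
            lag = lo + np.argmax(ac[lo:hi])
            feats.append((energy, sr / lag))
        return np.array(feats)

    sr = 16000
    t = np.arange(sr) / sr
    voiced = np.sin(2 * np.pi * 150 * t)        # synthetic 150 Hz "voiced" signal
    print(prosodic_features(voiced, sr)[:3])    # pitch estimates near 150 Hz

Such low-level descriptors would normally be pooled over an utterance and combined with spectral features before classification.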

Back to Top

9-3 . CfP Special Issue on Speech and Language Processing of Children's Speech for Child-machine Interaction Applications

ACM Transactions on Speech and Language Processing

Special Issue on Speech and Language Processing of Children's Speech for Child-machine Interaction Applications

Call for Papers
 
The state-of-the-art in automatic speech recognition (ASR) technology is suitable for a broad range of interactive applications. Although children represent an important user segment for speech processing technologies, the acoustic and linguistic variability present in children's speech poses additional challenges for designing successful interactive systems for children.

Acoustic and linguistic characteristics of children's speech are widely different from those of adults, and voice interaction of children with computers opens challenging research issues on how to develop effective acoustic, language and pronunciation models for reliable recognition of children's speech. Furthermore, the behavior of children interacting with a computer is also different from the behavior of adults. When using a conversational interface, for example, children have a different language strategy for initiating and guiding conversational exchanges, and may adopt different linguistic registers than adults.

In order to develop reliable voice-interactive systems, further studies are needed to better understand the characteristics of children's speech and the different aspects of speech-based interaction, including the role of speech in multimodal interfaces. The development of pilot systems for a broad range of applications is also important, to provide experimental evidence of the degree of progress in ASR technologies and to focus research on application-specific problems emerging from the use of systems in realistic operating environments.

We invite prospective authors to submit papers describing original and previously unpublished work in the following broad research areas: analysis of children's speech, core technologies for ASR of children's speech, conversational interfaces, multimodal child-machine interaction and computer instructional systems for children. Specific topics of interest include, but are not limited to:
  • Acoustic and linguistic analysis of children's speech
  • Discourse analysis of spoken language in child-machine interaction
  • Intra- and inter-speaker variability in children's speech
  • Age-dependent characteristics of spoken language
  • Acoustic, language and pronunciation modeling in ASR for children
  • Spoken dialogue systems
  • Multimodal speech-based child-machine interaction
  • Computer-assisted language acquisition and language learning
  • Tools for children with special needs (speech disorders, autism, dyslexia, etc.)

Papers should have a major focus on analysis and/or acoustic and linguistic processing of children's speech. Analysis studies should be clearly related to technology development issues, and implications should be extensively discussed in the papers. Manuscripts will be peer reviewed according to the standard ACM TSLP process.

Submission Procedure
Authors should follow the ACM TSLP manuscript preparation guidelines described on the journal web site http://tslp.acm.org and submit an electronic copy of their complete manuscript through the journal manuscript submission site http://mc.manuscriptcentral.com/acm/tslp. Authors are required to specify that their submission is intended for this Special Issue by including on the first page of the manuscript and in the field "Author's Cover Letter" the note "Submitted for the Special Issue on Speech and Language Processing of Children's Speech for Child-machine Interaction Applications". Without this indication, your submission cannot be considered for this Special Issue.

Schedule
Submission deadline: May 12, 2010
Notification of acceptance: November 1, 2010
Final manuscript due: December 15, 2010

Guest Editors
Alexandros   Potamianos,  Technical   University   of  Crete,   Greece (potam@telecom.tuc.gr)
Diego Giuliani, Fondazione Bruno Kessler, Italy (giuliani@fbk.eu)
Shrikanth   Narayanan,   University   of  Southern   California,   USA (shri@sipi.usc.edu)
Kay  Berkling,   Inline  Internet  Online   GmbH,  Karlsruhe,  Germany (Kay@Berkling.com)
Back to Top

9-4 . ACM TSLP - Special Issue: call for Papers:“Machine Learning for Robust and Adaptive Spoken Dialogue Systems"

ACM TSLP - Special Issue: call for Papers:
“Machine Learning for Robust and Adaptive Spoken Dialogue Systems"

* Submission Deadline 1 July 2010 *
http://tslp.acm.org/specialissues.html

During the last decade, research in the field of Spoken Dialogue
Systems (SDS) has experienced increasing growth, and new applications
include interactive search, tutoring and “troubleshooting” systems,
games, and health agents. The design and optimization of such SDS
requires the development of dialogue strategies which can robustly
handle uncertainty, and which can automatically adapt to different
types of users (novice/expert, youth/senior) and noise conditions
(room/street). New statistical learning techniques are also emerging
for training and optimizing speech recognition, parsing / language
understanding, generation, and synthesis for robust and adaptive
spoken dialogue systems.

Automatic learning of adaptive, optimal dialogue strategies is
currently a leading domain of research. Among machine learning
techniques for spoken dialogue strategy optimization, reinforcement
learning using Markov Decision Processes (MDPs) and Partially
Observable MDPs (POMDPs) has become a particular focus.
One concern for such approaches is the development of appropriate
dialogue corpora for training and testing. However, the small amount
of data generally available for learning and testing dialogue
strategies does not contain enough information to explore the whole
space of dialogue states (and of strategies). Therefore dialogue
simulation is most often required to expand existing datasets and
man-machine spoken dialogue stochastic modelling and simulation has
become a research field in its own right. User simulations for
different types of user are a particular new focus of interest.
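
To make the reinforcement-learning formulation above concrete, here is a minimal sketch (Python with numpy; the two-slot task, the user simulator's probabilities and the rewards are all invented for illustration): a tabular Q-learning agent learns a simple form-filling strategy against a crude user simulation.

    import numpy as np

    rng = np.random.default_rng(1)
    N_SLOTS = 2
    ACTIONS = ("ask", "confirm")
    Q = np.zeros((N_SLOTS + 1, len(ACTIONS)))   # states: number of slots filled

    def simulated_user(state, action):
        # Invented dynamics: asking fills a slot 80% of the time (ASR errors
        # account for the rest); confirming ends the dialogue only when complete.
        if action == "ask" and state < N_SLOTS:
            return (state + 1, -1.0, False) if rng.random() < 0.8 else (state, -1.0, False)
        if action == "confirm" and state == N_SLOTS:
            return state, 20.0, True            # task success
        if action == "confirm":
            return state, -5.0, False           # premature confirmation annoys the user
        return state, -1.0, False               # asking again when all slots are filled

    alpha, gamma, eps = 0.2, 0.95, 0.1
    for _ in range(5000):                       # episodes of simulated dialogue
        state, done, turns = 0, False, 0
        while not done and turns < 20:
            a = rng.integers(len(ACTIONS)) if rng.random() < eps else int(np.argmax(Q[state]))
            nxt, reward, done = simulated_user(state, ACTIONS[a])
            target = reward + (0.0 if done else gamma * Q[nxt].max())
            Q[state, a] += alpha * (target - Q[state, a])
            state, turns = nxt, turns + 1

    print("learned policy:", [ACTIONS[int(np.argmax(Q[s]))] for s in range(N_SLOTS + 1)])
    # expected: ask until both slots are filled, then confirm

POMDP approaches replace the observable slot count with a belief distribution over hidden user goals, but the training loop against a user simulator has the same shape.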

Specific topics of interest include, but are not limited to:

 • Robust and adaptive dialogue strategies
 • User simulation techniques for robust and adaptive strategy
learning and testing
 • Rapid adaptation methods
 • Modelling uncertainty about user goals
 • Modelling user’s goal evolution along time
 • Partially Observable MDPs in dialogue strategy optimization
 • Methods for cross-domain optimization of dialogue strategies
 • Statistical spoken language understanding in dialogue systems
 • Machine learning and context-sensitive speech recognition
 • Learning for adaptive Natural Language Generation in dialogue
 • Machine learning for adaptive speech synthesis (emphasis, prosody, etc.)
 • Corpora and annotation for machine learning approaches to SDS
 • Approaches to generalising limited corpus data to build user models
and user simulations
 • Evaluation of adaptivity and robustness in statistical approaches
to SDS and user simulation.

Submission Procedure:
Authors should follow the ACM TSLP manuscript preparation guidelines
described on the journal web site http://tslp.acm.org and submit an
electronic copy of their complete manuscript through the journal
manuscript submission site http://mc.manuscriptcentral.com/acm/tslp.
Authors are required to specify that their submission is intended for
this Special Issue by including on the first page of the manuscript
and in the field “Author’s Cover Letter” the note “Submitted for the
Special Issue of Speech and Language Processing on Machine Learning
for Robust and Adaptive Spoken Dialogue Systems”. Without this
indication, your submission cannot be considered for this Special
Issue.

Schedule:
• Submission deadline : 1 July 2010
• Notification of acceptance: 1 October 2010
• Final manuscript due: 15th November 2010

Guest Editors:
Oliver Lemon, Heriot-Watt University, Interaction Lab, School of
Mathematics and Computer Science, Edinburgh, UK.
Olivier Pietquin, Ecole Supérieure d’Électricité (Supelec), Metz, France.

http://tslp.acm.org/cfp/acmtslp-cfp2010-02.pdf

Back to Top

9-5 . Special issue on Content based Multimedia Indexing in Multimedia Tools and Applications Journal

Special Issue on Content-Based Multimedia Indexing CBMI’2010
Second call for submissions


This call is related to the CBMI’2010 workshop but is open to
all contributions on a relevant topic, whether submitted at
CBMI’2010 or not.


The special issue of the Multimedia Tools and Applications Journal
will contain selected papers, after resubmission and review,
from the 8th International Workshop on Content-Based Multimedia
Indexing (CBMI’2010). Following the seven successful previous
events (Toulouse 1999, Brescia 2001, Rennes 2003, Riga 2005,
Bordeaux 2007, London 2008, Chania 2009), the 2010 International
Workshop on Content-Based Multimedia Indexing (CBMI) will be
held on June 23-25, 2010 in Grenoble, France. It will be
organized by the Laboratoire d'Informatique de Grenoble
http://www.liglab.fr/. CBMI 2010 aims at bringing together
the various communities involved in the different aspects of
content-based multimedia indexing, such as image processing
and information retrieval with current industrial trends and
developments. Research in Multimedia Indexing covers a wide
spectrum of topics in content analysis, content description,
content adaptation and content retrieval. Hence, topics of
interest for the Special Issue include, but are not limited to:

- Multimedia indexing and retrieval (image, audio, video, text)
- Matching and similarity search
- Construction of high level indices
- Multimedia content extraction
- Identification and tracking of semantic regions in scenes
- Multi-modal and cross-modal indexing
- Content-based search
- Multimedia data mining
- Metadata generation, coding and transformation
- Large scale multimedia database management
- Summarisation, browsing and organization of multimedia content
- Presentation and visualization tools
- User interaction and relevance feedback
- Personalization and content adaptation


Paper Format
Papers must be typed in a font size no smaller than 10 pt,
and presented in single-column format with double line spacing
on one side of A4 paper. All pages should be numbered. The
manuscript should be formatted according to the requirements
of the journal. Detailed information about the Journal,
including an author guide and detailed formatting information
is available at:
http://www.springer.com/computer/information+systems/journal/11042.

Paper Submission
All papers must be submitted through the journal's Editorial
Manager system: http://mtap.edmgr.com. When uploading your paper,
please ensure that your manuscript is marked as being for this
special issue.

Important Dates
Manuscript due:              19th of April 2010
Notification of acceptance:  1st of July 2010
Publication date:            January 2011

Guest Editors
Dr. Georges Quénot
LIG UMR 5217 INPG-INRIA-University Joseph Fourier, UPMF -CNRS
Campus Scientifique, BP 53, 38041 Grenoble Cedex 9, France
e-mail : Georges.Quenot@imag.fr

Prof. Jenny Benois-Pineau,
University of Bordeaux1, LABRI UMR 5800 Universities Bordeaux-CNRS,
e-mail: jenny.benois@labri.fr

Prof. Régine André-Obrecht
University Paul Sabatier, Toulouse, IRIT UMR UPS/CNRS/UT1/UTM, France
e-mail: obrecht@irit.fr

http://www.springer.com/cda/content/document/cda_downloaddocument/CFP-11042-20091003.pdf

Back to Top

9-6 . New book series: Frontiers in Mathematical Linguistics and Language Theory.

New book series: Mathematics, Computing, Language, and Life: Frontiers in Mathematical Linguistics and Language Theory, to be published by Imperial College Press starting in 2010.
Editor: Carlos Martin-Vide (carlos.martin@urv.cat)
Back to Top

9-7 . CfP Speech recognition in adverse conditions in Language and Cognitive Processes

Call for papers: Special Issue on Speech Recognition in Adverse Conditions, in Language and Cognitive Processes / Cognitive Neuroscience of Language

 

Language and Cognitive Processes, jointly with Cognitive Neuroscience of Language, is launching a call for submissions for a special issue on:

 

Speech Recognition in Adverse Conditions

 

This special issue is a unique opportunity to promote the development of a unifying thematic framework for understanding the perceptual, cognitive and neuro-physiological mechanisms underpinning speech recognition in adverse conditions. In particular, we seek papers focusing on the recognition of acoustically degraded speech (e.g., speech in noise, “accented” or motor-disordered speech), speech recognition under cognitive load (e.g., divided attention, memory load) and speech recognition by theoretically relevant populations (e.g., children, elderly or non-native listeners). We welcome both cognitive and neuroscientific perspectives on the topic that report strong and original empirical data firmly grounded in theory.

 

Guest editors: Sven Mattys, Ann Bradlow, Matt Davis, and Sophie Scott.

Submission deadline: 30 November 2010.

 

Please see URL below for further details:

 

http://www.tandf.co.uk/journals/cfp/PLCPcfp2.pdf

Back to Top

9-8 . CfP Special Issue of Speech Communication on Advanced Voice Function Assessment

Speech Communication
Call for Papers for the Special Issue on “Advanced Voice Function Assessment”
Every day we use our voice to communicate and to express emotions and feelings. Voice is also an important instrument for many professionals such as teachers, singers, actors, lawyers, managers and salesmen. The modern style of life has increased the risk of experiencing some kind of voice alteration. It is believed that around 19% of the population suffer or have suffered dysphonic voicing due to some kind of disease or dysfunction. There is thus a need for new and objective ways to evaluate the quality of voice, and its connection with vocal fold activity and the complex interaction between the larynx and the voluntary movements of the articulators (i.e. mouth, tongue, velum, jaw, etc.).

Diagnosis of voice disorders, the screening of vocal and voice diseases (and particularly their early detection), the objective determination of vocal function alterations, and the evaluation of surgical as well as pharmacological treatments and rehabilitation are considered major goals of voice function assessment. Applications of voice function assessment also include control of voice quality for voice professionals such as teachers, singers and speakers, as well as the evaluation of stress, vocal fatigue and loading, etc. Although the state of the art reports significant achievements in understanding the voice production mechanism and in assessing voice quality, there is a continuous need to improve the existing models of the normal and pathological voice source in order to analyse healthy and pathological voices. This special issue aims at offering an interdisciplinary platform for presenting new knowledge in the field of models and analysis of voice signals, in conjunction with videoendoscopic images, with applications in occupational, pathological, and oesophageal voices. The scope of the special issue includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies. Original, previously unpublished submissions are encouraged within the following scope of topics:

- Databases of voice disorders
- Robust analysis of pathological and oesophageal voices
- Inverse filtering for voice function assessment
- Automatic detection of voice disorders from voice and speech
- Automatic assessment and classification of voice quality
- Multi-modal analysis of disordered speech (voice, speech, vocal folds images using videolaryngoscopy, videokymography, fMRI and other emerging techniques)
- New strategies for parameterization and modelling of normal and pathological voices (e.g. biomechanical-based parameters, chaos modelling, etc.)
- Signal processing to support remote diagnosis
- Assessment of voice quality in rehabilitation
- Speech enhancement for pathological and oesophageal speech
- Technical aids and hands-free devices: vocal prostheses and aids for the disabled
- Non-speech vocal emissions (e.g. infant cry, cough and snoring)
- Relationship between speech and neurological dysfunctions (e.g. epilepsy, autism, schizophrenia, stress, etc.)
- Computer-based diagnostic and training systems for speech dysfunctions
Composition and Review Procedures

The emphasis of this special issue is on both basic and applied research related to the evaluation of voice quality and diagnosis schemes, as well as on the results of voice treatments. The submissions received for this Special Issue of Speech Communication on Advanced Voice Function Assessment will undergo the normal review process.

Guest Editors
• Juan I. Godino-Llorente, Universidad Politécnica de Madrid, Spain, igodino@ics.upm.es
• Yannis Stylianou, University of Crete, Greece, yannis@csd.uoc.gr
• Philippe H. DeJonckere, University Medical Center Utrecht, The Netherlands, ph.dejonckere@umcutrecht.nl
• Pedro Gómez-Vilda, Universidad Politécnica de Madrid, Spain, pedro@pino.datsi.fi.upm.es

Important Dates
Deadline for submission: June 15th, 2010
First notification: September 15th, 2010
Revisions ready: October 30th, 2010
Final notification: November 30th, 2010
Final papers ready: December 30th, 2010
Tentative publication date: January 30th, 2011

Submission Procedure
Prospective authors should follow the regular guidelines of the Speech Communication Journal for electronic submission (http://ees.elsevier.com/specom). During submission, authors must select the Section "Special Issue Paper", not "Regular Paper", and the title of the special issue (Special Issue on Advanced Voice Function Assessment) should be referenced in the "Comments" page, along with any other information.
Back to Top

9-9 . CfP Special Issue on Deep Learning for Speech and Language Processing, IEEE Trans. ASLT

Call for Papers
IEEE Transactions on Audio, Speech, and Language Processing
IEEE Signal Processing Society
Special Issue on Deep Learning for Speech and Language Processing
Over the past 25 years or so, speech recognition technology has been dominated largely by hidden Markov models (HMMs). Significant technological success has been achieved using complex and carefully engineered variants of HMMs. Next-generation technologies require solutions to technical challenges presented by diversified deployment environments. These challenges arise from the many types of variability present in the speech signal itself. Overcoming these challenges is likely to require “deep” architectures with efficient and effective learning algorithms.

There are three main characteristics in the deep learning paradigm: 1) layered architecture; 2) generative modeling at the lower layer(s); and 3) unsupervised learning at the lower layer(s) in general. For speech and language processing and related sequential pattern recognition applications, some attempts have been made in the past to develop layered computational architectures that are “deeper” than conventional HMMs, such as hierarchical HMMs, hierarchical point-process models, hidden dynamic models, layered multilayer perceptrons, tandem-architecture neural-net feature extraction, multi-level detection-based architectures, deep belief networks, hierarchical conditional random fields, and deep-structured conditional random fields. While positive recognition results have been reported, there has been a conspicuous lack of systematic learning techniques and theoretical guidance to facilitate the development of these deep architectures. Recent communication between machine learning researchers and speech and language processing researchers revealed a wealth of research results pertaining to insightful applications of deep learning to some classical speech recognition and language processing problems. These results can potentially further advance the state of the art in speech and language processing.
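
As a minimal illustration of characteristics 1) and 3) above, a layered architecture whose lower layers are trained without labels, the sketch below (Python with numpy; the data, layer sizes and learning rate are illustrative, and real deep belief networks use restricted Boltzmann machines rather than this simplified tied-weight autoencoder) performs greedy layer-wise unsupervised pretraining of a two-layer encoder.

    import numpy as np

    rng = np.random.default_rng(0)

    def train_autoencoder(X, hidden, epochs=200, lr=0.05):
        # One tied-weight tanh autoencoder layer trained with plain gradient
        # descent on the squared reconstruction error (no labels involved).
        n, d = X.shape
        W = rng.normal(0.0, 0.1, (d, hidden))
        b = np.zeros(hidden)   # encoder bias
        c = np.zeros(d)        # decoder bias
        for _ in range(epochs):
            H = np.tanh(X @ W + b)        # encode
            R = H @ W.T + c               # decode with the transposed weights
            E = R - X                     # reconstruction error
            dH = (E @ W) * (1.0 - H ** 2)
            W -= lr * (X.T @ dH + E.T @ H) / n   # gradient w.r.t. the tied W
            b -= lr * dH.mean(0)
            c -= lr * E.mean(0)
        return W, b, np.tanh(X @ W + b)

    # Greedy layer-wise unsupervised pretraining of a two-layer deep encoder.
    X = rng.normal(size=(500, 20))
    X[:, :5] += X[:, 5:10]                # inject some structure worth learning
    layers, data = [], X
    for hidden in (16, 8):                # train lower layers first
        W, b, data = train_autoencoder(data, hidden)
        layers.append((W, b))
    print("deep code shape:", data.shape) # (500, 8)
    # The weights in `layers` could now initialise a supervised network
    # (e.g. a phone classifier), the usual role of such pretraining.
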
In light of the substantial research activity that has already taken place in this exciting space, and of its importance, we invite papers describing various aspects of deep learning and related techniques/architectures, as well as their successful applications to speech and language processing. Submissions must not have been previously published, with the exception that substantial extensions of conference or workshop papers will be considered.
The submissions must have a specific connection to audio, speech, and/or language processing. The topics of particular interest include, but are not limited to:
 • Generative models and discriminative statistical or neural models with deep structure
 • Supervised, semi-supervised, and unsupervised learning with deep structure
 • Representing sequential patterns in statistical or neural models
 • Robustness issues in deep learning
 • Scalability issues in deep learning
 • Optimization techniques in deep learning
 • Deep learning of relationships between the linguistic hierarchy and data-driven speech units
 • Deep learning models and techniques in applications such as (but not limited to) isolated or continuous speech recognition, phonetic recognition, music signal processing, language modeling, and language identification.
The authors are required to follow the Author’s Guide for manuscript submission to the IEEE Transactions on Audio, Speech, and Language Processing at
http://www.signalprocessingsociety.org/publications/periodicals/taslp/taslp-author-information
Submission deadline: September 15, 2010
Notification of Acceptance: March 15, 2011
Final manuscripts due: May 15, 2011
Date of publication: August 2011
For further information, please contact the guest editors:
Dong Yu
Back to Top

9-10 . ACM TSLP Special issue:“Machine Learning for Robust and Adaptive Spoken Dialogue Systems"

 ACM TSLP - Special Issue: call for Papers:
“Machine Learning for Robust and Adaptive Spoken Dialogue Systems"

* Submission Deadline 1 July 2010 *
http://tslp.acm.org/specialissues.html

During the last decade, research in the field of Spoken Dialogue
Systems (SDS) has experienced increasing growth, and new applications
include interactive search, tutoring and “troubleshooting” systems,
games, and health agents. The design and optimization of such SDS
requires the development of dialogue strategies which can robustly
handle uncertainty, and which can automatically adapt to different
types of users (novice/expert, youth/senior) and noise conditions
(room/street). New statistical learning techniques are also emerging
for training and optimizing speech recognition, parsing / language
understanding, generation, and synthesis for robust and adaptive
spoken dialogue systems.

Automatic learning of adaptive, optimal dialogue strategies is
currently a leading domain of research. Among machine learning
techniques for spoken dialogue strategy optimization, reinforcement
learning using Markov Decision Processes (MDPs) and Partially
Observable MDPs (POMDPs) has become a particular focus.
One concern for such approaches is the development of appropriate
dialogue corpora for training and testing. However, the small amount
of data generally available for learning and testing dialogue
strategies does not contain enough information to explore the whole
space of dialogue states (and of strategies). Therefore dialogue
simulation is most often required to expand existing datasets, and the
stochastic modelling and simulation of man-machine spoken dialogue has
become a research field in its own right. User simulations for
different types of user are a particular new focus of interest.
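
As a toy illustration of this line of work (a hypothetical slot-filling task, not any system from this call), the sketch below learns a dialogue strategy with tabular Q-learning against a hand-written user simulator; the noise level, rewards and state space are all invented for the example.

    import random

    SLOTS = 2                          # e.g., origin and destination
    ACTIONS = ["ask_slot0", "ask_slot1", "confirm"]
    NOISE = 0.2                        # chance the "ASR" mishears an answer

    def simulate(state, action):
        """Toy user/environment: returns (next_state, reward, done)."""
        filled = list(state)
        if action == "confirm":
            if all(filled):
                return tuple(filled), +10, True   # task success
            return tuple(filled), -10, True       # premature confirmation
        slot = int(action[-1])
        if random.random() > NOISE:               # answer heard correctly
            filled[slot] = 1
        return tuple(filled), -1, False           # per-turn cost

    Q = {}
    def q(s, a):
        return Q.get((s, a), 0.0)

    alpha, gamma, eps = 0.2, 0.95, 0.1
    for episode in range(5000):
        s, done = (0,) * SLOTS, False
        while not done:
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda b: q(s, b)))
            s2, r, done = simulate(s, a)
            target = r + (0 if done else gamma * max(q(s2, b) for b in ACTIONS))
            Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))
            s = s2

    # The learned greedy policy asks for the missing slots, then confirms:
    for s in [(0, 0), (1, 0), (1, 1)]:
        print(s, max(ACTIONS, key=lambda b: q(s, b)))

A POMDP-based system would additionally maintain a belief distribution over user goals rather than the fully observed state used in this sketch.
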

Specific topics of interest include, but are not limited to:

 • Robust and adaptive dialogue strategies
 • User simulation techniques for robust and adaptive strategy
learning and testing
 • Rapid adaptation methods
 • Modelling uncertainty about user goals
 • Modelling the evolution of user goals over time
 • Partially Observable MDPs in dialogue strategy optimization
 • Methods for cross-domain optimization of dialogue strategies
 • Statistical spoken language understanding in dialogue systems
 • Machine learning and context-sensitive speech recognition
 • Learning for adaptive Natural Language Generation in dialogue
 • Machine learning for adaptive speech synthesis (emphasis, prosody, etc.)
 • Corpora and annotation for machine learning approaches to SDS
 • Approaches to generalising limited corpus data to build user models
and user simulations
 • Evaluation of adaptivity and robustness in statistical approaches
to SDS and user simulation.

Submission Procedure:
Authors should follow the ACM TSLP manuscript preparation guidelines
described on the journal web site
http://tslp.acm.org and submit an
electronic copy of their complete manuscript through the journal
manuscript submission site
http://mc.manuscriptcentral.com/acm/tslp.
Authors are required to specify that their submission is intended for
this Special Issue by including on the first page of the manuscript
and in the field “Author’s Cover Letter” the note “Submitted for the
Special Issue of Speech and Language Processing on Machine Learning
for Robust and Adaptive Spoken Dialogue Systems”. Without this
indication, your submission cannot be considered for this Special
Issue.

Schedule:
• Submission deadline: 1 July 2010
• Notification of acceptance: 1 October 2010
• Final manuscript due: 15 November 2010

Guest Editors:
Oliver Lemon, Heriot-Watt University, Interaction Lab, School of
Mathematics and Computer Science, Edinburgh, UK.
Olivier Pietquin, Ecole Supérieure d’Électricité (Supelec), Metz, France.

http://tslp.acm.org/cfp/acmtslp-cfp2010-02.pdf

Back to Top

10 . Future Speech Science and Technology Events

10-1 . (2010-06-21) Second International Workshop on Quality of Multimedia Experience, QoMex'10

10-2 . (2010-06-29) 2nd Int. Symposium on BASAL GANGLIA SPEECH DISORDERS AND DEEP BRAIN STIMULATION


*BASAL GANGLIA SPEECH DISORDERS AND DEEP BRAIN STIMULATION*
*2nd INTERNATIONAL SYMPOSIUM*

June 29th - July 1st, 2010
Maison Méditerranéenne des Sciences de l'Homme
Aix-en-Provence, France

*Registration is now open*
Please register online at http://lpl-aix.fr/~speechdbs2010

Symposium organised by Serge PINTO (LPL), François VIALLET (LPL) & Elina TRIPOLITI (London, UK)
LPL - Laboratoire Parole et Langage (CNRS - Aix-Marseille University)

Fees
Normal: 200 euros
Student: 100 euros

For any further enquiries, please contact us:
speechdbs.org@lpl-aix.fr
speechdbs.registration@lpl-aix.fr

*Who should attend*
ENTs, ENT surgeons, Neurologists, Neurophysiologists, Neuropsychologists, Neuroscientists, Neurosurgeons, Physiotherapists, Speech & Language Therapists, Specialist Nurses and any other health/scientific professionals with a special interest in speech in movement disorders

*We look forward to seeing you in Aix-en-Provence, Cezanne's home town and more...*
(http://www.aixenprovencetourism.com/)

Back to Top

10-3 . (2010-07-12) eNTERFACE’10 - the 6th Intl. Summer Workshop on Multimodal Interfaces, Amsterdam

eNTERFACE’10 - the 6th Intl. Summer Workshop on Multimodal Interfaces

Amsterdam, the Netherlands, July 12th – August 6th, 2010

-----------------------------
Call for Participation - apologies for cross-posting
-----------------------------

The eNTERFACE workshops aim at establishing a tradition of collaborative, localized research and development work by gathering, in a single place, a team of leading professionals in multimodal man-machine interfaces together with students (both graduate and undergraduate), to work on a pre-specified list of challenges for 4 complete weeks. In this respect, it is an innovative and intensive collaboration scheme, designed to allow researchers to integrate their software tools, deploy demonstrators, collect novel databases, and work side by side with a great number of experts.

Outcomes of synergy and success stories of past eNTERFACE Workshops held in Mons (2005), Dubrovnik (2006), Istanbul (2007), Paris (2008), and Genova (2009) can be seen at www.enterface.net. The Intelligent Systems Lab Amsterdam of the University of Amsterdam is organizing the 2010 edition of the Workshop.

Senior researchers, PhD, MS, or undergraduate students interested in participating in the Workshop should send their application by emailing the Organizing Committee at a.a.salah@uva.nl on or before March 1, 2010 (extended). The application should contain:

- A short CV.
- A list of three preferred projects to work on.
- A list of interests/skills to offer for these projects.
- Possible dates of participation (full/partial).

The workshop is FREE for all participants, but participants must cover their own travel and accommodation expenses. Information about the venue and accommodation is provided on the eNTERFACE’10 website: http://enterface10.science.uva.nl

eNTERFACE'10 will welcome students, researchers, and seniors, working in teams on the following projects:

#01 CoMediAnnotate: a usable multimodal annotation framework
#02 Looking around in a virtual world
#03 Parameterized user modelling of people with disabilities and simulation of their behaviour in a virtual environment
#04 Continuous interaction for ECAs
#05 Multimodal Speaker Verification in NonStationary Noise Environments
#06 Vision based Hand Puppet
#07 Audio-visual speech recognition
#08 Affect-responsive interactive photo-frame
#09 Automatic Fingersign to Speech Translator

Full descriptions of the projects are available at: http://enterface10.science.uva.nl/projectsTeams.php

eNTERFACE'10 Scientific Committee:

Lale Akarun, Boğaziçi University, Turkey
Antonio Camurri, University of Genova, Italy
Cristophe d'Alessandro, CNRS-LIMSI, Orsay, France
Thierry Dutoit, Faculté Polytechnique de Mons, Belgium
Theo Gevers, University of Amsterdam, The Netherlands
Ben Kröse, University of Amsterdam, The Netherlands
Maurizio Mancini, University of Genova, Italy
Panos Markopoulos, Technical University Eindhoven, The Netherlands
Ferran Marques, Universitat Politécnica de Catalunya, Spain
Ramon Morros, Universitat Politécnica de Catalunya, Spain
Anton Nijholt, Twente University, The Netherlands
Igor Pandzic, Zagreb University, Croatia
Catherine Pelachaud, TELECOM Paris-Tech, France
Albert Ali Salah, University of Amsterdam, The Netherlands
Bülent Sankur, Bogazici University, Turkey
Ben Schouten, FONTYS, The Netherlands
Bjorn Schuller, Technical University of Munich, Germany
Nicu Sebe, University of Trento, Italy
Alessandro Vinciarelli, IDIAP, Switzerland
Gualtiero Volpe, University of Genova, Italy

 

Back to Top

10-4 . (2010-07-15) ACL 2010 Workshop on Domain Adaptation for Natural Language Processing (DANLP 2010) Sweden

2nd CALL FOR PAPERS

            ACL 2010 Workshop on Domain Adaptation
          for Natural Language Processing (DANLP 2010)
            http://sites.google.com/site/danlp2010/

              July 15, 2010, Uppsala, Sweden

====================================================================

Most modern Natural Language Processing (NLP) systems are subject to
the well-known problem of lack of portability to new domains/genres:
there is a substantial drop in their performance when tested on data
from a new domain, i.e., when their test data is drawn from a related
but different distribution than their training data. This problem is
inherent in the assumption of independent and identically distributed
(i.i.d.) variables for machine learning systems, but has only started
to receive attention in recent years. The need for domain adaptation
arises in almost all NLP tasks: part-of-speech tagging, semantic role
labeling, statistical parsing and statistical machine translation, to
name but a few.

Studies on supervised domain adaptation (where there are limited
amounts of annotated resources in the new domain) have shown that
baselines consisting of very simple models (e.g., models based only on
source-domain data, only target-domain data, or the union of the two)
achieve relatively high performance and are "surprisingly difficult to
beat" (Daume III, 2007). Thus, one conclusion from that line of work
is that as long as there is a reasonable (often even small) amount of
labeled target data, it is often more fruitful to just use that.
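
To make these baselines concrete, here is a small sketch (toy random data and scikit-learn, purely illustrative) of the source-only, target-only and union baselines, plus the feature-augmentation method proposed in Daume III (2007), which copies each feature vector into a shared block and a domain-specific block so the learner can decide which parameters to share across domains.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    Xs, ys = rng.normal(0.0, 1, (500, 20)), rng.integers(0, 2, 500)  # source
    Xt, yt = rng.normal(0.5, 1, (50, 20)), rng.integers(0, 2, 50)    # small target

    def augment(X, domain):
        """Map x to (x, x, 0) for source or (x, 0, x) for target."""
        Z = np.zeros_like(X)
        return np.hstack([X, X, Z] if domain == "source" else [X, Z, X])

    setups = {
        "source-only": (Xs, ys),
        "target-only": (Xt, yt),
        "union": (np.vstack([Xs, Xt]), np.concatenate([ys, yt])),
        "augmented": (np.vstack([augment(Xs, "source"), augment(Xt, "target")]),
                      np.concatenate([ys, yt])),
    }
    for name, (X, y) in setups.items():
        clf = LogisticRegression(max_iter=1000).fit(X, y)
        Xe = augment(Xt, "target") if name == "augmented" else Xt
        # Labels are random here, so scores hover around chance; the point
        # is only the mechanics of constructing each baseline.
        print(name, clf.score(Xe, yt))
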

In contrast, semi-supervised adaptation (i.e., no annotated resources
in the new domain) is a much more realistic situation but is clearly
also considerably more difficult.  Current studies on semi-supervised
approaches show very mixed results. For example, Structural
Correspondence Learning (Blitzer et al., 2006) was applied
successfully to classification tasks, while only modest gains could be
obtained for structured output tasks like parsing. Many questions thus
remain open.

The goal of this workshop is to provide a meeting-point for research
that approaches the problem of adaptation from the varied perspectives
of machine-learning and a variety of NLP tasks such as parsing,
machine-translation, word sense disambiguation, etc.  We believe there
is much to gain by treating domain-adaptation as a general learning
strategy that utilizes prior knowledge of a specific or a general
domain in learning about a new domain; here the notion of a 'domain'
could be as varied as child language versus adult-language, or the
source-side re-ordering of words to target-side word-order in a
statistical machine translation system.

Sharing insights, methodologies and successes across tasks will thus
contribute towards a better understanding of this problem. For
instance, self-training the Charniak parser alone was not effective
for adaptation (it has been common wisdom that self-training is
generally not effective), but self-training with a reranker was
surprisingly highly effective (McClosky et al., 2006). Is this an
insight into adaptation that can be used elsewhere?  We believe that
the key to future success will be to exploit large collections of
unlabeled data in addition to labeled data, not only because unlabeled
data is easier to obtain but also because existing labeled resources are often not
even close to the envisioned target application domain. Directly
related is the question of how to measure closeness (or differences)
among domains.

===============
Workshop Topics
===============

We especially encourage submissions on semi-supervised approaches of
domain adaptation with a deep analysis of models, data and results,
although we do not exclude papers on supervised adaptation. In
particular, we welcome submissions that address any of the following
topics or other relevant issues:

* Algorithms for semi-supervised DA
* Active learning for DA
* Integration of expert/prior knowledge about new domains
* DA in specific applications (e.g., Parsing, MT, IE, QA, IR, WSD)
* Automatic domain identification and model adjustment
* Porting algorithms developed for one type of problem structure to
another (e.g., from binary classification to structured-prediction problems)
* Analysis and negative results: in-depth analysis of results, i.e.,
which model parts/parameters are responsible for successful adaptation;
what can we learn from negative results (impact of negative experimental
results on learning strategies/parameters)
* A complementary perspective: (better) generalization of ML models,
i.e., to make NLP models more broad-coverage and domain-independent,
rather than domain-specific
* Learning from multiple domains

==========
Submission
==========

Papers should be submitted via the ACL submission system:

https://www.softconf.com/acl2010/DANLP/

All submissions are limited to 6 pages (including references) and
should be formatted using the ACL 2010 style file that can be found at:

http://acl2010.org/authors.html.

As the reviewing will be blind, papers must not include the authors'
names and affiliations.  Submissions should be in English and should
not have been published previously. If essentially identical papers
are submitted to other conferences or workshops as well, this fact
must be indicated at submission time.

The submission deadline is 23:59 CET on April 5, 2010.

===============
Important Dates
===============

April 5, 2010: Submission deadline
May 6, 2010: Notification of acceptance
May 16, 2010: Camera-ready papers due
July 15, 2010: Workshop

===============
Invited speaker
===============

John Blitzer, University of California, United States

============
Organization
============

Hal Daumé III, University of Utah, USA
Tejaswini Deoskar, University of Amsterdam, The Netherlands
David McClosky, Stanford University, USA
Barbara Plank, University of Groningen, The Netherlands
Jörg Tiedemann, Uppsala University, Sweden

=================
Program Committee
=================

Eneko Agirre, University of the Basque Country, Spain
John Blitzer, University of California, United States
Walter Daelemans, University of Antwerp, Belgium
Mark Dredze, Johns Hopkins University, United States
Kevin Duh, NTT Communication Science Laboratories, Japan (formerly
University of Washington, Seattle)
Philipp Koehn, University of Edinburgh, United Kingdom
Jing Jiang, Singapore Management University, Singapore
Oier Lopez de Lacalle, University of the Basque Country, Spain
Robert Malouf, San Diego State University, United States
Ray Mooney, University Texas, United States
Hwee Tou Ng, National University of Singapore, Singapore
Khalil Sima'an, University of Amsterdam, The Netherlands
Michel Simard, National Research Council of Canada, Canada
Jun'ichi Tsujii, University of Tokyo, Japan
Antal van den Bosch, Tilburg University, The Netherlands
Josef van Genabith, Dublin City University, Ireland
Yi Zhang, German Research Centre for Artificial Intelligence (DFKI GmbH)
and Saarland University, Germany

=======
Sponsor
=======

This workshop is kindly supported by the Stevin project PaCo-MT (Parse
and Corpus-based Machine Translation).

=======
Contact
=======

Email: danlp.acl2010@gmail.com
Website: http://sites.google.com/site/danlp2010/

Back to Top

10-5 . (2010-09-06) Thirteenth International Conference on TEXT, SPEECH and DIALOGUE (TSD 2010) Brno, Czech Republic

Thirteenth International Conference on TEXT, SPEECH and DIALOGUE (TSD 2010)

Brno, Czech Republic, 6-10 September 2010

http://www.tsdconference.org/

The conference is organized by the Faculty of Informatics, Masaryk University, Brno, and the Faculty of Applied Sciences, University of West Bohemia, Pilsen. The conference is supported by the International Speech Communication Association.

Venue: Brno, Czech Republic

THE SUBMISSION DEADLINES:

    March 15 2010 ............ Submission of abstracts
    March 22 2010 ............ Submission of full papers

Submission of an abstract serves only for better organization of the review process - for the actual review a full paper submission is necessary.

TSD SERIES

The TSD series has evolved as a prime forum for interaction between researchers in both spoken and written language processing from the former East Block countries and their Western colleagues. Proceedings of TSD form a book published by Springer-Verlag in their Lecture Notes in Artificial Intelligence (LNAI) series. TSD proceedings are regularly indexed by the Thomson Reuters Conference Proceedings Citation Index. Moreover, the LNAI series is listed in all major citation databases such as DBLP, SCOPUS, EI, INSPEC or COMPENDEX.

TOPICS

Topics of the conference will include (but are not limited to):

    text corpora and tagging
    transcription problems in spoken corpora
    sense disambiguation
    links between text and speech oriented systems
    parsing issues
    parsing problems in spoken texts
    multi-lingual issues
    multi-lingual dialogue systems
    information retrieval and information extraction
    text/topic summarization
    machine translation
    semantic networks and ontologies
    semantic web
    speech modeling
    speech segmentation
    speech recognition
    search in speech for IR and IE
    text-to-speech synthesis
    dialogue systems
    development of dialogue strategies
    prosody in dialogues
    emotions and personality modeling
    user modeling
    knowledge representation in relation to dialogue systems
    assistive technologies based on speech and dialogue
    applied systems and software
    facial animation
    visual speech synthesis

Papers on processing languages other than English are strongly encouraged.

PROGRAM COMMITTEE

    Frederick Jelinek, USA (general chair)
    Hynek Hermansky, Switzerland (executive chair)
    Eneko Agirre, Spain
    Genevieve Baudoin, France
    Jan Cernocky, Czech Republic
    Attila Ferencz, Romania
    Alexander Gelbukh, Mexico
    Louise Guthrie, GB
    Jan Hajic, Czech Republic
    Eva Hajicova, Czech Republic
    Patrick Hanks, Czech Republic
    Ludwig Hitzenberger, Germany
    Jaroslava Hlavacova, Czech Republic
    Ales Horak, Czech Republic
    Eduard Hovy, USA
    Ivan Kopecek, Czech Republic
    Steven Krauwer, The Netherlands
    Siegfried Kunzmann, Germany
    Natalija Loukachevitch, Russia
    Vaclav Matousek, Czech Republic
    Hermann Ney, Germany
    Elmar Noeth, Germany
    Karel Oliva, Czech Republic
    Karel Pala, Czech Republic
    Nikola Pavesic, Slovenia
    Vladimir Petkevic, Czech Republic
    Fabio Pianesi, Italy
    Adam Przepiorkowski, Poland
    Josef Psutka, Czech Republic
    James Pustejovsky, USA
    Leon Rothkrantz, The Netherlands
    Milan Rusko, Slovakia
    Ernst G. Schukat-Talamazzini, Germany
    Pavel Skrelin, Russia
    Pavel Smrz, Czech Republic
    Petr Sojka, Czech Republic
    Marko Tadic, Croatia
    Tamas Varadi, Hungary
    Zygmunt Vetulani, Poland
    Taras Vintsiuk, Ukraine
    Yorick Wilks, GB
    Victor Zakharov, Russia

KEYNOTE SPEAKERS

    John Carroll, University of Sussex, UK
    Christiane Fellbaum, Princeton University, USA

FORMAT OF THE CONFERENCE

The conference program will include presentations of invited papers, oral presentations, and poster/demonstration sessions. Papers will be presented in plenary or topic oriented sessions.

Social events including a trip in the vicinity of Brno will allow for additional informal interactions.

SUBMISSION OF PAPERS

Authors are invited to submit a full paper not exceeding 8 pages formatted in the LNCS style (see below). Accepted papers will be presented either orally or as posters. The decision about the presentation format will be based on the recommendation of the reviewers. The authors are asked to submit their papers using the on-line form accessible from the conference website.

Papers submitted to TSD 2010 must not be under review by any other conference or publication during the TSD review cycle, and must not be previously published or accepted for publication elsewhere.

As reviewing will be blind, the paper should not include the authors' names and affiliations. Furthermore, self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ...", should be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ...". Papers that do not conform to the requirements above are subject to rejection without review.

The authors are strongly encouraged to write their papers in TeX or LaTeX formats. These formats are necessary for the final versions of the papers that will be published in the Springer Lecture Notes. Authors using WORD-compatible software for the final version must use the LNCS template for WORD and within the submission process ask the Proceedings Editors to convert the paper to LaTeX format. For this service a service-and-license fee of CZK 1500 will be levied automatically.

The paper format for review has to be either a PDF or PostScript file with all required fonts included. Upon notification of acceptance, presenters will receive further information on submitting their camera-ready and electronic sources (for detailed instructions on the final paper format see http://www.springer.de/comp/lncs/authors.html#Proceedings).

Authors are also invited to present actual projects, developed software or interesting material relevant to the topics of the conference. The presenters of demonstrations should provide an abstract not exceeding one page. The demonstration abstracts will not appear in the conference proceedings.

IMPORTANT DATES

    March 15 2010 ............ Submission of abstracts
    March 22 2010 ............ Submission of full papers
    May 15 2010 .............. Notification of acceptance
    May 31 2010 .............. Final papers (camera ready) and registration
    July 23 2010 ............. Submission of demonstration abstracts
    July 30 2010 ............. Notification of acceptance for demonstrations sent to the authors
    September 6-10 2010 ...... Conference date

The contributions to the conference will be published in proceedings that will be made available to participants at the time of the conference.

OFFICIAL LANGUAGE

The official language of the conference is English.

ACCOMMODATION

The organizing committee will arrange discounts on accommodation in the 3-star hotel at the conference venue. The current prices of the accommodation will be available at the conference website.

ADDRESS

All correspondence regarding the conference should be addressed to

    Dana Hlavackova, TSD 2010
    Faculty of Informatics, Masaryk University
    Botanicka 68a, 602 00 Brno, Czech Republic
    phone: +420-5-49 49 33 29
    fax: +420-5-49 49 18 20
    email: tsd2010@tsdconference.org

The official TSD 2010 homepage is: http://www.tsdconference.org/

LOCATION

Brno is the second largest city in the Czech Republic with a population of almost 400,000. The city is the country's judiciary and trade-fair center. Brno is the capital of Moravia, which is in the south-east part of the Czech Republic. Brno has been a Royal City since 1347 and, with its six universities, it also forms a cultural center of the region.

Brno can be reached easily by direct flights from London, Moscow and Prague and by trains or buses from Prague (200 km) or Vienna (130 km).

For participants with some extra time, some nearby places may also be of interest. Local ones include: Brno Castle, now called Spilberk, Veveri Castle, the Old and New City Halls, the Augustine Monastery with St. Thomas Church and the crypt of the Moravian Margraves, the Church of St. James, the Cathedral of St. Peter & Paul, the Carthusian Monastery in Kralovo Pole, and the famous Villa Tugendhat designed by Mies van der Rohe, along with other important buildings of between-war Czech architecture.

For those willing to venture out of Brno, the Moravian Karst with the Macocha Chasm and Punkva caves, the battlefield of the Battle of the Three Emperors (Napoleon, Russian Alexander and Austrian Franz - the Battle of Austerlitz), the Chateau of Slavkov (Austerlitz), Pernstejn Castle, Buchlov Castle, Lednice Chateau, Buchlovice Chateau, Letovice Chateau, Mikulov with one of the largest Jewish cemeteries in Central Europe, Telc - a town on the UNESCO heritage list, and many others are all within easy reach.
Back to Top

10-6 . (2010-09-08) 21st Conference on Electronic Speech Signal Processing (ESSV)

Call for Papers

21st Conference on Electronic Speech Signal Processing (ESSV)

8 - 10 September 2010 in Berlin

Dear friends of our conference series,

In 2010, the Electronic Speech Signal Processing conference will again bring together those interested in speech technology, in research and in applications. After a long break the event will once again be held in Berlin, at the Beuth University of Applied Sciences. Although this has traditionally been a German event, we also invite our colleagues from abroad to contribute; the conference languages will therefore be German and English. The conference will again focus on speech signal processing at large, with the following being potential topics of contributions, but not an exhaustive list:

  • Speech recognition and synthesis in embedded systems
  • Speech technology in vehicles
  • Speech technology and education
  • Speech technology for the disabled
  • Speech and multimedia
  • Applications to non-speech acoustic signals from biological, musical and technological fields

This is the twenty-first time that the ESSV takes place. As always, the organizers strive to develop a scientifically sophisticated program reflecting the cutting edge of speech technology. We are relying on your active cooperation and cordially invite you to make a contribution in the form of a talk or a poster. The proceedings will be published as usual in the series "Studientexte zur Sprachkommunikation" of TUDpress publishing.

Paper Submission

More info about the proceedings, venue and accommodation will be updated regularly online at the following address:

http://public.beuth-hochschule.de/~mixdorff/essv2010/index_english.html.

You can also contact us by post, fax or E-mail at the following address:

Beuth Hochschule für Technik Berlin
Fachbereich Informatik und Medien
Prof. Dr.-Ing. habil. Hansjörg Mixdorff
13353 Berlin
Luxemburger Straße 10

Tel: 030 4504 2364
Fax: 030 4505 2013
E-Mail: essv2010@beuth-hochschule.de

Important Dates

  • Abstract Submission Deadline (max. 1 page):
    1 May 2010
  • Notification of Acceptance:
    15 May 2010
  • Deadline for conference papers to be published in the proceedings:
    15 July 2010

Local Organizers

Hansjörg Mixdorff
Sascha Fagel
Lutz Leutelt


Back to Top

10-7 . (2010-09-15) 52nd International Symposium ELMAR-2010

52nd International Symposium ELMAR-2010
September 15-17, 2010, Zadar, Croatia
Paper submission deadline: March 15, 2010
http://www.elmar-zadar.org/

CALL FOR PAPERS
http://www.elmar-zadar.org/2010/call_for_papers/elmar2010_cfp07.pdf

TECHNICAL CO-SPONSORS
IEEE Region 8
IEEE Croatia Section
IEEE Croatia Section Chapter of the Signal Processing Society
IEEE Croatia Section Joint Chapter of the AP/MTT Societies
EURASIP - European Assoc. Signal, Speech and Image Processing

CONFERENCE PROCEEDINGS INDEXED BY
IEEE Xplore, INSPEC and SCOPUS

TOPICS
--> Image and Video Processing
--> Multimedia Communications
--> Speech and Audio Processing
--> Wireless Communications
--> Telecommunications
--> Antennas and Propagation
--> Navigation Systems
--> Ship Electronic Systems
--> Power Electronics and Automation
--> Naval Architecture
--> Sea Ecology
--> Special Session Proposals - a special session consists of 5-6 papers which should present a unifying theme from a diversity of viewpoints

KEYNOTE TALKS
* Prof. Lajos Hanzo, University of Southampton, UK: Telepresence, the 'World-Wide Wait' and 'Green' Radios...
* Dr. Michael M. Bronstein, Technion - Israel Institute of Technology, Haifa, ISRAEL: Non-rigid, non-rigid, non-rigid world
* Dr. Mikel M. Miller, AFRL Munitions Directorate, Eglin Air Force Base, Florida, USA: Got GPS? The Navigation Gap
* Dr. Panos Liatsis, City University London, UK: 3D reconstruction and stenosis quantification in CT angiograms

SUBMISSION
Papers accepted by two reviewers will be published in the conference proceedings, available at the conference and abstracted/indexed in the IEEE Xplore, INSPEC and SCOPUS databases. More info is available here: http://www.elmar-zadar.org/2010/paper_submission/

SCHEDULE OF IMPORTANT DATES
Deadline for submission of full papers: March 15, 2010
Notification of acceptance mailed out by: May 10, 2010
Submission of (final) camera-ready papers: May 20, 2010
Preliminary program available online by: June 14, 2010
Registration forms and payment deadline: June 21, 2010

GENERAL CO-CHAIRS
Ive Mustac, Tankerska plovidba, Zadar, Croatia
Branka Zovko-Cihlar, University of Zagreb, Croatia

PROGRAM CHAIR
Mislav Grgic, University of Zagreb, Croatia

CONTACT INFORMATION
Prof. Mislav Grgic
FER, Unska 3/XII
HR-10000 Zagreb, CROATIA
Telephone: + 385 1 6129 851
Fax: + 385 1 6129 717
E-mail: elmar2010 (at) fer.hr

For further information please visit: http://www.elmar-zadar.org/
Back to Top

10-8 . (2010-09-27) Summer School CPMSP2 - 2010 Cognitive and Physical Models of Speech Production, Speech Perception and Production-Perception Interaction

Announcement
Summer School CPMSP2 - 2010
“Cognitive and Physical Models of Speech Production, Speech Perception and Production-Perception Interaction”
Part III: Planning and Dynamics
Berlin – September 27-October 1, 2010
After two successful editions in Lubmin (2004) and Autrans (2007), we are pleased to announce the 3rd International CPMSP2 Summer School on “Cognitive and Physical Models of Speech Production, Speech Perception and Production-Perception Interaction”. The summer school will be held in Berlin from the 27th of September to the 1st of October 2010.
The focus of this summer school will be the planning of speech sequences and its interactions with dynamical properties of speech production and speech perception. It will be organized around 9 tutorials addressing related issues from the linguistic, neurophysiological, motor control, and perception perspectives. The following invited speakers have accepted to present these tutorials:
• Rachid Ridouane – LPP – Paris: Units in speech planning
• Pierre Hallé – LPP – Paris: Units in speech acquisition
• Noël Nguyen – LPL – Aix-en-Provence: The dynamical approach to speech perception
• Linda Wheeldon – University of Birmingham: Phonological monitoring in the production of spoken sequences
• Jelena Krivokapic – Yale University: Prosodic planning in spoken sequences
• Paul Cisek – Département de Physiologie – Université de Montréal: Human movement planning and control
• Marianne Pouplier – IPS – München: Dynamical coupling of intergestural planning
• Pascal Perrier – Gipsa-lab – Grenoble: Gesture planning integrating dynamical constraints and related issues in speech motor control
• Peter Dominey – SCBRI – Lyon: Sensorimotor interactions and the construction of speech sequences
The summer school is open to all students, postdocs and researchers. Its aim is to provide a platform for interchanges between students, junior and senior researchers by means of poster presentations, discussion forums and working groups, related to the topics addressed in the tutorials. Further information and conditions for participation can be found at http://summerschool2010.danielpape.info/
Dates
Applications with one-page abstract: April 6, 2010
Notification of Acceptance: May 31, 2010
Conference: September 27 – October 1, 2010
We are looking forward to seeing you there!
Organisers:
Susanne Fuchs (ZAS Berlin)
Melanie Weirich (ZAS Berlin)
Daniel Pape (IEETA, University of Aveiro, Aveiro)
Pascal Perrier (GIPSA-lab, Grenoble INP, Grenoble)
Contact: berlin.dynamics@gmail.com
Back to Top

10-9 . (2010-09-27) Intern Conf on Latent semantic variable analysis and signal separation- St Malo F

 LVA/ICA 2010
       September 27-30, 2010 - Saint-Malo, France
           9th International Conference on
     Latent Variable Analysis and Signal Separation

        formerly the International Conference on
  Independent Component Analysis and Signal Separation

                http://lva2010.inria.fr/

    ----------------------------------------------

Ten years after the first workshop on Independent Component Analysis in
Aussois, the series of ICA conferences has shown the liveliness of the
community of theoreticians and practitioners working in this field.
While ICA and blind signal separation have become mainstream topics, new
approaches have emerged to solve problems involving signal mixtures or
various other types of latent variables: semi-blind models, matrix
factorization using Sparse Component Analysis (SCA), Non-negative Matrix
Factorization (NMF), Probabilistic Latent Semantic Indexing (PLSI), but
also tensor decompositions, Independent Vector Analysis (IVA),
Independent Subspace Analysis (ISA), ...
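
For readers new to the area, one of the simplest of the latent variable models named above can be written down in a few lines: the sketch below runs Lee-Seung multiplicative updates for Non-negative Matrix Factorization (NMF) on a random non-negative matrix standing in for, say, a magnitude spectrogram; the data and sizes are purely illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    V = rng.random((64, 200))      # non-negative data, e.g. a spectrogram
    k = 8                          # number of latent components
    W = rng.random((64, k))        # basis vectors
    H = rng.random((k, 200))       # activations
    eps = 1e-9                     # guard against division by zero

    for _ in range(200):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update basis vectors

    print(np.linalg.norm(V - W @ H))  # reconstruction error has decreased
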

The 9th edition of the conference, renamed LVA/ICA to reflect this
evolution towards more general Latent Variable Analysis problems in
signal processing, will offer an interdisciplinary forum for scientists
and engineers to experience renewed theoretical surprises and face
real-world problems.

In addition to contributed papers (oral and poster presentations), the
meeting will feature keynote talks by leading researchers:

    Pierre Comon, University of Nice, France
    Stephane Mallat, Ecole Polytechnique, France
    Mark Girolami, University of Glasgow, UK
    Arie Yeredor, Tel-Aviv University, Israel

as well as a community-based evaluation campaign (SiSEC 2010), a panel
discussion session, and a special late-breaking / demo session.

    ----------------------------------------------

VENUE

Saint-Malo (http://www.saint-malo-tourisme.com/index.jsp?lang=en), the
corsair city, is an ancient city and picturesque sea resort located in
Brittany, in the north-west of France.
Chateaubriand, Surcouf, Jacques Cartier... from writers to privateers
and sailors, many were the good men who hailed from Saint-Malo. As if in
honour of their pride and independence, the forts and ramparts of the
corsair city face the sea, adding to the city's charm and its
exceptional setting. To visitors and event-goers, the city offers the
beauty of its maritime views and the wealth of its historical heritage.
A city of 52,000 inhabitants that is lively all year round, Saint-Malo's
heart beats to the rhythm of the major events it hosts, festivals such
as the Etonnants Voyageurs or internationally renowned regattas such as
the Route du Rhum.

    ----------------------------------------------

IMPORTANT DATES

• April 7, 2010: Paper submission deadline
• June 15, 2010: Notification of acceptance
• June 30, 2010: Final paper due
• July 31, 2010: Late-breaking / demo / SiSEC abstract submission deadline

Detailed submission instructions will shortly be made available on the
conference website http://lva2010.inria.fr/.

    ----------------------------------------------

CALL FOR PAPERS

Prospective authors are invited to submit papers in all areas of latent
variable analysis and signal separation, including but not limited to:
• Theoretical frameworks: probabilistic, geometric &
biologically-inspired modeling; flat, hierarchical & dynamic structures;
sparse coding; kernel methods; neural networks
• Models: linear & nonlinear models; continuous & discrete latent
variables; convolutive & noisy mixtures; linear & quadratic
time-frequency representations
• Algorithms: blind & semi-blind estimation; identification &
convergence conditions; local & evolutionary optimization; computational
complexity; adaptation & modularity
• Speech and audio data: source separation; denoising & dereverberation;
Computational Auditory Scene Analysis (CASA); Automatic Speech
Recognition (ASR)
• Images: segmentation; fusion; texture analysis; color imaging; coding;
scene analysis
• Biomedical data: functional imaging; BCI; genomic data analysis;
systems biology
• Unsolved and emerging problems: causality detection; feature
selection; data mining; control; psychology; social networks; finance;
artificial intelligence; real-time applications
• Resources: software; databases; objective & subjective evaluation
procedures

Papers must be original and must not be already published nor under
review elsewhere. Papers linked to a submission to SiSEC 2010 are highly
welcome. The proceedings will be published in Springer-Verlag’s Lecture
Notes in Computer Science (LNCS) Series.

    ----------------------------------------------

SPECIAL ISSUE AND BEST STUDENT PAPER AWARD

Extended versions of selected papers will be considered for a special
issue of a journal.

The Best Student Paper Award will distinguish the work of a PhD student
with original scientific contributions and the quality of his/her
presentation at LVA/ICA 2010. Eligible papers must be first-authored and
presented by the PhD student during the Conference. Candidates will be
asked to notify their participation on the submission form. A prize of
400 € offered by the Fondation Metivier will be awarded to the winner.

    ----------------------------------------------

LATE-BREAKING / DEMO / SiSEC SESSION

A special session will be dedicated to the presentation of:
• early results and ideas that are not yet fully formalized and evaluated
• software and data of interest to the community, with focus on open
source resources
• signal separation systems evaluated in SiSEC 2010 but not associated
with a full paper

Presenters are invited to submit a non-reviewed abstract, which will be
included in the conference program but not published in the proceedings.


We look forward to receiving your technical contribution and meeting you
in Saint-Malo!


Remi Gribonval and Emmanuel Vincent
General Chairs

Vincent Vigneron and Eric Moreau
Technical Chairs


   

Back to Top

10-10 . (2010-10-24) 10th IEEE International Conference on Signal Processing , Beijing, China

The 10th IEEE International Conference on Signal Processing
Beijing, China, October 24-28, 2010
http://icsp10.bjtu.edu.cn

Important Deadline: Submission of Papers: June 15, 2010

The International Conference on Signal Processing (ICSP), sponsored by the IEEE Beijing Section, is the premier forum for the presentation of technological advances and research results in the fields of theoretical, experimental, and applied signal processing. ICSP 2010 will bring together leading engineers and scientists in signal processing from around the world. Research frontiers in fields ranging from traditional signal processing applications to evolving multimedia and video technologies are regularly advanced by results first reported in ICSP technical sessions. Topics include, but are not limited to:

A. Digital Signal Processing (DSP)
B. Spectrum Estimation & Modeling
C. TF Spectrum Analysis & Wavelet
D. Higher Order Spectral Analysis
E. Adaptive Filtering & SP
F. Array Signal Processing
G. Hardware Implementation for Signal Processing
H. Speech and Audio Coding
I. Speech Synthesis & Recognition
J. Image Processing & Understanding
K. PDE for Image Processing
L. Video Compression & Streaming
M. Computer Vision & VR
N. Multimedia & Human-computer Interaction
O. Statistical Learning & Pattern Recognition
P. AI & Neural Networks
Q. Communication Signal Processing
R. SP for Internet and Wireless Communications
S. Biometrics & Authentication
T. SP for Bio-medical & Cognitive Science
U. SP for Bio-informatics
V. Signal Processing for Security
W. Radar Signal Processing
X. Sonar Signal Processing and Localization
Y. SP for Sensor Networks
Z. Applications & Others

*Attention* With the support of numerous reviewers and authors, ICSP has been held for 20 years. At this session, as a celebration for ICSP, we will hold celebration events and awards, including the Outstanding Paper Award and the Outstanding Student Paper Award. For details, please visit http://icsp10.bjtu.edu.cn.

*Proceedings* The proceedings, with catalog numbers from the IEEE and the Library of Congress, will be published prior to the conference in both hardcopy and CD-ROM, and distributed to all registered participants at the conference. The proceedings will be indexed by EI.

*Paper Submission* Prospective authors are invited to submit full-length, four-page, double-column papers, including figures and references, to the ICSP Technical Committee by June 15, 2010 at http://icsp10.bjtu.edu.cn. For questions about paper submission, please contact the technical program secretaries, Ms. TANG Xiaofang and Dr. AN Gaoyun, at bfxxstxf@bjtu.edu.cn and gyan@bjtu.edu.cn. For more information, please visit the ICSP 2010 web site at: http://icsp10.bjtu.edu.cn.

Back to Top

10-11 . (2010-10-29) Conference on Phonetic Universals Max Planck Institute

We invite papers from linguists, as well as from scholars from related
disciplines, who are concerned with phonetic universals.

http://www.eva.mpg.de/lingua/conference/10-PhoneticUniversals/index.html

Back to Top

10-12 . (2010-10-29) CfP: ACM Multimedia 2010 Workshop on Searching Spontaneous Conversational Speech (SSCS 2010)

----------------------------------------------------------------------
Extended paper submission deadline: June 14, 2010
----------------------------------------------------------------------
CfP: ACM Multimedia 2010 Workshop on
Searching Spontaneous Conversational Speech (SSCS 2010)
-----------------------------------------------------------------------
Workshop held on 29 October 2010, in Firenze, Italy
in conjunction with ACM Multimedia 2010

Website: http://www.searchingspeech.org/

The SSCS 2010 workshop is devoted to presentation and discussion of recent research results concerning advances and innovation in the area of spoken content retrieval and the area of multimedia search that makes use of automatic speech recognition technology.

Spoken audio is a valuable source of semantic information, and speech analysis techniques, such as speech recognition, hold high potential to improve information retrieval and multimedia search. Nonetheless, speech technology remains underexploited by multimedia systems, in particular, by those providing access to multimedia content containing spoken audio. Early success in the area of broadcast news retrieval has yet to be extended to application scenarios in which the spoken audio is unscripted, unplanned and highly variable with respect to speaker and style characteristics. The SSCS 2010 workshop is concerned with a wide variety of challenging spoken audio domains, including: lectures, meetings, interviews, debates, conversational broadcast (e.g., talkshows), podcasts, call center recordings, cultural heritage archives, social video on the Web and spoken natural language queries. As speech steadily moves closer to rivaling text as a medium for access and storage of information, the need for technologies that can effectively make use of spontaneous conversational speech to support search becomes more pressing.

In order to move the use of speech and spoken content in retrieval applications and multimedia systems beyond the current state of the art, sustained collaboration of researchers in the areas of speech recognition, audio processing, multimedia analysis and information retrieval is necessary. Motivated by the aim of providing a forum where these disciplines can engage in productive interaction and exchange, Searching Spontaneous Conversational Speech (SSCS) workshops were held in conjunction with SIGIR 2007, SIGIR 2008 and ACM Multimedia 2009. The SSCS workshop series continues at ACM Multimedia 2010 with a focus on research that strives to move retrieval systems beyond conventional queries and beyond the indexing techniques used in traditional mono-modal settings or text-based applications.

We welcome contributions on a range of trans-disciplinary research issues related to these research challenges, including:

- Information Retrieval techniques in the speech domain (e.g., applied to speech recognition lattices)
- Multimodal search techniques exploiting speech transcripts (audio/speech/video fusion techniques including re-ranking)
- Search effectiveness (e.g., evidence combination, query/document expansion)
- Exploitation of audio analysis (e.g., speaker’s emotional state, speaker characteristics, speaking style)
- Integration of higher level semantics, including topic segmentation and cross-modal concept detection
- Spoken natural language queries
- Large-scale speech indexing approaches (e.g., collection size, search speed)
- Multilingual settings (e.g., multilingual collections, cross-language access)
- Advanced interfaces for results display and playback of multimedia with a speech track
- Exploiting user contributed information, including tags, rating and user community structure
- Affordable, light-weight solutions for small collections, i.e., for the long tail

Contributions for oral presentations (short papers of 4 pages or long papers of 6 pages) and demonstration papers (4 pages) will be accepted. The submission deadline is 10 June 2010. For further information see the website: http://www.searchingspeech.org/

At this time, we are also pre-announcing a special issue of ACM Transactions on Information Systems on the topic of searching spontaneous conversational speech. The special issue is based on the SSCS workshop series, but will involve a separate call for papers. We will especially encourage the authors of the best papers from SSCS 2010 to submit to the special issue call.

SSCS 2010 Organizers
Martha Larson, Delft University of Technology, Netherlands
Roeland Ordelman, Sound & Vision and Uni. of Twente, Netherlands
Florian Metze, Carnegie Mellon University, USA
Franciska de Jong, University of Twente, Netherlands
Wessel Kraaij, TNO and Radboud University, Netherlands
_______________________________________________ 

Back to Top

10-13 . (2010-10-29) CfP Multimedia in Forensics, Security and Intelligence (MiFor 2010) - Firenze, Italy

CALL FOR PAPERS
 
The ACM Workshop on Multimedia in Forensics, Security and Intelligence (MiFor 2010), http://madm.dfki.de/mifor2010/MiFor2010.html
 
in conjunction with the 2010 ACM Multimedia (ACM-MM), http://www.acmmm10.org/
 
Firenze, Italy
October 29, 2010
 

With the proliferation of multimedia data on the web, surveillance cameras in cities, and mobile phones in everyday life, we see an enormous growth in multimedia data that needs to be secured to prevent illegal use, analyzed by forensic investigators to detect and reconstruct illegal activities, or used as a source of intelligence. The sheer volume of such datasets makes manual inspection of all data impossible. In recent years the multimedia community has developed new and exciting solutions for the management of large collections of video footage, images, audio and other multimedia content: knowledge extraction and categorization, pattern recognition, indexing and retrieval, searching, browsing and visualization, and modeling and simulation in various domains. Due to the inherent uncertainty and complexity of the data appearing in criminal cases, however, applying those techniques is not straightforward. The time is ripe to tailor these results for forensics, security and intelligence.

The workshop topics include (but are not limited to) the following:

Forensics

  • Forgery detection and identification, detection of steganography
  • Device characterization and identification
  • Media forensic applications and attack analysis
  • Crime scene reconstruction and annotation
  • Forensic investigation of surveillance data, video analytics
  • Multimodal analysis of surveillance data
  • Multimodal analysis of biometric traces
  • Authenticity of multimedia data

 

Security

  • Digital/encrypted domain watermarking for multimedia
  • Signal processing in the encrypted domain
  • Multimedia content protection and violation detection
  • Digital rights management
  • Robust hashing and content fingerprinting
  • Cryptography for content protection

 

Intelligence

  • Searching for illicit content in multimedia data
  • Image, video, and text linking
  • Multimedia near duplicate detection and retrieval
  • Multimedia interfaces, visual analytics
  • Identity detection
  • Scalable multimedia search

 

Important Dates (tentative)

  • Paper Submission: June 10, 2010
  • Notification of Acceptance: July 10, 2010
  • Camera-ready Version: July 20, 2010
  • Workshop Date: October 29, 2010

 Paper Submissions and Author Guidelines

Papers submissions for MiFor 2010 should follow the submission format and guidelines for regular ACM Multimedia 2010 papers, and be up to 6 pages in length. Guidelines for preparing submissions can be found at: http://www.acmmm10.org/authors/submission/full-and-short-papers/.

Submitted papers will undergo a peer review process by at least two reviewers.

Accepted papers for oral and poster presentations at the workshop will be included in the workshop's proceedings, which will be published together with the proceedings of the ACM Multimedia Conference 2010. In addition, we plan to realize a special issue or an edited volume by asking the authors of the best papers to submit a substantially extended version of their workshop papers.

Additional information is available at the workshop website:
http://madm.dfki.de/mifor2010/MiFor2010.html
 
 
Workshop Chairs:
Sebastiano Battiato (University of Catania, Italy)
Sabu Emmanuel (Nanyang Technological University, Singapore)
Adrian Ulges (German Research Center for Artificial Intelligence (DFKI), Germany)
Marcel Worring (University of Amsterdam, The Netherlands)
Back to Top

10-14 . (2010-11-08) 12th International Conference on Multimodal Interfaces

Call for Papers: ICMI-MLMI 2010

12th International Conference on Multimodal Interfaces
and
7th Workshop on Machine Learning for Multimodal Interaction

Beijing, China, November 8-12, 2010

http://www.acm.org/icmi/2010/

The Twelfth International Conference on Multimodal Interfaces and the Seventh Workshop on Machine Learning for Multimodal Interaction will be held jointly in Beijing, China during November 8-12, 2010. The primary aim of ICMI-MLMI 2010 is to further scientific research within the broad field of multimodal interaction, methods, and systems, focusing on major trends and challenges, and working towards identifying a roadmap for future research and commercial success. The conference will continue to feature a single track with keynote speakers, technical paper presentations, poster sessions, a doctoral consortium, and demonstrations of state-of-the-art multimodal systems and concepts. The conference will be followed by workshops.

Topics of interest include, but are not limited to:
- Multimodal input and output interfaces
- Multimodal human behavior analysis
- Machine learning methods for multimodal processing
- Fusion techniques and hybrid architectures
- Processing of language and action patterns
- Gaze and vision-based interfaces
- Speech and conversational interfaces
- Pen-based interfaces
- Haptic interfaces
- Brain-computer interfaces
- Cognitive modeling of users
- Multi-biometric interfaces
- Multimodal-multisensor interfaces
- Interfaces for attentive and intelligent environments
- Mobile, tangible and virtual/augmented multimodal interfaces
- Distributed/collaborative multimodal interfaces
- Tools and system infrastructure issues for designing multimodal interfaces
- Evaluation of multimodal interfaces
- AI techniques and adaptive multimodal interfaces

Paper Submission
There are two different submission categories: regular paper and short paper. The page limit is 8 pages for regular papers and 4 pages for short papers.

Demo Submission
Proposals for demos shall be submitted to the demo chairs electronically. A two-page description with photographs of the demo is required.

Organizing Committee

General Chairs:
Wen Gao, Peking University
Chin-Hui Lee, Georgia Tech
Jie Yang, Carnegie Mellon University

Program Chairs:
Xilin Chen, Chinese Academy of Sciences
Maxine Eskenazi, Carnegie Mellon University
Zhengyou Zhang, Microsoft Research

Important Dates
Workshop proposals due: April 1, 2010
Workshop proposal acceptance notification: May 1, 2010
Paper submission: May 20, 2010
Author notification: July 20, 2010
Camera-ready due: August 20, 2010
Conference: Nov. 8-10, 2010
Workshops: Nov. 11-12, 2010

 



Back to Top

10-15 . (2010-11-11) 3rd Workshop on Child, Computer and Interaction (WOCCI 2010)

The 3rd Workshop on Child, Computer and Interaction WOCCI 2010 (www.wocci.org) will be held in Beijing, China, on November 11-12, 2010. The Workshop is a satellite event of the Twelfth International Conference on Multimodal Interfaces (ICMI), which is held this year jointly with the Workshop on Machine Learning and Multimodal Interaction, ICMI-MLMI 2010 (http://www.acm.org/icmi/2010/index.html), taking place in the same venue on November 8-10, 2010. This 2-day session follows the first two workshops of the WOCCI series, which were held in Crete in October 2008 and Boston in November 2009, respectively.

Two page abstract submission: July 1, 2010

Notification of acceptance: July 20, 2010

Final paper (4-8 pages) submission and authors' registration: August 20, 2010

The Workshop aims at bringing together researchers and practitioners from universities and industry working in all aspects of multimodal child-machine interaction with particular emphasis on, but not limited to, speech interactive interfaces.

Children are special both at the acoustic/linguistic level and at the interaction level. The Workshop provides a unique opportunity for bringing together different research communities to demonstrate various state-of-the-art components that can make up the next generation of child-centred computer interaction. These technological advances are increasingly necessary in a world where education and health pose growing challenges to the core wellbeing of our societies. Notable examples are remedial treatments for children with or without disabilities, and first and second language learning. The Workshop should serve for presenting recent advancements in all core technologies for multimodal child-machine interaction as well as experimental systems and prototypes.

 

Back to Top

10-16 . (2010-11-15) Tutorial and Special Session on Forensic Voice Comparison and Forensic Acoustics

CALL FOR PAPERS

Tutorial and Special Session on Forensic Voice Comparison and Forensic Acoustics at 2nd Pan-American/Iberian Meeting on Acoustics, Cancún, Mexico, 15–19 November 2010.
http://cancun2010.forensic-voice-comparison.net/

The official call for papers for the Pan-Am/Iberian meeting is now out and the deadline for submissions is 1 June 2010.
http://asa.aip.org/meetings.html


In February 2009 the National Research Council (NRC) Report to Congress on Strengthening Forensic Science in the United States found that:

“[S]ome forensic disciplines are supported by little rigorous systematic research to validate the discipline’s basic premises and techniques. There is no evident reason why such research cannot be conducted” (p. 22).

“The development of scientific research, training, technology, and databases associated with DNA analysis have resulted from substantial and steady federal support for both academic research and programs employing techniques for DNA analysis. Similar support must be given to all credible forensic science disciplines if they are to achieve the degrees of reliability needed to serve the goals of justice.” (p. 13)

Over the last decade, a small number of researchers (principally in Australia, Spain, and Switzerland) have been working on developing demonstrably valid and reliable forensic voice comparison with evidence evaluated using the same framework as is applied to the evaluation of DNA evidence.
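
Schematically, that framework reduces to reporting a likelihood ratio for the observed evidence under the competing same-speaker and different-speaker hypotheses. The toy sketch below (hypothetical score distributions, not any validated forensic system) shows the arithmetic:

    from scipy.stats import norm

    # Suppose a scalar similarity score E is computed between the questioned
    # and the known recording, and scores are modelled as Gaussian under each
    # hypothesis (parameters would come from reference data, invented here).
    same_speaker = norm(loc=8.0, scale=1.5)   # same-speaker score model
    diff_speaker = norm(loc=3.0, scale=2.0)   # different-speaker score model

    E = 7.2                                   # observed score in this case
    LR = same_speaker.pdf(E) / diff_speaker.pdf(E)
    print(f"likelihood ratio = {LR:.1f}")
    # LR > 1 supports the same-speaker hypothesis; the trier of fact
    # combines it with prior odds, as with DNA evidence.
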

Meanwhile, there has been comparatively little interest in this field of research in the Americas.

The NRC report gives new impetus to research on forensic voice comparison and holds out hope of new funding opportunities in this area.

The 2nd Pan-American/Iberian Meeting on Acoustics provides an excellent opportunity to bring together researchers from Iberia and other parts of the world with researchers from the Americas to help foster research in this area in the Americas.

It also provides a venue for an exchange of ideas between researchers working on acoustic-phonetic and signal-processing approaches to forensic voice comparison. 

Back to Top

10-17 . (2010-11-29) 2010 Int. Symposium on Chinese Spoken Language Processing (ISCSLP 2010) Taiwan

CALL FOR PAPERS

2010 International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)
November 29 – December 3, 2010  -  Tainan and Sun Moon Lake, Taiwan
 

===========================================================
 
ISCSLP is the flagship conference of ISCA SIG-CSLP (International Speech Communication Association, Special Interest Group on Chinese Spoken Language Processing). ISCSLP 2010 will be held from November 29 to December 3, 2010 in Tainan and Sun Moon Lake, Taiwan, hosted by National Cheng Kung University.
 
Tainan, located in south-western Taiwan, is a city of cultural heritage, with many historical places and heritage sites. It is also a modern city, with shopping centers, department stores, and night markets, so a visit offers a wonderful opportunity to experience Taiwanese culture. Sun Moon Lake, the largest lake in Taiwan, located in the center of the island, is a beautiful alpine lake, its eastern part rounded like the sun and its western side shaped like a crescent moon. Its crystalline, emerald-green water reflects the hills and mountains surrounding it on all sides, and its natural beauty is further enhanced by numerous cultural and historical sites.

 
We invite your participation in this premier conference, where the language from ancient civilizations embraces modern computing technology. ISCSLP 2010 will feature world-renowned plenary speakers, tutorials, exhibits, and a number of lecture and poster sessions on the following topics:
Speech Production and Perception
Phonetics and Phonology
Speech Analysis
Speech Coding
Speech Enhancement
Speech Recognition
Speech Synthesis
Language Modeling and Spoken Language Understanding
Spoken Dialog Systems
Spoken Language Translation
Speaker and Language Recognition
Computer-Assisted Language Learning
Indexing, Retrieval and Authoring of Speech Signals
Multi-Modal Interface including Spoken Language Processing
Spoken Language Resources and Technology Evaluation
Applications of Spoken Language Processing Technology 
 
Official Language & Publication
The official language of ISCSLP is English.
All accepted papers will be included in IEEE Xplore and indexed by EI Compendex.
 
Paper Submission
Authors are invited to submit original, unpublished work in English.
Papers should be submitted via http://conf.ncku.edu.tw/iscslp2010/paper.htm
Each submission will be reviewed by two or more reviewers.
At least one author of each paper is required to register. 
 
Important Dates
Full paper submission by July 15, 2010
Notification of acceptance by Aug. 30, 2010
Camera ready papers by Sep. 13, 2010
Registration to cover an accepted paper by Oct. 13, 2010

Back to Top

10-18 . (2010-12-02) CfP 7th International Workshop on Spoken Language Translation (IWSLT 2010)

7th International Workshop on Spoken Language Translation

                             (IWSLT 2010)

 

                  First Call for Participants / Papers

 

                          December 2-3, 2010

                             Paris, France

 

                       http://iwslt2010.fbk.eu

 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

 

The International Workshop on Spoken Language Translation (IWSLT)

is a yearly scientific workshop, associated with an open evaluation

campaign on spoken language translation, where both scientific papers

and system descriptions are presented. The 7th International Workshop

on Spoken Language Translation will take place in Paris, France

on 2-3 December 2010.

 

=== Scientific Papers

 

The IWSLT invites submissions of scientific papers to be published

in the workshop proceedings and presented in dedicated technical sessions

of the workshop, either in oral or poster form. The workshop welcomes high

quality contributions covering theoretical and practical issues in the field

of machine translation (MT), in general, and spoken language translation (SLT),

including Automatic Speech Recognition (ASR), Text-to-Speech Synthesis (TTS)

and MT, in particular. Possible topics include, but are not limited to:

 

  - Speech and text MT

  - Integration of ASR and MT

  - MT and SLT approaches

  - MT and SLT evaluation

  - Language resources for MT and SLT

  - Open source software for MT and SLT

  - Pivot-language-based MT

  - Adaptation in MT

  - Simultaneous speech translation

  - Efficiency in MT

  - Stream-based algorithms for MT

  - Multilingual ASR and TTS

 

Submitted manuscripts will be peer-reviewed by three members of the workshop

program committee. Authors of accepted papers are requested to present their

paper at the workshop.

 

=== Evaluation Campaign

 

IWSLT evaluations are not organized for the sake of competition, but their goal

is to foster cooperative work and scientific exchange. In this respect, IWSLT

proposes challenging research tasks and an open experimental infrastructure

for the scientific community working on spoken and written language translation.

This year, the IWSLT evaluation campaign will offer three tasks:

 

  - public speeches (TALK) on a variety of topics, from English to French (NEW CHALLENGE),

  - spoken dialogues (DIALOG) in travel situations, between Chinese and English,

  - traveling expressions (BTEC), from Arabic, Turkish, and French to English.

 

For each task, monolingual and bilingual language resources will be provided to

participants in order to train their translation systems, as well as sets of manual

and automatic speech transcripts (with n-best and lattices) and reference translations,

allowing researchers working only on written language translation to also participate.

Moreover, blind test sets will be released and all translation outputs produced

by the participants will be evaluated using several automatic translation quality

metrics. Human assessment will be carried out for the translation of spoken dialogues

and basic travel expressions.
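
As a brief aside for readers unfamiliar with automatic MT evaluation: this call does not name the metrics to be used, but BLEU is the most common example of such a metric. The minimal Python sketch below, using NLTK's corpus_bleu and hypothetical toy data, only illustrates the idea of scoring system output against reference translations; the campaign's actual scoring tools may differ.

    # Corpus-level BLEU over tokenized MT output (illustrative toy data only).
    from nltk.translate.bleu_score import corpus_bleu

    # One tokenized hypothesis per test sentence.
    hypotheses = [["the", "talk", "starts", "at", "nine"]]

    # Each sentence may have several reference translations.
    references = [[["the", "talk", "starts", "at", "ten"],
                   ["the", "lecture", "begins", "at", "nine"]]]

    # Default weights give equal importance to 1- to 4-gram precision.
    score = corpus_bleu(references, hypotheses)
    print("BLEU: %.3f" % score)  # about 0.760 for this toy pair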

 

The goal of this year's new challenge (translation of public speeches) will be

to establish reference baselines and appropriate evaluation protocols for future

evaluations. As a consequence, although an evaluation server will be set up to compute

several translation accuracy metrics, there will be no official ranking of participants

published by the organizers for this task.

 

Each participant in the evaluation campaign is requested to submit a paper describing

the MT system, the utilized resources, and results using the provided test data.

Contrastive run submissions using only the bilingual resources provided by IWSLT

as well as investigations of the contribution of each utilized resource are highly

appreciated. Results feedback will be provided by the organizers a few days after

the run submissions. Finally, all participants are requested to present their papers

describing their MT systems at the workshop.

 

=== Important Dates

 

Evaluation Campaign:

 

  - Training corpus release             28 May 2010

  - Test corpus release                 23 August 2010

  - Run submissions due (DIALOG, BTEC)  6 September 2010

  - Run submissions due (TALK)          30 September 2010

  - MT system description due           14 October 2010

  - Notification of acceptance          29 October 2010

  - Camera-ready paper due              10 November 2010

 

Scientific Papers:

 

  - Paper submission due              4 September 2010

  - Notification of acceptance        16 October 2010

  - Camera-ready paper due            10 November 2010

 

=== Organizers

 

IWSLT Steering Committee:

 

  - Alex Waibel (CMU, USA / Karlsruhe Institute of Technology (KIT), Germany)

  - Marcello Federico (FBK-irst, Italy)

  - Satoshi Nakamura (NICT, Japan)

 

Chairs:

 

  * Workshop:

 

    - Alex Waibel (CMU, USA / KIT, Germany)

    - Joseph Mariani (LIMSI-CNRS & IMMI, France)

 

  * Evaluation Committee:

 

    - Michael Paul (NICT, Japan)

    - Marcello Federico (FBK-irst, Italy)

 

  * Program Committee:

 

    - Ian Lane (CMU, USA)

    - François Yvon (LIMSI-CNRS/U. Paris Sud 11, France)

 

Local Organizing Committee:

 

  - Martine Garnier-Rizet (IMMI, Chair)

  - Lynn Barreteau (IMMI)

  - Joseph Mariani (LIMSI-CNRS & IMMI)

  - Aurélien Max (LIMSI-CNRS/U. Paris Sud 11)

  - Guillaume Wisniewski (LIMSI-CNRS/U. Paris Sud 11)

 

Program Committee:

 

  - Alexandre Allauzen (LIMSI-CNRS/U. Paris Sud 11, France)

  - Laurent Besacier (LIG, France)

  - Arianna Bisazza (FBK-irst, Italy)

  - Francisco Casacuberta (ITI-UPV, Spain)

  - Boxing Chen (NRC, Canada)

  - Mehmet Uğur Doğan (Tubitak-Uekae, Turkey)

  - Matthias Eck (Mobile Technologies, USA)

  - Philipp Koehn (Univ. Edinburgh, UK)

  - Philippe Langlais (Univ. Montreal, Canada)

  - Geunbae Lee (Postech, Korea)

  - Yves Lepage (GREYC, France)

  - Haizhou Li (I2R, Singapore)

  - José B. Mariño (TALP-UPC, Spain)

  - Coskun Mermer (Tubitak-Uekae, Turkey)

  - Hermann Ney (RWTH, Germany)

  - Hwee Tou Ng (NUS, Singapore)

  - Matthias Paulik (CMU, USA)

  - Holger Schwenk (LIUM, France)

  - Wade Shen (MIT-LL, USA)

  - Sebastian Stüker (KIT, Germany)

  - Eiichiro Sumita (NICT, Japan)

  - Hajime Tsukada (NTT, Japan)

  - Haifeng Wang (Baidu, China)

  - Andy Way (DCU, Ireland)

  - Joy Zhang (CMU, USA)

  - Chengqing Zong (CASIA, China)

 

For all information, please visit the IWSLT 2010 Web site: http://iwslt2010.fbk.eu

Back to Top

10-19 . (2010-12-12) IEEE Workshop on Spoken Language Technology SLT 2010, Berkeley CA

IEEE Workshop on Spoken Language Technology (SLT 2010)
December 12-15, 2010, Berkeley, CA
www.slt2010.org

CALL FOR PAPERS

The Third IEEE Spoken Language Technology (SLT) workshop will be held from December 12 to December 15, 2010 in Berkeley, CA. The goal of this workshop is to allow the spoken language processing community to share and present recent advances in various areas of spoken language technology.

WORKSHOP TOPICS:

* Spoken language understanding
* Spoken document summarization
* Machine translation for speech
* Spoken language based systems
* Spoken language generation
* Question answering from speech
* Human/Computer Interaction
* Educational/Healthcare applications
* Speech data mining
* Information extraction
* Spoken document retrieval
* Multimodal processing
* Spoken dialog systems
* Spoken language systems
* Spoken language databases
* Assistive technologies

SUBMISSIONS FOR THE TECHNICAL PROGRAM:

The workshop program will consist of tutorials, oral and poster presentations, and panel discussions. Prospective authors are invited to submit full-length papers via the SLT 2010 website, http://www.slt2010.org. All papers will be handled and reviewed electronically; the website provides further details.

IMPORTANT DATES:

Paper submission deadline: July 16, 2010
Paper acceptance/rejection: September 1, 2010
Workshop dates: December 12-15, 2010

ORGANIZATION COMMITTEE:

Organizing Chairs: Dilek Hakkani-Tür, ICSI; Mari Ostendorf, U. Washington
Finance Chair: Gokhan Tur, SRI International
Advisory Board: Mazin Gilbert, AT&T Labs - Research; Srinivas Bangalore, AT&T Labs - Research; Giuseppe Riccardi, U. Trento
Technical Chairs: Isabel Trancoso, INESC-ID, Portugal; Tim Paek, Microsoft Research
Demo Chairs: Alex Potamianos, Tech. U. of Crete; Mikko Kurimo, Helsinki U. of Tech.
Publicity Chairs: Bhuvana Ramabhadran, IBM Research; Benoit Favre, Univ. Le Mans
Panel Chairs: Sadaoki Furui, Tokyo Inst. of Tech.; Eric Fosler-Lussier, Ohio State U.
Publication Chair: Yang Liu, U. Texas, Dallas
Local Organizers: Dimitra Vergyri, SRI International; Murat Akbacak, SRI International; Sibel Yaman, ICSI; Arindam Mandal, SRI International
Europe Liaisons: Frederic Bechet, U. Avignon; Philipp Koehn, U. Edinburgh
Asia Liaisons: Helen Meng, C. U. Hong Kong; Gary Geunbae Lee, POSTECH



Back to Top

10-20 . (2010-12-13) The Second IEEE International Workshop on Content-Based Audio/Video Analysis for Novel TV Services.

The Second IEEE International Workshop on Content-Based Audio/Video Analysis for Novel TV Services (CBTV 2010)
December 13, 2010 - Taichung, Taiwan
Submission deadline: July 12, 2010
http://cbtv2010.inria.fr/

Following the success of the first edition of the workshop on Content-Based Audio/Video Analysis for Novel TV Services (CBTV), we are pleased to announce the second one in this series.

The objective of the workshop is twofold. First, it aims at highlighting the need for powerful and automatic audio and video content-based techniques in building novel TV services. The second objective is to bring together professionals and researchers and to present the recent advances in the field.

The workshop will be held in conjunction with the IEEE International Symposium on Multimedia 2010.
Back to Top

10-21 . (2010-12-14) CfP Thirteenth Australasian International Conference on Speech Science and Technology

SST2010: Thirteenth Australasian International Conference on Speech Science and Technology
Melbourne, Australia, 14-16 December 2010
http://www.assta.org/sst/2010/

Second Call for Papers
Call Deadline: 18 June 2010

ASSTA and La Trobe University are pleased to announce the Thirteenth Australasian International Conference on Speech Science and Technology (SST2010). The conference will be held at the La Trobe University City Campus, Melbourne.

*Paper submission guidelines and templates are now available from http://www.assta.org/sst/2010/

Conference Themes
Submissions are invited for oral presentations. Submissions should describe original contributions to spoken language, speech science and/or technology that will be of interest to an audience including scientists, engineers, linguists, psychologists, speech and language therapists, audiologists and other professionals.

Submissions are invited in all areas of speech science and technology, but particularly in the following areas:
•       Speech production
•       Acoustic phonetics
•       Acoustics of accent change
•       Phonetics and phonology of Australasian languages (OzPhon)
•       Phonetics and Phonology of Australian and New Zealand English (PANZE)
•       Speech prosody, emotional speech, voice
•       Music and speech processing
•       Applications of speech science and technology
•       Speech processing for forensic applications
•       Speech recognition and understanding
•       Speaker recognition and classification
•       Speech enhancement and noise cancellation
•       Pedagogical technologies for speech and singing
•       Corpus management and speech tools
•       Contributions of speech science and technology to audiology and speech language therapy

Plenary Speakers:

Professor D.R. Ladd
Linguistics and English Language, University of Edinburgh

Professor Hugh McDermott
Bionic Ear Institute and University of Melbourne

Professor Michael Robb        
Department of Communication Disorders, University of Canterbury

Key dates:

Paper submissions due: Friday 18 June 2010
Notification of acceptance: Friday 27 August 2010
Early-bird registration due: Friday 1 October 2010

Back to Top

10-22 . (2010-12-17) ACM DEV 2010: First ACM Annual Symposium on Computing for Development

*ACM DEV 2010: First ACM Annual Symposium on Computing for Development*

The First ACM Annual Symposium on Computing for Development (DEV 2010) will be co-located with ICTD 2010, and the focus of the symposium will be on new computing innovations for development. The scope of DEV 2010 is broad, covering a wide range of research areas within computer science with a direct focus on development. ACM DEV 2010 aims to bring together all CS researchers with an interest in computing for development. The deadline for paper submissions is July 10, 2010. We strongly encourage you to submit your best work here. The conference website is: http://dev2010.news.cs.nyu.edu

*Call for Papers*

DEV 2010 provides an international forum for research in the design and implementation of information and communication technologies (ICTs) for social and economic development. In particular, we focus on emerging contexts where conventional computing solutions are often inappropriate due to various contextual factors - including, but not limited to, cost, language, literacy, and the availability of power and bandwidth. Focusing on innovative technical solutions to these unique application, infrastructure and user challenges, DEV fosters exchange between computer scientists, engineers, and other scholars and practitioners interested in the use of ICTs for development.

DEV provides a high-quality, single-track forum for presenting results and discussing new ideas. We expect paper contributions from different existing sub-areas of Computer Science and Engineering with a direct relevance to development. Papers should describe original and previously unpublished research. Three criteria will be applied to judge papers: (a) relevance of the problem for development; (b) novelty of the technical solution; (c) evaluation of the solution, making a case for development-focused impact. All ACM DEV paper submissions should either provide or directly motivate a novel technical solution that has direct implications for development. Topics of interest include, but are not limited to:

Networks/Systems/Security/Architecture
* Low-cost wireless connectivity
* Intermittent networks and systems
* Power-efficient systems
* Low-cost computing devices
* Mobile systems and applications
* Security challenges in developing regions

HCI/Applications
* User interfaces for low-literacy populations
* Multi-lingual computing
* User interfaces for low-cost devices
* Participatory methods and user-centered design
* Accessibility for disabled populations in developing regions
* Design and evaluation of applications for health, microfinance, education, agriculture, entertainment

AI/NLP/Data Mining/Speech/Vision
* Machine learning techniques for large-scale data analysis in development contexts
* Adapting content and applications to local languages and education levels
* Understanding social relationships and information flows in disadvantaged societies
* Speech interfaces and speech recognition for low-resource languages
* Development of new AI-centric tools/solutions for development
* Computer vision challenges in development

We also welcome papers outside of these topics that address the DEV focus on computing innovations supporting social and economic development.
*Important Dates*
Registration deadline: July 3, 2010
Submission deadline: July 10, 2010
Paper acceptance: September 5, 2010
Final version: October 5, 2010
Conference: December 17-18, 2010

General Chair: Andrew Dearden, Sheffield Hallam University

PC Chairs: Tapan Parikh, UC Berkeley; Lakshminarayanan Subramanian, NYU

*Steering Committee*
Saman Amarasinghe, MIT
Gaetano Borriello, University of Washington
Eric Brewer, UC Berkeley
Deborah Estrin, UCLA
Margaret Martonosi, Princeton
Roni Rosenfeld, CMU
Kentaro Toyama, UC Berkeley

*Program Committee*
Muneeb Ali, Princeton, USA
Saman Amarasinghe, MIT, USA
Richard Anderson, Univ of Washington, USA
Ravin Balakrishnan, Univ of Toronto, Canada
Simone Barbosa, PUC-Rio, Brazil
Etienne Barnard, Meraka Institute, South Africa
Michael Best, Georgia Tech, USA
Gaetano Borriello, Univ of Washington, USA
Eric Brewer, UC Berkeley, USA
John Canny, UC Berkeley, USA
Ed Cutrell, MSR India, India
James Davis, UC Santa Cruz, USA
Andrew Dearden, Sheffield Hallam University, UK
Nathan Eagle, MIT & Santa Fe Institute, USA
Deborah Estrin, UCLA, USA
Neil Ferguson, Imperial College, UK
Beki Grinter, Georgia Tech, USA
Eric Horvitz, MSR Redmond, USA
Ravi Jain, Google, USA
Matt Jones, Swansea, UK
Matthew Kam, CMU, USA
Srinivasan Keshav, University of Waterloo, Canada
Zhengjie Liu, Dalian Maritime University, China
Gary Marsden, Univ of Cape Town, South Africa
Vanessa Frias-Martinez, Telefonica Research, Spain
Margaret Martonosi, Princeton, USA
Srini Narayanan, UC Berkeley, USA
Bonnie Nardi, UC Irvine, USA
Tapan Parikh, UC Berkeley, USA
Balaji Prabhakar, Stanford, USA
John Quinn, Makerere University, Uganda
Nitendra Rajput, IBM Research India, India
Bhaskaran Raman, IIT-Bombay, India
Roni Rosenfeld, CMU, USA
Umar Saif, LUMS, Pakistan
Lakshmi Subramanian, NYU, USA
Bill Thies, MSR India, India
Kentaro Toyama, UC Berkeley, USA
Terry Winograd, Stanford, USA
Back to Top

10-23 . (2011-05-19) Quatrièmes Journées de Phonétique Clinique, Strasbourg, France

Quatrièmes Journées de Phonétique Clinique (Fourth Clinical Phonetics Workshop)
IVèmes JPC
Strasbourg, France

 

The Fourth Clinical Phonetics Workshop (IVèmes Journées de Phonétique Clinique, JPC) will take place from 19 to 21 May 2011 in Strasbourg. It continues the series begun with the first, second and third clinical phonetics workshops, held in Paris in 2005, Grenoble in 2007 and Aix-en-Provence in 2009, respectively.

The workshop is organized by the Institut de Phonétique de Strasbourg (IPS), the research unit U.R. 1339 Linguistique, Langues et Parole (LiLPa) - Equipe Parole et Cognition, and the Maison Interuniversitaire des Sciences de l'Homme Alsace (MISHA).

The schedule, together with the submission and registration procedures, will be announced shortly.

Rudolph Sock
Institut de Phonétique de Strasbourg (IPS) & Composante Parole et Cognition (PC)
E.A. 1339 - Linguistique, Langues et Parole (LiLPa)
Université de Strasbourg
22, rue René Descartes
67084 Strasbourg cedex
Phone: +33 3 68 85 65 68
Fax: +33 3 68 85 65 69
http://misha1.u-strasbg.fr/IPS
Back to Top

10-24 . (2011-05-22) ICASSP 2011, Prague

ICASSP 2011

 

Prague hosts IEEE International Conference on Acoustics, Speech and

Signal Processing, ICASSP 2011. Prague Congress Centre, May 22-27, 2011.

 

ICASSP is one of the world's major conferences for signal processing,

bringing together over 2000 participants and experts from industry and

universities.

 

The conference features world-class speakers, tutorials, exhibits,

demos, and over 120 lecture and poster sessions on the following topics:

Signal Processing Theory and Methods, Machine Learning for Signal

Processing, Sensor Array and Multichannel Systems, Audio and Acoustic

Signal Processing, Speech and Language Processing, Signal Processing for

Communications and Networking, Image, Video, and Multidimensional Signal

Processing, Biomedical Imaging, Information Forensics and Security, and

Signal Processing Education.

 

Important deadlines

Special Session & Tutorial Proposals Due: September 1, 2010

Notification of Special Session & Tutorial Acceptance: October 6, 2010

Submission of Camera-Ready Papers: October 20, 2010

Notification of Paper Acceptance: January 17, 2011

Revised Paper Upload Deadline: February 20, 2011

Registration Deadline for Authors: March 13, 2011

 

More information can be found at http://www.icassp2011.com/

 

Back to Top