ISCApad number 109

July 7th, 2007


Dear Members,
ISCA is a worldwide association, and it is not always easy to keep all the consequences of that in mind. I was ready to wish you excellent summer holidays when an Australian colleague drew my attention to his wintertime! In any case, inhabitants of the northern hemisphere have now started the holiday season. I wish them an excellent rest, and I remind them not to extend their holidays beyond the last week of August, when we will all meet at Interspeech 2007 in Antwerp, Belgium.
I also draw your attention to the increased number of job openings advertised in ISCApad and on our website. Each month I contact the labs and companies to check whether the positions have been filled, and all offers published in this issue have been validated.
Do not hesitate to inform me about recent books devoted to speech science or technology: advertising books is a service offered to the community.
The ISCA board will be partly renewed at the next General Assembly at Interspeech 2007 in Antwerp. Please read carefully the information you received on the voting procedure, and please do participate if you want a strongly representative board.
My last recommendation is to read carefully the call for proposals for Interspeech 2011: be audacious and submit attractive proposals.

Christian Wellekens


  1. ISCA News
  2. SIGs' activities
  3. Courses, internships
  4. Books, databases, software
  5. Job openings
  6. Journals
  7. Future Interspeech Conferences
  8. Future ISCA Tutorial and Research Workshops (ITRW)
  9. Forthcoming Events supported (but not organized) by ISCA
  10. Future Speech Science and technology events


Reminder: ISCA Board Election 2007
All current ISCA members should now have received information on how to vote for members of the new board. If you have not received the election message, please contact the ISCA secretariat.
The ISCA Secretariat

Call for Proposals: Interspeech 2011
Individuals or organizations interested in organizing Interspeech 2011 should submit a brief preliminary proposal by 15 November 2007, including:
* The name and position of the proposed general chair and other principal organizers.
* The proposed period in September/October 2011 when the conference would be held.
* The institution assuming financial responsibility for the conference and any other cooperating institutions.
* The city and conference center proposed (with information on that center's capacity).
* The commercial conference organizer (if any).
* Information on transportation and housing for conference participants.
* Likely support from local bodies (e.g. governmental).
* A preliminary budget.
Interspeech conferences may be held in any country, although they generally should not occur in the same continent in two consecutive years. The coming Interspeech events will take place in Antwerp (Belgium, 2007), Brisbane (Australia, 2008), Brighton (UK, 2009) and Makuhari (Japan, 2010).
Guidelines for the preparation of the proposal are available on our website.
Additional information can be provided by Isabel Trancoso.
Those who plan to put in a bid are asked to inform ISCA of their intentions as soon as possible. They should also consider attending Interspeech 2007 in Antwerp to discuss their bids, if possible.
Proposals should be submitted by email to the above address. Candidates fulfilling basic requirements will be asked to submit a detailed proposal by 28 February 2008.

AFCP Thesis Prize (Prix de thèse AFCP)
Association Francophone de la Communication Parlée
Final deadline: 21 June 2007
The Association Francophone de la Communication Parlée (AFCP) awards an annual scientific prize for an outstanding doctoral thesis in the field. The AFCP thereby aims to promote every facet of spoken-communication research, from fundamental to applied work, whether in information science and technology (STIC), humanities and social sciences (SHS), or life sciences (SDV). The goal of the prize is to encourage young researchers while publicizing their work.
The jury is composed of the elected researchers of the AFCP board.
- R. Ridouane won the 2004 prize for "Suite de consonnes en berbère: phonétique et phonologie";
- M. Dohen won in 2005 for "Deixis prosodique multisensorielle: production et perception audiovisuelle de la focalisation contrastive en français".
The prize will be officially awarded at the francophone meeting "Les Journées d'Études sur la Parole" (JEP) 2008 in Avignon. Each recipient will receive 500 euros and will be invited to summarize his or her work in a talk.
Any doctor who defended his or her thesis between 1 October 2005 and 31 December 2006 may apply. Candidates may apply to only one edition of the prize.
Applications must be received by 21 June 2007 (thesis deposit and postal submission).
Results: mid-July 2007.
1/ Deposit your thesis on the AFCP thesis server, which gathers most francophone theses in the field.
2/ Mail a CD to:
Hervé Glotin, Prix AFCP, UMR CNRS LSIS
Univ. Sud Toulon Var, BP 20132
83957 La Garde Cedex 20 - France
containing a single file (yourname.pdf) with, in order:
* a two-page summary of your thesis,
* the list of your publications,
* scans of all the reports (jury and reviewers) from your thesis defense,
* a scanned letter of recommendation for this prize from your thesis advisor,
* your CV (with complete contact details, including email).

SIGs' activities

A list of ISCA Special Interest Groups can be found on our website.



Master in Human Language Technologies and Interfaces at the University of Trento

organized by: University of Trento and Fondazione Bruno Kessler Irst
Call for applications, Academic Year 2007/08
Human language technology gives people the possibility of using speech and/or natural language to access a variety of automated services, such as airline reservation systems or voicemail, to access and communicate information across different languages, and to keep the increasing amount of available information under control by automatically extracting useful content and summarizing it. This master's program aims to provide skills in the basic theories, techniques, and applications of this technology through courses taught by internationally recognized researchers from the university, research centers and supporting industry partners. Students enrolled in the program will gain in-depth knowledge from graduate courses and from substantial practical projects carried out in research and industry labs.
Courses: Speech Processing, Machine Learning for NLP, Human Language, Text Processing, Spoken Dialog Systems, Human-Computer Interaction, Language Resources, Multilingual Technology
Admission requirements: a Master-level degree (min. 4 years) in computer science, electrical engineering, computational linguistics, cognitive science, or another related discipline. English is the official language of the program.
Student Grants
A limited number of fellowships will be available.
Application Deadline
Non-EU students: June 15
EU students: end of July
University of Trento-Department of Information and Communication
Technologies Via Sommarive, 14-38100 Povo (Trento), Italy

Studentships available for 2006/7 at the Department of Computer Science
The University of Sheffield - UK

The Sheffield MSc in Human Language Technology has been carefully tailored to meet the demand for graduates with the highly-specialised multi-disciplinary skills that are required in HLT, both as practitioners in the development of HLT applications and as researchers into the advanced capabilities required for next-generation HLT systems. The course provides a balanced programme of instruction across a range of relevant disciplines including speech technology, natural language processing and dialogue systems.
The programme is taught in a research-led environment. This means that you will study the most advanced theories and techniques in the field, and also have the opportunity to use state-of-the-art software tools. You will also have opportunities to engage in research-level activity through in-depth exploration of chosen topics and through your dissertation.
Graduates from this course are highly valued in industry, commerce and academia. The programme is also an excellent introduction to the substantial research opportunities for doctoral-level study in HLT.
A number of studentships are available, on a competitive basis, to suitably qualified applicants. These awards pay a stipend in addition to the course fees.
See further details of the course
Information on how to apply




HIWIRE database
We would like to draw your attention to the Interspeech 2007 special session "Novel techniques for the NATO non-native Air Traffic Control and HIWIRE cockpit databases"
that we are co-organizing. For this special session we are making the cockpit database available free of charge, along with HTK training and testing scripts. Our goal is to investigate feature extraction, acoustic modelling and adaptation algorithms for the problem of (hands-free) speech recognition in the cockpit. A description of the task, the database and ordering information can be found at the project website. We hope that you will be able to participate in this special session.
Alex Potamianos, TUC
Thibaut Ehrette, Thales Research
Dominique Fohr, LORIA
Petros Maragos, NTUA
Marco Matassoni, ITC-IRST
Jose Segura, UGR

Language Resources Catalogue - Update
ELRA is happy to announce that new Speech Related Resources are now available in its catalogue. Moreover, we are pleased to announce that years 2005 and 2006 from the Text Corpus of "Le Monde" (ELRA-W0015) are now available.
*ELRA-S0235 LC-STAR Hebrew (Israel) phonetic lexicon
*The LC-STAR Hebrew (Israel) phonetic lexicon comprises 109,580 words, including a set of 62,431 common words, a set of 47,149 proper names (including person names, family names, cities, streets, companies and brand names) and a list of 8,677 special application words. The lexicon is provided in XML format and includes phonetic transcriptions in SAMPA. More information
*ELRA-S0236 LC-STAR English-Hebrew (Israel) Bilingual Aligned Phrasal lexicon
*The LC-STAR English-Hebrew (Israel) Bilingual Aligned Phrasal lexicon comprises 10,520 phrases from the tourist domain. It is based on a list of short sentences obtained by translating a US English corpus of 10,449 phrases. The lexicon is provided in XML format. More information
*ELRA-S0237 LC-STAR US English phonetic lexicon
*The LC-STAR US English phonetic lexicon comprises 102,310 words, including a set of 51,119 common words, a set of 51,111 proper names (including person names, family names, cities, streets, companies and brand names) and a list of 6,807 special application words. The lexicon is provided in XML format and includes phonetic transcriptions in SAMPA. More information
*ELRA-W0015 Text corpus of "Le Monde"
*Corpus from "Le Monde" newspaper. Years 1987 to 2002 are available in an ASCII text format. Years 2003 to 2006 are available in .XML format. Each month consists of some 10 MB of data (circa 120 MB per year). More information
*ELRA-S0238 MIST Multi-lingual Interoperability in Speech Technology database
*The MIST Multi-lingual Interoperability in Speech Technology database comprises the recordings of 74 native Dutch speakers (52 males, 22 females) who uttered 10 sentences in Dutch, English, French and German, including 5 sentences per language identical for all speakers and 5 sentences per language per speaker unique. Dutch sentences are orthographically annotated. More information
*ELRA-S0239 N4 (NATO Native and Non Native) database
*The N4 (NATO Native and Non-Native) database comprises speech data recorded in the naval transmission training centers of four countries (Germany, the Netherlands, the United Kingdom, and Canada) during naval communication training sessions in 2000-2002. The material consists of native and non-native speakers using NATO Naval English procedure between ships, and reading the text "The North Wind and the Sun" in both English and the speaker's native language. The audio material was recorded on DAT and downsampled to 16 kHz / 16 bit, and all the audio files have been manually transcribed and annotated with speaker identities using the Transcriber tool.
More information
*ELRA-W0047 Catalan Corpus of News Articles
*The Catalan Corpus of News Articles comprises articles in Catalan from 1 January 1999 to 31 March 2007. The articles are grouped by quarter, with no chronological order within each quarter.
More information
*ELRA-L0075 Bulgarian Linguistic Database
*This database contains 81,647 Bulgarian entries together with a linguistic environment tool (for Windows XP). The data may be used for morphological analysis and synthesis, syntactic agreement checking, and determining phonetic stress.
More information
For more information on the catalogue, please contact Valérie Mapelli
Our on-line catalogue has moved to the following address. Please update your bookmarks.
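Several of the lexica above are described as XML files with phonetic transcriptions in SAMPA, a machine-readable encoding of the IPA in plain ASCII; for example, the English word "speech" is written s p i: tS. As a minimal sketch of how such an entry might be read (note that the element names below are invented for illustration and are not the actual LC-STAR schema):

```python
import xml.etree.ElementTree as ET

# A hypothetical lexicon entry in the spirit of the LC-STAR lexica described
# above. The tags <entry>, <orthography> and <transcription> are illustrative
# only; consult the actual documentation for the real format.
entry_xml = """
<entry>
  <orthography>speech</orthography>
  <transcription alphabet="SAMPA">s p i: tS</transcription>
</entry>
"""

def read_entry(xml_text):
    """Parse one entry, returning (word, list of SAMPA phone symbols)."""
    root = ET.fromstring(xml_text)
    word = root.findtext("orthography")
    phones = root.findtext("transcription").split()
    return word, phones

word, phones = read_entry(entry_xml)
print(word, phones)  # speech ['s', 'p', 'i:', 'tS']
```

Splitting the transcription on whitespace keeps multi-character SAMPA symbols such as "i:" (long vowel) and "tS" (affricate) intact as single phones.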


Speech Enhancement: Theory and Practice
Author: Philipos C. Loizou, University of Texas, Dallas, USA
Publisher: CRC Press

Speech and Language Engineering
Editor: Martin Rajman
Publisher: EPFL Press, distributed by CRC Press
Year: 2007

Human Communication Disorders/ Speech therapy
This interesting series is listed on the Wiley website.

Incursões em torno do ritmo da fala
Author: Plinio A. Barbosa
Publisher: Pontes Editores (city: Campinas)
Year: 2006 (released 11/24/2006)
(In Portuguese, abstract attached.) Website

Speech Quality of VoIP: Assessment and Prediction
Author: Alexander Raake
Publisher: John Wiley & Sons, UK-Chichester, September 2006

Self-Organization in the Evolution of Speech, Studies in the Evolution of Language
Author: Pierre-Yves Oudeyer
Publisher: Oxford University Press

Speech Recognition Over Digital Channels
Authors: Antonio M. Peinado and Jose C. Segura
Publisher: Wiley, July 2006

Multilingual Speech Processing
Editors: Tanja Schultz and Katrin Kirchhoff
Publisher: Elsevier Academic Press, April 2006

Reconnaissance automatique de la parole: Du signal a l'interpretation
Authors: Jean-Paul Haton, Christophe Cerisara, Dominique Fohr, Yves Laprie, Kamel Smaili
392 Pages
Publisher: Dunod



We invite all laboratories and industrial companies that have job offers to send them to the ISCApad editor: they will appear in the newsletter and on our website for free. (Also have a look at the Jobs section of our website.)

Technical Project Manager, NLP / Text Mining - based in Nord-Pas-de-Calais, France

Company:
You will join the digitization division of the European leader in information processing (1,200 employees across France, Europe, Asia and the USA), internationally recognized for its expertise and know-how in serving its clients. Its customers are mainly institutions (research centers, major libraries, patent offices, ...) and the major international players in publishing. The Content Mining activity notably serves clients in the pharmaceutical and bioinformatics sectors.
Position:
In coordination with the sales and production departments, you are responsible for the successful delivery of projects and for deploying the solutions best suited to clients' needs and to the projects' economic constraints. To that end, you acquire a thorough knowledge of the company's resources and solutions and of their applications. Your missions revolve around three activities:
1/ Pre-sales consultant (25% of the position)
- Accompany the sales team when visiting clients and prospects, in order to inform them, understand and identify their needs, and provide them with a complete technical response;
- Define the solution(s) that cover all the needs, in line with the technical means available; where necessary, reformulate the client's request according to the company's know-how;
- In coordination with the sales representative, present the technical and financial proposal to the client and agree on the chosen solution;
- Identify unmet or future needs among clients;
- Carry out both commercial and technological watch.
2/ Technical project management (75% of the position):
- Provide technical and financial solutions to clients' needs in collaboration with the company's technical departments (production, R&D, methods, ...);
- Ensure the deployment of the solution, relying on all the company's players, with client satisfaction as the objective.
3/ Administration:
- Produce activity reports on all the projects in your charge;
- Act as the interface between the sales and production departments and the client.
Profile:
- Higher education (4-5 years of university study) in computer science.
- 3 to 5 years' experience as a technical project manager in the following areas:
text mining, pattern recognition, language processing, linguistics, NLP.
Knowledge of document management (GED) or knowledge management would be a plus.
- Professionally fluent in English (spoken and written).
How to apply:
Enter your profile on the website of the recruitment agency e-Match consulting, quoting reference 73.
Olivier Grootenboer +
Recruitment & headhunting agency e-Match consulting

PostDoc in Spoken Language and Dialog Research (LUNA project) at AMI2 Lab

LUNA Project: The LUNA (SPOKEN LANGUAGE UNDERSTANDING IN MULTILINGUAL COMMUNICATION SYSTEMS) project is funded by the European Commission FP6 program. The project addresses the problem of real-time understanding of spontaneous speech in the context of advanced conversational systems for problem-solving tasks and/or cooperative work interfaces. The LUNA consortium and more information on the project can be found at the website.
Job description: The postdoc will investigate machine learning techniques for adaptive spoken dialog prototyping. S/he will be responsible for designing adaptive algorithms for language understanding and dialog modelling, will work with the AMI2 team, and will lead the spoken dialog prototyping for such an adaptive system.
Requirements: The ideal candidate has a Ph.D. in Computer Science, two years of postdoctoral experience, and a background in human-machine dialog research for spoken and/or multimodal conversational systems. He/she will have strong programming skills and be team-oriented.
AMI2 Lab: The Adaptive Multimodal Information and Interface (AMI2) Lab at the University of Trento pursues research excellence in next-generation interfaces for human-machine and human-human communication. The AMI2 lab has a state-of-the-art speech and language technology infrastructure and collaborations with premier international research centers and industry research labs.
How to apply: The University of Trento is an equal opportunity employer. Interested applicants should submit their CV along with their statement of research interest and three reference letters. The starting date for the position is as early as September 1st 2007. The duration of the contract will be 24 months, salary will depend on experience and qualifications. Please send your application to:
Prof. Ing. Giuseppe Riccardi
Director of AMI2 Lab
About the University of Trento and the Information and Communication Technology Department: The University of Trento is consistently ranked among the premier Italian graduate institutions. The DIT Department has a strong focus on interdisciplinarity, with professors of international background drawn from different faculties of the University (Physical Science, Electrical Engineering, Economics, Social Science, Cognitive Science, Computer Science). English is the official language.

Linguistics Expert, based in Nancy, FRANCE

typically working in a dedicated NLP laboratory.
Desired profile:
- Background: engineering degree or DEA/DESS in computer science / artificial intelligence
- 5 to 10 years' experience minimum
- Salary up to 45-48K€ depending on profile
- An aptitude for, or experience in, R&D
- Recognized skills in text mining (part-of-speech tagging, pattern recognition, semantic analysis, machine translation, machine learning and linguistics)
- Personal profile: MANAGER, team leader.
How to apply:
Enter your profile on the website of the recruitment agency e-Match consulting
Olivier Grootenboer +
Recruitment & headhunting agency e-Match consulting

Speech Software Engineer, CereProc, Edinburgh, UK

Fixed term for 1 year, to be reviewed after 9 months. Salary £25,000 to £30,000 per annum, dependent upon experience.
CereProc is a Scottish company, based in Edinburgh, the home of advanced speech synthesis research, with a sales office in London. The CereProc team have extensive experience across the entire speech technology domain.
We are looking for a talented and skilled individual to undertake research on a SMART Feasibility Study, awarded to CereProc by the Scottish Executive, investigating articulatory approaches to text-to-speech (TTS) and automatic speech recognition (ASR) in connection with an easy-to-use commercial audio WAV file creation system.
The ideal candidate will have a degree in Electronic Engineering or related discipline, a PhD in Speech Technology or a related topic, an understanding of the principles of digital signal processing with regard to ASR and TTS, an ability to write computer programs (e.g., in C/C++/Python), and will be self-motivated with strong organisational skills.
It would also be desirable for the candidate to have experience of working with a university spin-out company, native-level Japanese, familiarity with hidden Markov models, state-space methods or stochastic processes, and experience in graphical user interface design.
Salary range £25,000 to £30,000 dependent on experience, including pension contributions; 37.5-hour week with a degree of flexibility; up to 20 days annual leave plus statutory holidays per annum.
To apply for this position please email your CV (including 2 referees) and current salary details.

Senior Speech Engineer in UK

Presented by
This description is NOT exhaustive and we would welcome applications from any speech developers who feel they can add value to this company. Please forward your CV with salary expectations as we may have an alternative position for you.
Job Role:
To develop and deliver products, meeting customer needs and expectations.
To undertake areas of research and development on speech recognition technology within the constraints/direction of the Client Business development plan.
Duties and Responsibilities:
*To assist in the ongoing development of speech technologies for commercial deployment.
*To further improve existing client products
*Organise and manage own work to meet goals and objectives
*Use personal judgement and initiative to develop effective and constructive solutions to challenges and obstacles
*Provide technical support to customers of the company on projects and activities when required.
*Document and maintain such records as required by the company operating procedures.
*Maintain an understanding of Speech recognition technology in general and the client products and technology specifically
*Communicate and present to customers and others, information about the client product range and its technology.
*Providing technical expertise to other members of the team and staff at the client
*R&D in various aspects of automatic speech recognition and related speech technologies (e.g. speech data mining, Keyword/phrase spotting, multilingual speech recognition, LVCSR).
*Adhere to local and externally relevant health and safety laws and policies and bring any matters to the attention of local management without delay
*Take responsibility for self-development and continuing personal development
Person specification
Good degree in a relevant subject
Further degree in the speech recognition field, or 2 years' experience
Experience in a product based environment
Special Skills:
Product orientated
Experienced in C++
Ability to work in an interdisciplinary team (Speech Scientists and software engineers)
Special Aptitudes:
Ability to put theory into practice, applying knowledge to develop products
Can work and communicate effectively with colleagues outside their own area of expertise.
Quick to develop new skills where required
Analytical, with good problem solving ability.
High level of attention to detail
Experience of multi-threaded and computationally intensive algorithms
Team Player but also able to work independently
Self-starter
Motivated & Enthusiastic
Results orientated
Location: UK
Salary : Dependent On experience
We are a specialist UK recruitment company seeking speech recognition developers to join one of our many UK speech companies.

R&D Engineer

Presented by
Job Description
As a member of a small team of research and development engineers you will work both independently and in teams developing algorithms and models for large vocabulary speech recognition. Good C and Linux shell scripting skills are essential, as is experience of either digital signal processing or statistical modelling. The ability to administer a network of Linux PCs is highly desirable, as are language/linguistic skills. Knowledge of Python and Perl would be advantageous.
In addition to the technical skills mentioned above, the successful candidate will have a proactive personality, excellent oral and written communication skills, and a desire to learn all aspects of speech recognition.
Key Skills
1st/2.1 degree in a numerate subject.
2+ years of C and Linux shell programming.
Excellent communication skills.
Experience of digital signal processing or statistical modelling (this could be an undergraduate/masters project).
Additional Skills:
Linux system administration.
Experience with HTK.
Python and/or Perl programming.
Language/linguistic knowledge.
Salary : Commensurate with experience + discretionary share option scheme.

R&D Language Modeller in UK

An excellent opportunity: we are looking for a self-motivated speech engineer who enjoys working in the dynamic and flexible environment of a successful company.
*3+ years of relevant industrial/commercial experience, or very relevant academic experience
*In-depth knowledge of probabilistic language modelling, including estimation methods, smoothing, pruning, efficient representation, interpolation and trigger-based models
*Experience in development and/or deployment of speech recognition engines, with emphasis on efficient training techniques and large vocabulary systems
*Understanding of the technology and techniques used in ASR engines
*A good degree in an appropriate subject, PhD preferred
*Good software engineering skills, knowledge of scripting languages (e.g. shell scripting, Perl) and experience working under both Linux and Windows
*Experience working with HTK, Sphinx and/or Julius
*Experience with probabilistic lattice parsing
*Have worked on a live application or service that includes speech recognition
Salary : 40K BP
We would be happy to represent any speech candidates who believe they can add value to our global clients.

Several openings at Nuance
Nuance is the leading provider of speech and imaging solutions for businesses and consumers around the world. Every day, millions of users and thousands of businesses experience Nuance by calling directory assistance, requesting account information, dictating patient records, telling a navigation system their destination, or digitally reproducing documents that can be shared and searched. With more than 2000 employees worldwide, we are committed to making the user experience more enjoyable by transforming the way people interact with information and how they create, share and use documents. Making each of those experiences productive and compelling is what Nuance is about.
Senior Engineer
As part of the team, the candidate will be creating speech technologies for embedded applications varying from simple command and control up to natural speech dialogue on mobile and automotive platforms.
The candidate will join an experienced international team with expertise in creating, optimizing and testing portable ASR software for embedded devices. The position can be located at our Aachen office (Germany), the Nuance International Headquarters in Merelbeke, near Ghent (Belgium), or our Tel-Aviv office (Israel).
· Design, implementation, evaluation, optimization and testing of new ASR algorithms and tools;
· Involvement in the creation of applications, demonstrators and evaluators of the developed technology;
· Occasionally provide support directly to customers and to our Embedded Professional Services Team.
Required Skills
· Background in ASR technologies
· Knowledge of C, C++, Perl or Python, and Matlab; mobile/automotive experience
· Familiarity with embedded programming techniques
· Excellent English communication skills, written and spoken
· Positive "can-do" attitude, well organized
· Customer service attitude, excellent presentation and communication skills
· Ability to manage projects and to deal with critical situations
Ideal Skills
· 2-3 years work experience in a relevant area
· MS or BS in Computer Science, Computer Engineering or a related technical degree
Please send your CV to Deanna Roe.
We are looking forward to receiving your application!
As part of the team, the candidate will be creating speech technologies for embedded applications varying from simple command and control up to natural speech dialogue on mobile and automotive platforms.
The candidate will join an experienced international team with expertise in creating, optimizing and testing portable ASR software for embedded devices. The position can be located at our Aachen office (Germany), the Nuance International Headquarters in Merelbeke, near Ghent (Belgium), or our Tel-Aviv office (Israel).
· Design, implementation, evaluation, optimization and testing of new algorithms and tools, with a focus on search, grammar processing, dictation and/or natural language understanding.
· Occasional involvement in the creation of demonstrators and evaluators of the developed technology.
· Occasionally provide support directly to customers and to our Embedded Professional Services Team
Required Skills
· Background in ASR research is required, preferably with ASR search techniques, CFG parsing, FSM processing, dictation or natural language understanding technology
· Knowledge of C, C++, Perl or Python, and Matlab
· Excellent English communication skills, written and spoken
· Positive "can-do" attitude, well organized
· Customer service attitude, excellent presentation and communication skills
· Ability to deal with critical situations
Ideal Skills
· 2-3 years work experience in a relevant area, or a Ph.D. in speech processing
· MS or BS in Computer Science, Computer Engineering or a related technical degree
Please send your CV in English to: Deanna Roe
Key Responsibilities
In this role, you will be responsible for porting Nuance's state-of-the-art speech recognition and text-to-speech products to different embedded platforms typically found in the automotive, mobile or game station markets. As a development engineer in the Professional Services team, you will be in close contact with the research lab developing the technologies, and will contribute to speech-enabling mobile phones (dialing by voice, SMS reading), automotive hands-free car kits (dialing by voice, acoustic echo cancellation), automotive and mobile navigation systems (route guidance or traffic information through TTS), game stations, and similar devices where speech brings clear added value. As an experienced engineer, you will rapidly work together with our customers' integration teams to help them include our technologies in successful products efficiently, managing complete projects and customer interaction from A to Z. While most of the job will take place in our offices in Merelbeke, a few short trips inside or outside Europe may be necessary.
Your Profile
- Requirements:
o Bachelors or Graduate University degree in Electrical Engineering, Computer Engineering, Computer Science or equivalent.
o Professional experience in the Embedded market
o Fluent in English, both written and spoken
o Strong customer communication skills
o Strong Ansi C programming skills
o Ability to travel for short trips
o Strong team player, capable of working independently and/or managing other engineers when needed
o Self-learner, with a sense of initiative and the perseverance to deliver high-quality work
- Nice to have
o Experience with embedded hardware platforms, operating systems, and software development
o Knowledgeable about audio streaming technologies, real-time protocols, codecs, signal processing
o Knowledgeable about WinCe, Linux
o Multi-lingual
Please send your CV in English to: Deanna Roe
Located in Aachen (Germany)
As a Mobile Application Developer you will be part of a team creating ground-breaking software for mobile devices (smartphones, personal navigation devices, etc.). You will design, implement and maintain applications that make use of Nuance's state-of-the-art speech recognition and speech synthesis technologies to deliver new experiences to Nuance customers: hands-free dialing and control of the phone, voice selection of songs on a media player, voice entry of a destination in a navigation device, or dictation of SMS. You will follow the latest industry developments for mobile platforms and work with new mobile devices before they hit the market. You will work in a multicultural international team located in Belgium, Germany, Israel and the USA.
· Bachelor or Master Degree in Computer Science, Computer Engineering, Electrical Engineering or equivalent.
· Strong C/C++ programming skills
· Hands-on experience with object-oriented design & object-oriented programming
· Good software/system troubleshooting skills
· Familiarity with one of the following development platforms: Symbian, Windows Mobile, BREW, J2ME.
· Fluent in English, both written and spoken.
· Ability to work in a multinational team
· Self- & quick learner
· Ability to work under time pressure and take initiative
Strong candidates will also have the following:
· Understanding of speech recognition software.
· Knowledge of Java.
· Practical experience with Nokia’s S60 development platform.
· Familiarity with cross-platform software design principles
· Exposure to Agile software development methodologies
· Knowledge of other European or Asian languages
Please send your CV in English to Deanna Roe
If you have any questions, please contact her on +44 7717506622
Nuance Communications, Inc., a worldwide leader in speech technology, is seeking full-time temporary language specialists to develop our newest TTS technology in various languages.
Positions are available for the following languages:
· Finnish
· Dutch ( The Netherlands)
· Australian English
· Mandarin (People’s Republic of China)
· Mandarin (Taiwan)
· Cantonese
· Japanese
· Korean
· Danish
· Norwegian
· Egyptian Arabic
· Greek
· Hindi
· Portuguese (Iberian)
· Indian English
· Russian
· Swedish
· Turkish
· Spanish (Catalan)
Responsibilities:
· Helping in the development of the linguistic processing components
· Design of text corpora for recording and testing
· Processing speech data
· Creating and tuning the TTS voice
· Testing and productization of the TTS voice
Requirements:
· Native or near-native speaker of one of the above-mentioned languages
· Speak English fluently (working language)
· Have some experience working in speech/NLP/linguistics either in academia or in industry
· Have some computational ability – no programming is required, but you should be comfortable working with MS Windows
· If you know some AWK or Perl, it would be an advantage
· Willing to work in Merelbeke (near Ghent in Belgium) for the duration of the contract
We offer:
-A 3-5 month contract
-Training in all aspects of the job; you will be working with a talented and committed team
-An exciting job with an innovative, high-performing international company
-The opportunity to work in an enthusiastic, supportive team
-Competitive remuneration package
-Relocation, travel and accommodation assistance
Please send your CV in English to Deanna Roe
We are seeking a brilliant algorithms developer for research and implementation of core technology for speech recognition and synthesis on cellular handsets.
-Research and implementation of technology required for different applications of mobile devices, such as voice-enabled dialing, application launching, etc.
-Develop modeling methods and algorithms for enhanced quality and robustness considering limited computational resources
-Specific models and algorithms for different languages
-Define and implement schemes and tools required for turning technologies into products, such as creating project-specific models and tuning parameters
-Deliver models and algorithms to the product group – guide and support the absorption of new technologies.
-Development of methodology and tools for research and model creation
-M.Sc. in computer science, software engineering, physics, or mathematics; B.Sc. with distinction will also be considered.
  Your profile:
- Completed the following courses with a grade above 85: at least one C/C++ course with a final project; probability theory and statistics; algorithms
- Excellent research and implementation skills are required, with at least 1 year of proven experience in the following areas: developing C/C++ code; development, implementation and analysis of algorithms and mathematical models
- Knowledge of Perl, Python, Windows/Linux scripts, SQL
- Experience in the following areas is highly desirable: speech processing, machine learning, statistical modeling, neural networks, bioinformatics.
- Knowledge of one or more of the following is an advantage: digital signal processing, image processing, computational linguistics
- Advanced written and spoken English
Locations: Tel Aviv, Israel / Aachen, Germany / Merelbeke, Belgium
Please send your CV in English to Deanna Roe
The experience speaks for itself


Two positions for postdoctoral associates are available within the Sound Technology Group, Digital Media Division, at the Department of Science and Technology (ITN), Linkoping University, Campus Norrkoping, Sweden.
Our research is focused on physical and perceptual models of sound sources, sound source separation and adapted signal representations.
Candidates must have a strong background in research and a completed Ph.D.
Programming skills (e.g. Matlab, C/C++ or Java) are very desirable, as well as expertise in conducting acoustic/auditory experiments.
We are especially interested in candidates with research background in the following areas:
. Auditory Scene Analysis
. Classification of Sound Sources
. Sound Processing
. Spatial Audio and Hearing
. Time-Frequency and Wavelet Representations
. Acoustics (theoretical and experimental)
but those with related research interests are also welcome to apply.
Preferred starting date is September 2007.
Inquiries and CVs must be addressed to
Prof. G. Evangelista
Digital Media Division
Department of Science and Technology (ITN)
Linkoping Institute of Technology (LiTH) at Campus Norrkoping
Room: K5724 (Kåkenhus, Bredgatan 33)
SE-60174 Norrkoping, Sweden
Phone: +46 11 36 31 01
Fax: +46 11 36 32 70

Cambridge University Engineering Department- Machine Intelligence Laboratory - Speech Group

Research Associate in Spoken Dialogue Systems
Applications are invited for a Research Associate position in the Machine Intelligence Laboratory to join a group led by Professor Steve Young working in the area of Spoken Dialogue Systems. The technical focus is on the use of reinforcement learning within man-machine interfaces to enable automatic learning of dialogue behaviour and on-line adaptation. The work will involve statistical modelling, algorithm design, system development and user evaluation. The successful candidate will have a good first degree and preferably a higher degree in a relevant area. Good programming skills in C/C++ are essential.
The appointment will be for two years initially, starting as soon as possible. Salary is in the range £24,402 to £31,840 p.a. Further details and an application form can be found at our website. Informal enquiries should be addressed by email to Professor Young. Applicants should email their completed application form, together with their CV and a covering letter describing their research experience, interests and goals, to Rachel Fogg.
The University is committed to equality of opportunity.

Opening on Speech recognition at Telefonica, Barcelona (Spain)

The Speech Technology Group at Telefonica Investigacion y Desarrollo (TID) is looking for a highly qualified candidate for an engineering position in speech recognition and related technologies.
The selected candidate will join a multidisciplinary team of young, highly motivated people in an objective-driven, friendly atmosphere in a central area of Barcelona (Spain).
Minimum requirements are:
Degree in Computer Science/Electrical Engineering/Computational Linguistics or similar, with 2+ years of experience (Ph.D. preferred) in speech technology.
Good knowledge of speech recognition and speech synthesis.
Proven programming expertise in C++ and Java
Good level of English (required) and some knowledge of Spanish (preferred)
High motivation and teamwork spirit
Salary depending on the experience and value of the applicant
Starting date as soon as possible
The Speech Technology Group is a well-established group within TID, with more than 15 years of experience in research and development of technology for internal use by the Telefonica group as well as for outside organizations. It is also a very active partner in many national and European projects. TID is the research and development company of the Telefonica group, currently one of the biggest telecom companies. It is the biggest private research center in Spain in number of employees and available resources.
Please send your resume and contact information to
Sonia Tejero
Tlf: +34 93 365 3024

Sound to Sense: 18 Fellowships in speech research

Sound to Sense (S2S) is a Marie Curie Research Training Network involving collaborative speech research amongst 13 universities in 10 countries. 18 Training Fellowships are available, of which 12 are predoctoral and 6 postdoctoral (or equivalent experience). Most but not all are planned to start in September or October 2007.
A research training network’s primary aim is to support and train young researchers in professional and inter-disciplinary scientific skills that will equip them for careers in research. S2S’s scientific focus is on cross-disciplinary methods for modelling speech recognition by humans and machines. Distinctive aspects of our approach include emphasis on richly-informed phonetic models that emphasize communicative function of utterances, multilingual databases, multiple time domain analyses, hybrid episodic-abstract computational models, and applications and testing in adverse listening conditions and foreign language learning.
Eleven projects are planned. Each can be flexibly tailored to match the Fellows’ backgrounds, research interests, and professional development needs, and will fall into one of four broad themes.
1: Multilinguistic and comparative research on Fine Phonetic Detail (4 projects)
2: Imperfect knowledge/imperfect signal (2 projects)
3: Beyond short units of speech (2 projects)
4: Exemplars and abstraction (3 projects)
The institutions and senior scientists involved with S2S are as follows:
* University of Cambridge, UK (S. Hawkins (Coordinator), M. Ford, M. Miozzo, D. Norris, B. Post)
* Katholieke Universiteit Leuven, Belgium (D. Van Compernolle, H. Van Hamme, K. Demuynck)
* Charles University, Prague, Czech Republic (Z. Palková, T. Duběda, J. Volín)
* University of Provence, Aix-en-Provence, France (N. Nguyen, M. d’Imperio, C. Meunier)
* University Federico II, Naples, Italy (F. Cutugno, A. Corazza)
* Radboud University, Nijmegen, The Netherlands (L. ten Bosch, H. Baayen, M. Ernestus, C. Gussenhoven, H. Strik)
* Norwegian University of Science and Technology (NTNU), Trondheim, Norway (W. van Dommelen, M. Johnsen, J. Koreman, T. Svendsen)
* Technical University of Cluj-Napoca, Romania (M. Giurgiu)
* University of the Basque Country, Vitoria, Spain (M-L. Garcia Lecumberri, J. Cenoz)
* University of Geneva, Switzerland (U. Frauenfelder)
* University of Bristol, UK (S. Mattys, J. Bowers)
* University of Sheffield, UK (M. Cooke, J. Barker, G. Brown, S. Howard, R. Moore, B. Wells)
* University of York, UK. (R. Ogden, G. Gaskell, J. Local)
Successful applicants will normally have a degree in psychology, computer science, engineering, linguistics, phonetics, or related disciplines, and want to acquire expertise in one or more of the others.
Positions are open until filled, although applications before 1 May 2007 are recommended for starting in October 2007.
Further details are available from the web about:
+ the research network (92 kB) and how to apply
+ the research projects (328 kB)

Research scientist- Speech Technology- Princeton, NJ, USA

Company Profile: Headquartered in Princeton, NJ, ETS (Educational Testing Service) is the world's premier educational measurement institution and a leader in educational research. As an innovator in developing achievement and occupational tests for clients in business, education, and government, we are determined to advance educational excellence for the communities we serve.
Job Description: ETS Research & Development has a Research Scientist opening in the Automated Scoring and Natural Language Processing Group. This group conducts research focusing on the development of new capabilities in automated scoring and NLP-based analysis and evaluation systems, which are used to improve assessments, learning tools and test development practices for diverse groups of users that include K-12 students, college students, English Language Learners and lifelong learners. The Research Scientist position involves applying scientific, technical and software engineering skills to designing and conducting research studies and developing capabilities in support of educational products and services. This is a full-time position.
Required qualifications
· A Ph.D. in Natural Language Processing, Computational Linguistics, Computer Science, or Electrical Engineering with a focus on speech technology, particularly speech recognition. Knowledge of linguistics is a plus.
· Evidence of at least three years of independent substantive research experience and/or experience in developing and deploying speech technology capabilities, preferably in educational environments.
· Demonstrable contributions to new and/or modified theories of speech processing and their implementation in automated systems.
· Practical expertise with speech recognition systems and fluency in at least one major programming language (e.g., Java, Perl, C/C++, Python).
How to apply
Please send a copy of your resume, along with a cover letter stating salary requirements and job #2965, by e-mail.
ETS offers competitive salaries, outstanding benefits, a stimulating work environment, and attractive growth potential. ETS is an Equal Opportunity, Affirmative Action Employer.
Web site

Software Engineer Position at Be Vocal, Mountain View, CA,USA

We are currently looking for a Software Engineer with previous exposure to speech technology to work in our Speech and Natural Language Technology group. This group's mission is to be the center of excellence for speech and natural language technologies within BeVocal. Responsibilities include assisting in the development of internal tools and processes for building natural-language-based speech applications, as well as ongoing infrastructure/product improvements. The successful candidate must be able to take direction from senior members of the team and will also be given the opportunity to make original contributions to new and existing technologies during the application development process. As such, you must be highly motivated and able to work well independently as well as part of a team.
* Develop and maintain speech recognition/NLP tools and supporting infrastructure
* Develop and enhance component speech grammars
* Work on innovative solutions to improve overall Speech/NL performance across BeVocal’s deployments.
* BS in Computer Science, Electrical Engineering or Linguistics; an MS is preferred.
* 2-5 years of software development experience in Perl, Java, C/C++. A willingness and ability to pick up additional software languages as needed is essential.
* Exposure to or experience with speech recognition/pattern recognition, either from an academic environment or from directly related work experience.
* Experience working as part of a world-class speech and language group is highly desirable.
* Experience building natural language applications is preferred.
* Experience building LVCSR speech recognition systems is a plus.
For immediate consideration, please send your resume by email and include "Software Engineer, Speech" in the subject line of your email. Principals only please (no 3rd parties or agencies). Contact for details
BeVocal's policy is to comply with all applicable laws and to provide equal employment opportunity for all applicants and employees without regard to non-job-related factors such as race, color, religion, sex, national origin, ancestry, age, disability, veteran status, marital status or sexual orientation. This policy applies to all areas of employment, including recruitment, hiring, training, promotion, compensation, benefits, transfer, and social and recreational programs.

Positions at Saybot in Shanghai,China

About the company
Saybot is a leading speech technology platform provider for people interested in learning and improving their spoken English. Through Saybot’s innovative speech-based platform, English teachers, publishers, and schools can create and sell spoken English curricula to 180 million English learners in China. By focusing on spoken English learning and speech technology innovations, Saybot and its partners, such as the biggest English textbook publisher in China, are making English learning more efficient, effective, and fun.
The company was originally backed by technology visionaries (Nicholas Negroponte and Leonard Kleinrock) and is well funded by first-tier VCs in Asia.

Positions: Speech Scientists.
Since 2005, we have been building software that features state-of-the-art speech technologies and innovative interactive lessons to help users practice speaking English. We are currently looking for talented speech scientists to help strengthen our R&D team and develop our next-generation products. Successful candidates will have proven excellence and a good work ethic in an academic or industry context, and demonstrated creativity in building speech systems with revolutionary designs. We seek motivated and enthusiastic colleagues at both senior and junior levels.
This position is based in Shanghai, China
* MS/PhD degree in speech technology (or related).
* Expertise in at least one of the following areas and basic knowledge of the others:
o acoustic model training and adaptation,
o natural language understanding and dialogue systems,
o prosody analysis,
o recognition on embedded platforms.
* Excellent programming skills in both object-oriented languages (C++, C# or Java) and scripting (Perl or Python).
* Excellent communication skills in written and oral English.
Contact: Sylvain Chevalier

2 Positions in Research and Development in "Audio description and indexing" at IRCAM-Paris

The goal of the Sample Orchestrator project is to develop and test new applications for managing and manipulating sound samples based on their audio content. On the one hand, large databases of sound samples are commercially available on various media (CD, DVD, online) but are currently limited in their applications (sampling synthesizers). On the other hand, recent scientific and technological developments in audio indexing and database management allow the development of new musical functions: database management based on audio content, audio processing driven by audio content, and the development of orchestration tools.
Two positions are available from April 15th 2007 within the "Equipe Analyse/Synthese" of Ircam, each for a total duration of 12 months (with the possibility of extending the contracts). The main tasks for the research and development positions are:
- Research and development of new audio features and algorithms for the description of instrumental, percussive and FX sounds.
- Research and development of new audio features and algorithms for the morphological description of sounds
- Research and development of new audio features and algorithms for sounds containing "loops"
- Research and development of algorithms for automatic audio indexing
- Research and development of algorithms for fast search by similarity in large databases
- Participation in the definition of the specification
- Participation in user evaluation and feedback
- Integration into the final application
- High skills in Audio indexing and signal processing
- High skills in Matlab programming
- High productivity, methodical work, excellent programming style.
- Good knowledge of UNIX, Mac and Windows environments
Salary: according to background and experience.
- Skills in Audio indexing and signal processing
- High skills in C/C++ programming
- High productivity, methodical work, excellent programming style.
- Good knowledge of UNIX, Mac and Windows environments
Salary: according to background and experience.
In order to start immediately, the candidate should preferably have EEC citizenship or already hold valid EEC working papers.
The positions are available in the "Analysis/Synthesis" team in the R&D department from April 15th 2007 for (each) a duration of 12 months (possibility of extending the contracts).
Please send your resume with qualifications and information addressing the above issues, preferably by email to Xavier Rodet, Analyse/Synthese team manager,
or by fax at: (33 1) 44 78 15 40, care of Xavier.Rodet
or by surface mail to: Xavier Rodet, IRCAM, 1 Place Stravinsky, 75004 Paris.
IRCAM is a leading non-profit organization dedicated to musical production, R&D and education in acoustics and music, located in the center of Paris (France), next to the Pompidou Center. It hosts composers, researchers and students from many countries cooperating in contemporary music production, scientific and applied research. The main topics addressed in its R&D department are acoustics, psychoacoustics, audio synthesis and processing, computer aided composition, user interfaces, real time systems. Detailed activities of IRCAM and its groups are presented on our WWW server.


The goal of the MusicDiscover project is to give access to the content of musical audio recordings (as is already the case, for example, for texts), i.e. to a structured description, as complete as possible, of the recordings: melody, genre/style, tempo/rhythm, instrumentation, musical structure, harmony, etc. The principal objective is thus to develop and evaluate content-oriented means, including techniques and tools for analysis, indexing, representation and search for information. These means will make it possible to build and use such a structured description. This project of the ACI "Masses of Data" has been carried out in collaboration between Ircam (Paris), Get-Telecom (Paris) and LIRIS (Lyon) since October 2004. The principal lines of research are:
- Rhythmic analysis and detection of ruptures
- Recognition of musical instruments and indexing
- Source Separation
- Structured Description
- Search for music by similarity
- Recognition of musical titles
- Classification of musical titles by genre and emotion.
The available position relates to the construction and the use of the Structured Description in collaboration with the other lines of research.
A position is available from December 1st 2006 within the "Equipe Analyse/Synthese" of Ircam for a total duration of 9 months. The work consists of:
- Participation in the design of a Structured Description
- Software development for construction and use of Structured Descriptions
- Participation in the definition and development of the graphic interface
- Participation in the evaluations
- Experience of research in Audio Indexing and signal processing
- Experience in Flash, C and C++ and Matlab programming.
- High productivity, methodical work, excellent programming style.
- Good knowledge of UNIX and Windows environments.
- The position is available in the "Analysis/Synthesis" team in the R&D department from November 1st 2006 for a duration of 9 months.
- In order to start immediately, the candidate should preferably have EEC citizenship or already own valid EEC working papers.
- Salary: according to background and experience.
- Please send your resume with qualifications and information addressing the above issues, preferably by email to Xavier Rodet, Analyse/Synthese team manager,
or by fax at: (33 1) 44 78 15 40, care of Xavier.Rodet
or by surface mail to: Xavier Rodet, IRCAM, 1 Place Stravinsky, 75004 Paris.



Papers accepted for FUTURE PUBLICATION in Speech Communication

Full text available to Speech Communication subscribers and subscribing institutions. Free access for all to the titles and abstracts of all volumes, including forthcoming papers, by clicking on "Articles in press" and then "Selected papers".



Publication policy: Hereunder you will find very short announcements of future events. The full calls for participation can be accessed on the conference websites.
See also our Web pages on conferences and workshops.


August 27-31,2007,Antwerp, Belgium
Chairs: Dirk van Compernolle (K.U.Leuven) and Lou Boves (Radboud University Nijmegen)
On-line Registration, Housing and Practical Information is now available at the conference website

ISCA, together with the Interspeech 2007 organizing committee, would like to invite you to participate in the upcoming conference.
Interspeech 2007 is organized by the Belgian and Dutch speech communities. It is the 10th biennial Eurospeech conference and the 8th in the annual series of Interspeech events organized by the International Speech Communication Association (ISCA), after Beijing, Aalborg, Denver, Geneva, Jeju, Lisbon and Pittsburgh.
The conference will be held August 27-31 in Antwerp, Belgium.
The main conference (August 28-31) will take place in the Flanders Congress and Concert Centre (FCCC) and the adjacent Astrid Park Plaza Hotel. The FCCC is one of the historic conference centres of Europe; it connects to the Antwerp zoo and is located right in the city centre. Antwerp is a lively midsize town with a well-preserved medieval centre, keeping the conference centre, most hotels and the social event venues within comfortable walking distance. Antwerp was the home town of the famous painter Rubens and is today well known as an international centre of diamonds. It also hosts one of the largest harbours in the world.
The Technical Program will comprise 3 parallel oral sessions, 3 poster sessions and 1 special session, totaling 745 papers selected from 1268 submitted full papers. For the first time in INTERSPEECH history, a standard of 3 reviews per paper was used to evaluate the submissions.
Keynote lectures:
- Tuesday 28th: ISCA Medalist: Victor Zue (MIT), "On Organic Interfaces"
- Wednesday 29th: Sophie Scott (Univ. College London) "How the Brain Decodes Speech - Some Perspectives from Functional Imaging"
- Thursday 30th: Alex Waibel (CMU and Univ. of Karlsruhe) "Computer-Supported Human-Human Multilingual Communication"
- Friday 31st: TBA
Tutorials for Interspeech 2007 are organized by internationally recognized experts in their fields and will be held on Monday, August 27, on the city-centre campus of the University of Antwerp.
Morning tutorials:
- Voice Quality in Vocal Communication presenter: Christophe d'Alessandro
- The Modulation Spectrum and Its Application to Speech Science and Technology presenters: Les Atlas, Steven Greenberg and Hynek Hermansky
- Spoken Language Processing by Mind and Machine presenters: Roger K. Moore and Anne Cutler
- Processing Morphologically-Rich Languages presenters: Katrin Kirchhoff and Ruhi Sarikaya
Afternoon tutorials:
- Voice Transformation, presenter: Yannis Stylianou
- A Mathematical Theory of Speech Signals - Beyond the Linear Model presenters: Gernot Kubin and Erhard Rank
- Talking to Computers: from Speech Sounds to Human Computer Interaction presenters: Giuseppe Riccardi and Sebastian Varges
- Machine Learning for Text and Speech Processing, presenters: Antal van den Bosch and Walter Daelemans
The online registration for the conference is now open. Authors of accepted papers should make sure that at least one author has registered by the early registration deadline of June 22, 2007. Papers without any registered author will be removed from the final program.
ACCOMMODATION: A large number of rooms has been reserved in several hotels, most of which are within a 5-10 minute walk from the conference centre. An online hotel reservation form is available on the conference website.
Antwerp is easily reached from the international airports of Brussels and Amsterdam.

For up to date information on the technical and social program and for practical arrangements please visit the conference website.
If you can't find an essential piece of information there, please send an email.
We hope to welcome you in Antwerp in August!

Dirk Van Compernolle, General Chair
Lou Boves, General Chair
Jean-Pierre Martens, Technical Program Chair
Helmer Strik, Technical Program Chair

September 22-26, 2008, Brisbane, Queensland, Australia
Conference Website
Chairman: Denis Burnham, MARCS, University of Western Sydney.

Brighton, UK,
Conference Website
Chairman: Prof. Roger Moore, University of Sheffield.

Chiba, Japan
Conference Website
ISCA is pleased to announce that INTERSPEECH 2010 will take place in Makuhari-Messe, Chiba, Japan, September 26-30, 2010. The event will be chaired by Keikichi Hirose (Univ. Tokyo), and will have as a theme "Towards Spoken Language Processing for All - Regardless of Age, Health Conditions, Native Languages, Environment, etc."



6th ISCA Speech Synthesis Research Workshop (SSW-6)

University of Bonn (Germany), August 22-24, 2007
A satellite of INTERSPEECH 2007 (Antwerp)in collaboration with SynSIG and IfK (University of Bonn)
Organized shortly after the 16th International Congress of Phonetic Sciences (Saarbrücken, Germany, August 6-10, 2007). Like its predecessors in Autrans (France) 1990, New Paltz (NY, USA) 1994, Jenolan (Australia) 1998, Pitlochry (UK) 2001, and Pittsburgh (PA, USA) 2004, SSW-6 will cover all aspects of speech synthesis and adjacent fields, such as:
TOPICS (updated list)
* Text processing for speech synthesis
* Prosody Generation for speech synthesis
* Speech modeling for speech synthesis applications
* Signal processing for speech synthesis
* Concatenative speech synthesis (diphones, polyphones, unit selection)
* Articulatory synthesis
* Statistical parametric speech synthesis
* Voice transformation/conversion/adaptation for speech synthesis
* Expressive speech synthesis
* Multilingual and/or multimodal speech synthesis
* Text-to-speech and content-to-speech
* Singing speech synthesis
* Systems and applications involving speech synthesis
* Techniques for assessing synthetic speech quality
* Language resources for speech synthesis
* Aids for the handicapped involving speech synthesis.
Deadlines (updated)
* Full-paper submission (up to 6 pages) - May 14, 2007 (EXTENDED DEADLINE!)
* Notification of acceptance - June 25, 2007
* Deadline for paper modification - July 15, 2007
Please send your papers, preferably as PDF files, as an e-mail attachment. Further information can soon be obtained from the workshop website.
Prof. Wolfgang Hess

8th Workshop on Discourse and Dialogue (SIGdial), Antwerp, Belgium

Antwerp, September 2-3, 2007
Held immediately following Interspeech 2007
Continuing with a series of successful workshops in Sydney, Lisbon, Boston, Sapporo, Philadelphia, Aalborg, and Hong Kong, this workshop spans the ACL and ISCA SIGdial interest area of discourse and dialogue. This series provides a regular forum for the presentation of research in this area to both the larger SIGdial community as well as researchers outside this community. The workshop is organized by SIGdial, which is sponsored jointly by ACL and ISCA.
Topics of Interest
We welcome formal, corpus-based, implementation or
analytical work on discourse and dialogue including but not restricted to the following three themes:
1. Discourse Processing and Dialogue Systems
Discourse semantic and pragmatic issues in NLP applications such as text summarization, question answering and information retrieval, including topics like:
· Discourse structure, temporal structure, information structure
· Discourse markers, cues and particles and their use
· (Co-)Reference and anaphora resolution, metonymy and bridging resolution
· Subjectivity, opinions and semantic orientation
Spoken, multi-modal, and text/web based dialogue systems including topics such as:
· Dialogue management models;
· Speech and gesture, text and graphics integration;
· Strategies for preventing, detecting or handling miscommunication (repair and correction types, clarification and under-specificity, grounding and feedback strategies);
· Utilizing prosodic information for understanding and for disambiguation;
2. Corpora, Tools and Methodology
Corpus-based work on discourse and spoken, text-based and multi-modal dialogue including its support, in particular:
· Annotation tools and coding schemes;
· Data resources for discourse and dialogue studies;
· Corpus-based techniques and analysis (including machine learning);
· Evaluation of systems and components, including methodology, metrics and case studies;
The pragmatics and/or semantics of discourse and dialogue (i.e. beyond a single sentence) including the following issues:
· The semantics/pragmatics of dialogue acts (including those which are less studied in the semantics/pragmatics framework);
· Models of discourse/dialogue structure and their relation to referential and relational structure;
· Prosody in discourse and dialogue;
· Models of presupposition and accommodation; operational models of conversational implicature.
The program committee welcomes the submission of long papers for full plenary presentation as well as short papers and demonstrations. Short papers and demo descriptions will be featured in short plenary presentations, followed by posters and demonstrations.
· Long papers must be no longer than 8 pages, including title, examples, references, etc. In addition to this, two additional pages are allowed as an appendix which may include extended example discourses or dialogues, algorithms, graphical representations, etc.
· Short papers and demo descriptions should aim to be 4 pages or less (including title, examples, references, etc.). Please use the official ACL style files. Submission/Reviewing will be managed by the START system. Link to follow. Papers that have been or will be submitted to other meetings or publications must provide this information (see submission format). SIGdial 07 cannot accept for publication or presentation work that will be (or has been) published elsewhere. Authors are encouraged to make illustrative materials available, on the web or otherwise. For example, excerpts of recorded conversations, recordings of human-computer dialogues, interfaces to working systems, etc.
Important Dates (subject to change)
Submission May 2, 2007
Notification June 13, 2007
Final submissions July 6, 2007
Workshop September 2-3, 2007
Workshop website: To be announced
Submission website: To be announced
Sigdial website
Interspeech 2007 website
Program Committee (confirmed)
Harry Bunt, Tilburg University, Netherlands (co-chair)
Tim Paek, Microsoft Research, USA (co-chair)
Simon Keizer, Tilburg University, Netherlands (local chair)
Wolfgang Minker, University of Ulm, Germany
David Traum, USC/ICT, USA

SLaTE Workshop on Speech and Language Technology in Education
ISCA Tutorial and Research Workshop

The Summit Inn, Farmington, Pennsylvania USA October 1-3, 2007.
Speech and natural language processing technologies have evolved from being emerging new technologies to being reliable techniques that can be used in real applications. One worthwhile application is Computer-Assisted Language Learning. This is not only helpful to the end user, the language learner, but also to the researcher who can learn more about the technology from observing its use in a real setting. This workshop will include presentations of both research projects and real applications in the domain of speech and language technology in education.
Full paper deadline: May 1, 2007.
Notification of acceptance: July 1, 2007.
Early registration deadline: August 1, 2007.
Preliminary programme available: September 1, 2007.
Workshop will take place: October 1-3, 2007.
The workshop will be held in the beautiful Laurel Highlands. In early October the vegetation in the Highlands puts on a beautiful show of colors and the weather is still not too chilly. The event will take place at the Summit Inn, situated on one of the Laurel Ridges. It is close to the Laurel Caverns where amateur spelunkers can visit the underground caverns. The first night event will be a hayride and dinner at a local winery and the banquet will take place at Frank Lloyd Wright’s wonderful Fallingwater.
The workshop will cover all topics which come under the purview of speech and language technology for education. In accordance with the spirit of the ITRWs, it will focus on research issues and results, applications, development tools and collaboration, and will provide information on tools. Papers will discuss theories, applications, evaluation, limitations, persistent difficulties, and general research tools and techniques. Papers that critically evaluate approaches or processing strategies will be especially welcome, as will prototype demonstrations of real-world applications.
The scope of acceptable topic interests includes but is not limited to:
- Use of speech recognition for CALL
- Use of natural language processing for CALL
- Use of spoken language dialogue for CALL
- Applications using speech and/or natural language processing for CALL
- CALL tutoring systems
- Assessment of CALL tutors

The workshop is being organized by the new ISCA Special Interest Group, SLaTE. The general chair is Dr. Maxine Eskenazi from Carnegie Mellon University .
As per the spirit of ITRWs, the format of the workshop will consist of a non-overlapping mixture of oral, poster and demo sessions. Internationally recognized experts from pertinent areas will deliver several keynote lectures on topics of particular interest. All poster sessions will be opened by an oral summary by the session chair. A number of poster sessions will be succeeded by a discussion session focussing on the subject of the session. The aim of this structure is to ensure a lively and valuable workshop for all involved. Furthermore, the organizers would like to encourage researchers and industrialists to bring along their applications, as well as prototype demonstrations and design tools where appropriate. The official language of the workshop is English. This is to help guarantee the highest degree of international accessibility to the workshop. At the opening of the workshop hardcopies and CD-ROM of the abstracts and proceedings will be available.
We seek outstanding technical articles in the vein discussed above. For those who intend to submit papers, the deadline is May 1, 2007. Following preliminary review by the committee, notification will be sent regarding acceptance/rejection. Interested authors should send full 4 page camera-ready papers.
The fee for the workshop, including a booklet of abstracts and the Proceedings on CD-ROM, is:
- $325 for ISCA members and
- $225 for ISCA student members with valid identification
Registrations after August 1, 2007 cannot be guaranteed.
All meals except breakfast for the two and a half days, as well as the two special events, are included in this price. Hotel accommodations are $119 per night, and breakfast is about $10. Upon request we will furnish bus transport from the Greater Pittsburgh Airport and from Pittsburgh to Farmington at a cost of about $30. ISCA membership is 55 Euros. You must be a member of ISCA to attend this workshop.

ITRW Odyssey 2008

The Speaker and Language Recognition Workshop
21-25 January 2008, Stellenbosch, South Africa
* Speaker recognition (identification, verification, segmentation, clustering)
* Text dependent and independent speaker recognition
* Multispeaker training and detection
* Speaker characterization and adaptation
* Features for speaker recognition
* Robustness in channels
* Robust classification and fusion
* Speaker recognition corpora and evaluation
* Use of extended training data
* Speaker recognition with speech recognition
* Forensics, multimodality and multimedia speaker recognition
* Speaker and language confidence estimation
* Language, dialect and accent recognition
* Speaker synthesis and transformation
* Biometrics
* Human recognition
* Commercial applications
Paper submission
Prospective authors are invited to submit papers written in English via the Odyssey website. The style guide, templates, and submission form can be downloaded from the Odyssey website. Two members of the scientific committee will review each paper. Each accepted paper must have at least one registered author. The Proceedings will be published on CD.
Draft paper due July 15, 2007
Notification of acceptance September 15, 2007
Final paper due October 30, 2007
Preliminary program November 30, 2007
Workshop January 21-25, 2008
Further information on the venue, registration, etc. can be found on the workshop website.
Niko Brummer, Spescom Data Voice, South Africa
Johan du Preez, Stellenbosch University, South Africa

ISCA TR Workshop on Experimental Linguistics 2008

August 2008, Athens, Greece
Prof. Antonis Botinis

ITRW on Evidence-based Voice and Speech Rehabilitation in Head & Neck Oncology

May 2008, Amsterdam, The Netherlands
Cancer in the head and neck area and its treatment can have debilitating effects on communication. Currently available treatment options such as radiotherapy, surgery, chemo-radiation, or a combination of these can often be curative. However, each of these options affects parts of the vocal tract and/or voice to a greater or lesser degree. When the vocal tract or voice no longer functions optimally, communication suffers. For example, radiotherapy can result in poor voice quality, limiting the speaker's vocal performance (fatigue from speaking, avoidance of certain communicative situations, etc.). Surgical removal of the larynx necessitates an alternative voicing source, which generally results in a poor voice quality and further affects intelligibility and the prosodic structure of speech. Similarly, a commando procedure (resection involving portions of the mandible / floor of the mouth / mobile tongue) can have a negative effect on speech intelligibility. This two-day tutorial and research workshop will focus on evidence-based rehabilitation of voice and speech in head and neck oncology. There will be 4 half-day sessions, 3 of which will deal with issues concerning total laryngectomy. One session will be devoted to research on rehabilitation of other head and neck cancer sites. The chairpersons of each session will prepare a work document on the specific topic at hand (together with the two keynote lecturers assigned), which will be discussed in a subsequent round table session. After this there will be a 30-minute poster session, allowing 9-10 short presentations. Each presentation consists of at most 4 slides and is meant to highlight the poster's key points. Posters will be visited in the subsequent poster visit session. The final work document will refer to all research presently available, discuss its (clinical) relevance, and attempt to provide directions for future research.
The combined work document, keynote lectures and poster abstracts/papers will be published under the auspices of ISCA.
prof. dr. Frans JM Hilgers
prof. dr. Louis CW Pols,
dr. Maya van Rossum.
Sponsoring institutions:
Institute of Phonetic Sciences - Amsterdam Center for Language and Communication,
The Netherlands Cancer Institute – Antoni van Leeuwenhoek Hospital
Dates and submission details as well as a website address will be announced in a later issue.

Audio Visual Speech Processing Workshop (AVSP 2008)

Tentative location: Queensland coast near Brisbane (most likely South Stradbroke Island)
Tentative date: 27-29 September 2008 (immediately after Interspeech 2008)
Following in the footsteps of previous AVSP workshops and conferences, AVSP 2008 (an ISCA Tutorial and Research Workshop) will be held in conjunction with Interspeech 2008, Brisbane, Australia, 22-26 September 2008. The aim of AVSP 2008 is to bring together researchers and practitioners in areas related to auditory-visual speech processing. These include human and machine AVSP, linguistics, psychology, and computer science. One of the aims of the AVSP workshops is to foster collaborations across disciplines, as AVSP research is inherently multi-disciplinary. The workshop will include a number of tutorials / keynote addresses by internationally renowned researchers in the area of AVSP.
Roland Goecke, Simon Lucey, Patrick Lucey
RSISE, Bldg. 115, Australian National University, Canberra, ACT 0200, Australia

Robust ASR Workshop

Santiago, Chile
October-November 2008
Dr. Nestor Yoma



AVSP 2007

International Conference on Auditory-Visual Speech Processing 2007,
August 31 - September 3, 2007
Kasteel Groenendael, Hilvarenbeek, The Netherlands
The next International Conference on Auditory-Visual Speech Processing (AVSP 2007) will be organised by members of Tilburg University (The Netherlands). It will take place in Kasteel Groenendael in Hilvarenbeek (The Netherlands) from August 31 until September 3, 2007, immediately following Interspeech 2007 in Antwerp (Belgium). Hilvarenbeek is a short distance from Antwerp, so attendance at AVSP 2007 can easily be combined with participation in Interspeech 2007.
Auditory-visual speech production and perception by human and machine is an interdisciplinary and cross-linguistic field which has attracted speech scientists, cognitive psychologists, phoneticians, computational engineers, and researchers in language learning studies. Since the inaugural workshop in Bonas in 1995, Auditory-Visual Speech Processing workshops have been organised on a regular basis (see an overview at the avisa website). In line with previous meetings, this conference will consist of a mixture of regular presentations (both posters and oral), and lectures by invited speakers. All presentations will be plenary.
We are happy to announce that the following experts have agreed to give a keynote lecture at our conference: Sotaro Kita (Birmingham)
Asif Ghazanfar (Princeton)
More details about the conference can be found on the website
Further information

The 12th International Conference on Speech and Computer

October 15-18, 2007
Organized by Moscow State Linguistic University
General Chairs:
Prof. Irina Khaleeva (Moscow State Linguistic University)
Prof. Rodmonga Potapova (Moscow State Linguistic University)
SPECOM'07 is the twelfth conference in the annual series of SPECOM events. It is organized by Moscow State Linguistic University and will be held in Moscow, Russia, under the sponsorship of Russian Foundation for Basic Research (RFBR), Ministry of Education and Science of Russian Federation, the International Speech Communication Association (ISCA) and others. SPECOM'07 will cover various aspects of speech science and technology. The program of the conference will include keynote lectures by internationally renowned scientists, parallel oral and poster sessions and an exhibition. The sci-tech exhibition that will be held during the conference will be open to companies and research institutions. The official language of the Conference will be English.
Important Dates (Extended)
Paper submission opening February 1, 2007
Full paper deadline (EXTENDED) June 20, 2007
Paper acceptance notification: July 1, 2007
Conference October 15-18, 2007
o Speech signal coding and decoding; multi-channel transmitted speech intelligibility; speech information security
o Speech production and perception modeling
o Automatic processing of multilingual, multimodal and multimedia information
o Linguistic, para- and extralinguistic communicative strategies
o Development and testing of automatic voice and speech systems for speaker verification; speaker psychoemotional state and native language identification
o Automatic speech recognition and understanding systems
o Language and speech information processing systems for robotechnics
o Automated translation systems
o New information technologies for spoken language acquisition, development and learning
o Text-to-speech conversion systems
o Spoken and written natural language corpora linguistics
o Multifunctional expert and information retrieval systems
o Future of multi-purpose and anti-terrorist speech technologies
The deadline for full paper submission (4-6 pages) is June 20, 2007 (extended; see Important Dates above). Papers are to be sent by e-mail. All manuscripts must be in English. Please note that the size of a single message must not exceed 10 Megabytes; the total size of all attached files should not exceed 7 Megabytes, to leave room for recoding operations performed by the e-mail software. If the paper files are larger than 7 Megabytes, it is recommended to pack them into a split WinRar or WinZip archive and send them part by part in a series of messages.
All the papers will be reviewed by an international scientific committee. Each author will be notified by e-mail of the acceptance or rejection of her/his paper by May 30, 2007. Minor updates of accepted papers will be allowed during May 30 - June 15, 2007.
Submission of a paper or poster is more likely to be accepted if it is original, innovative, and contributes to the practice of worldwide scientific communication. Quality of work, clarity and completeness of the submitted materials will be considered.
Registration will be available at the Conference on arrival. Early registration deadline: July 10, 2007. The registration fees are planned to be approximately as follows:
Regular 500 EUR
Students/PG Students 200 EUR
NIS (New Independent States), Regular 300 EUR
NIS, Students/PG Students 100 EUR
Russia, Regular 150 EUR
Russia, Students/PG Students (no Proceedings) Free
Extra copy of Proceedings (hard copy) 20 EUR
Extra Proceedings CD/DVD 10 EUR
Information regarding accommodation costs will be available later. All the registration and accommodation payments will be accepted in cash during the registration procedure on arrival.
In the following you will find guidelines for preparing your full paper to SPECOM'07 electronically.
· To achieve the best viewing experience both for the Proceedings and the CD (or DVD), we strongly encourage you to use Times Roman font. This is needed in order to give the Proceedings a uniform look. Please use the attached printable version of this newsletter as a model.
· Authors are requested to submit PDF files of their manuscripts, generated from the original Microsoft Word sources. PDF files can be generated with commercially available tools or with free software such as PDFCreator.
· Paper Title - The paper title must be in boldface. All non-function words must be capitalized, and all other words in the title must be lower case. The paper title is centered.
· Authors' Names - The authors' names (italicized) and affiliations (not italicized) appear centered below the paper title.
· Abstract - Each paper must contain an abstract that appears at the beginning of the paper.
· Major Headings - Major headings are in boldface.
· Sub Headings - Sub headings appear like major headings, except that they are in italics and not bold face.
· References - Number and list all references at the end of the paper. The references are numbered in order of appearance in the document. When referring to them in the text, type the corresponding reference number in square brackets as shown at the end of this sentence [1].
· Illustrations - Illustrations must appear within the designated margins, and must be positioned within the paper margins. Caption and number every illustration. All half-tone or color illustrations must be clear when printed in black and white. Line drawings must be made in black ink on white paper.
· Do NOT include headers and footers. The page numbers, session numbers and conference identification will be inserted automatically in a post processing step, at the time of printing the Proceedings.
· Apart from the paper in PDF format, authors can upload multimedia files to illustrate their submission. Multimedia files can be used to include materials such as sound files or movies. The proceedings CD (DVD) will NOT contain readers or players, so only widely accepted file formats should be used, such as MPEG, Windows WAVE PCM (.wav) or Windows Media Video (.wmv), using only standard codecs to maximize compatibility. Authors must ensure that they have sufficient author rights to the material that they submit for publication. Archives (RAR, ZIP or ARJ format) are allowed. The archives will be unpacked on the CD (DVD), so that authors can refer to the file name of the multimedia illustration from within their paper. The submitted files will be accessible from the abstract card on the CD (DVD) and via a bookmark in the manuscript. We advise to use SHORT but meaningful file names. The total unzipped size of the multimedia files should be reasonable. It is recommended that they do not exceed 32 Megabytes.
· Although no copyright forms are required, the authors must agree that their contribution, when accepted, will be archived by the Organizing Committee.
· Authors must proofread their manuscripts before submission and they must proofread the exact files which they submit.
Only electronic presentations are accepted. PowerPoint presentations can be supplied on CD, DVD, FD or USB Flash drives. Designated poster space will be wooden or felt boards. The space allotted to one speaker will measure 100 cm (width) x 122 cm (height). Posters will be attached to the boards using pushpins. Pins will be provided. Thanks for following all of these instructions carefully! If you have any questions or comments concerning the submission, please don't hesitate to contact the conference organizers. Please address all technical issues or questions regarding paper submission or presentation to our technical assistant Nikolay Bobrov.


Automatic Speech Recognition and Understanding Workshop
The Westin Miyako Kyoto, Japan
December 9 -13, 2007
Conference website
The tenth biennial IEEE workshop on Automatic Speech Recognition and Understanding (ASRU), held in cooperation with ISCA, will take place during December 9-13, 2007. The ASRU workshops have a tradition of bringing together researchers from academia and industry in an intimate and collegial setting to discuss problems of common interest in automatic speech recognition and understanding.
Authors are encouraged to submit papers in all areas of human language technology, with emphasis placed on:
- automatic speech recognition and understanding technology
- speech to text systems
- spoken dialog systems
- multilingual language processing
- robustness in ASR
- spoken document retrieval
- speech-to-speech translation
- spontaneous speech processing
- speech summarization
- new applications of ASR.
The workshop program will consist of invited lectures, oral and poster presentations, and panel discussions. Prospective authors are invited to submit full-length, 4-6 page papers, including figures and references, to the ASRU 2007 website. All papers will be handled and reviewed electronically. The website will provide you with further details. There is also a demonstration session, which has become another highlight of the ASRU workshop. Demonstration proposals will be handled separately. Please note that the submission dates for papers are strict deadlines.
Paper submission deadline July 16, 2007
Paper acceptance/rejection notification September 3, 2007
Demonstration proposal deadline September 24, 2007
Workshop advance registration deadline October 15, 2007
Workshop December 9-13, 2007
Registration will be handled via the ASRU 2007 website .
General Chairs:
Sadaoki Furui (Tokyo Inst. Tech.)
Tatsuya Kawahara (Kyoto Univ.)
Technical Chairs:
Jean-Claude Junqua (Panasonic)
Helen Meng (Chinese Univ. Hong Kong)
Satoshi Nakamura (ATR)
Publication Chair:
Timothy Hazen, MIT, USA
Publicity Chair:
Tomoko Matsui, ISM, Japan
Demonstration Chair:
Kazuya Takeda, Nagoya U, Japan

CfP-3rd International Conference on Large-scale Knowledge Resources (LKR 2008)

3-5 March, 2008, Tokyo Institute of Technology, Tokyo Japan
Sponsored by: 21st Century Center of Excellence (COE) Program "Framework for Systematization and Application of Large-scale Knowledge Resources", Tokyo Institute of Technology
In the 21st century we are moving towards a knowledge-intensive society in which knowledge plays ever more important roles. Research interest is inevitably shifting from information to knowledge: how to build, organize, maintain and utilize knowledge are the central issues in a wide variety of fields. The 21st Century COE program "Framework for Systematization and Application of Large-scale Knowledge Resources (COE-LKR)", conducted by Tokyo Institute of Technology, is one attempt to address these important issues. Inspired by this project, LKR2008 aims at bringing together diverse contributions in cognitive science, computer science, education and linguistics to explore the design, construction, extension, maintenance, validation and application of knowledge.
Topics of interest to the conference include:
Infrastructure for Large-scale Knowledge
Grid computing
Network computing
Software tools and development environments
Database and archiving systems
Mobile and ubiquitous computing
Systematization for Large-scale Knowledge
Language resources
Multi-modal resources
Classification, Clustering
Formal systems
Knowledge representation and ontology
Semantic Web
Cognitive systems
Collaborative knowledge
Applications and Evaluation of Large-scale Knowledge
Archives for science and art
Educational media
Information access
Document analysis
Multi-modal human interface
Web applications
Organizing committee
General conference chair: Furui, Sadaoki (Tokyo Institute of Technology)
Program co-chairs: Ortega, Antonio (University of Southern California)
Tokunaga, Takenobu (Tokyo Institute of Technology)
Publication chair: Yonezaki, Naoki (Tokyo Institute of Technology)
Publicity chair: Yokota, Haruo (Tokyo Institute of Technology)
Local organizing chair: Shinoda, Koichi (Tokyo Institute of Technology)
Since we are aiming at an interdisciplinary conference covering a wide range of topics concerning large-scale knowledge resources, authors are requested to add a general introductory description at the beginning of the paper so that readers from other research areas can understand the importance of the work. Note that one of the reviewers of each paper is assigned from another topic area to check that this requirement is fulfilled.
There are two categories of paper presentation: oral and poster. The category of the paper should be stated at submission. Authors are invited to submit original unpublished research papers, in English, up to 12 pages for oral presentation and 4 pages for poster presentation, strictly following the LNCS/LNAI format guidelines available at the Springer LNCS Web page. Details of the submission procedure will be announced later.
The reviewing of the papers will be blind and managed by an international Conference Program Committee consisting of Area Chairs and associated Program Committee Members. Final decisions on the technical program will be made by a meeting of the Program Co-Chairs and Area Chairs. Each submission will be reviewed by at least three program committee members, and one of the reviewers is assigned from a different topic area.
The conference proceedings will be published by Springer-Verlag in their Lecture Notes in Artificial Intelligence (LNAI), which will be available at the conference.
Important dates
Paper submission deadline: 30 August, 2007
Notification of acceptance: 10 October, 2007
Camera ready papers due: 10 November, 2007
e-mail correspondence

Call for Papers (Preliminary version) Speech Prosody 2008

Campinas, Brazil, May 6-9, 2008
Speech Prosody 2008 will be the fourth conference in a series of international events of the ISCA Special Interest Group on Speech Prosody, beginning with the conference held in Aix-en-Provence, France, in 2002. The conferences in Nara, Japan (2004) and Dresden, Germany (2006) followed the proposal of biennial meetings, and now it is time to change place and hemisphere, taking up the challenge of offering a non-stereotypical view of Brazil. It is a great pleasure for our labs to host the fourth International Conference on Speech Prosody in Campinas, Brazil, the second major city of the State of São Paulo. It is worth highlighting that prosody covers a multidisciplinary area of research involving scientists from very different backgrounds and traditions, including linguistics and phonetics, conversation analysis, semantics and pragmatics, sociolinguistics, acoustics, speech synthesis and recognition, cognitive psychology, neuroscience, speech therapy, language teaching, and related fields. We invite all participants to contribute papers presenting original research from all areas of speech prosody, especially, but not limited to, the following.
Scientific Topics
Prosody and the Brain
Long-Term Voice Quality
Intonation and Rhythm Analysis and Modelling
Syntax, Semantics, Pragmatics and Prosody
Cross-linguistic Studies of Prosody
Prosodic variability
Prosody in Discourse
Dialogues and Spontaneous Speech
Prosody of Expressive Speech
Perception of Prosody
Prosody in Speech Synthesis
Prosody in Speech Recognition and Understanding
Prosody in Language Learning and Acquisition
Pathology of Prosody and Aids for the Impaired
Prosody Annotation in Speech Corpora
Others (please specify)
Organising institutions
Speech Prosody Studies Group, IEL/Unicamp | Lab. de Fonética, FALE/UFMG | LIACC, LAEL, PUC-SP
Important Dates
Call for Papers: May 15, 2007
Full Paper Submission: Sept. 30, 2007
Notif. of Acceptance: Nov. 30, 2007
Early Registration: Dec. 20, 2007
Conference: May 6-9, 2008



ICPhS 2007 Satellite Workshop on Speaker Age

Saarland University, Saarbruecken, Germany
August 4, 2007
Submission Deadline: April 15, 2007
This workshop is dedicated to current research on speaker age, a speaker-specific quality which is always present in speech. Although researchers have investigated several aspects of speaker age, numerous questions remain, including (1) the accuracy with which human listeners and automatic recognizers are able to judge child and adult speaker age from speech samples of different types and lengths, (2) the acoustic and perceptual features (and combinations of features) which contain the most important age-related information, and (3) the optimal methods for extracting age-related features and integrating speaker age into speech technology and forensic applications. The purpose of the workshop is to bring together participants from divergent backgrounds (e.g. forensics, phonetics, speech therapy and speech technology) to share their expertise and results. Further information can be found on the workshop website.
The topics cover, among others, the following issues:
- methods and tools to identify acoustic correlates of speaker age
- systems which automatically recognize (or estimate) speaker age
- studies on the human perception of speaker age
- projects on the synthesis of speaker age
If you are interested in contributing to the workshop, please send an extended abstract to both of the organizers Christian Mueller and Susanne Schotz by April 15, 2007. Contributions on work in progress are specifically encouraged. The abstract does not have to be formatted. Feel free to send .doc, .pdf, .txt or .tex files.

Interdisciplinary Workshop on "The Phonetics of Laughter"

5 August 2007
Saarbrücken, Germany
Aim of the workshop
Research investigating the production, acoustics and perception of laughter is very rare. This is striking because laughter occurs as an everyday and highly communicative phonetic activity in spontaneous discourse. This workshop aims to bring researchers together from various disciplines to present their data, methods, findings, research questions, and ideas on the phonetics of laughter (and smiling).
The workshop will be held as a satellite event of the 16th International Congress of Phonetic Sciences in Saarbrücken, Germany.
We invite submission of short papers of approximately 1500 words. Oral presentations will be 15 minutes plus 5 minutes discussion time. Additionally, there will be a poster session. All accepted papers will be available as on-line proceedings on the web; there will be no printed proceedings. We also plan to publish a selection of the papers.
All submissions will be reviewed anonymously by two reviewers. Please send submissions by e-mail, specifying "short paper" in the subject line and providing:
1. For each author: name, title, and affiliation (in the body of the mail)
2. The title of the paper
3. The preferred presentation mode (oral or poster)
4. The short paper as plain text
In addition, you can submit audio files (as .wav), graphic files (as .jpg) and video clips (as .mpg). All files together should not exceed 1 MB.
Important dates
Submission deadline for short papers: March 16, 2007
Notification of acceptance: May 16, 2007
Early registration deadline: June 16, 2007
Workshop dates: August 5, 2007
Plenary lecture
Wallace Chafe (University of California, Santa Barbara)
Organisation Committee
Nick Campbell (ATR, Kyoto)
Wallace Chafe (University of California, Santa Barbara)
Jürgen Trouvain (Saarland University & Phonetik-Büro Trouvain, Saarbrücken)
The laughter workshop will take place in the Centre for Language Research and Language Technology on the campus of the Saarland University in Saarbrücken, Germany. The campus is located in the woods and is 5 km from the town centre of Saarbrücken.
Jürgen Trouvain Saarland University
FR. 4.7: Computational Linguistics and Phonetics
Building C7.4
Postfach 15 11 50
66041 Saarbrücken

16th International Congress of Phonetic Sciences

Saarland University, Saarbrücken, Germany
6-10 August 2007.
The first call for papers was made in April 2006. The deadline for full-paper submission to ICPhS 2007 was February 2007. Further information is available on the conference website.

ParaLing'07: International workshop on "Paralinguistic speech - between models and data"

Thursday 2 - Friday 3 August 2007
Saarbrücken, Germany
Workshop website
in association with the 16th International Congress of Phonetic Sciences, Saarbrücken, Germany, 6-10 August 2007
Summary of the call for participation
This two-day workshop is concerned with the general area of paralinguistic speech, and will place special emphasis on attempts to narrow the gap between "models" (usually built making strong simplifying assumptions) and "real data" (usually showing a high degree of complexity).
Papers are invited in a broad range of topics related to paralinguistic speech. Papers can be submitted for oral or poster presentation; acceptance for oral presentation is more likely for papers that explicitly address the general theme of the workshop, i.e. "bridging" issues.
There are at least two different versions of bridging: a weak one and a strong one. The weak, more modest one aims at a better mutual understanding, the strong one at profiting from each other's work. We do not know yet whether after these two days, we really will be able to profit from each other in our own work; however, we do hope that we will have reached a level of mutual understanding that will make future co-operation easier.
Research on various aspects of paralinguistic and extralinguistic speech has gained considerable importance in recent years. On the one hand, models have been proposed for describing and modifying voice quality and prosody related to factors such as emotional states or personality. Such models often start with high-intensity states (e.g., full-blown emotions) in clean lab speech, and are difficult to generalise to everyday speech. On the other hand, systems have been built to work with moderate states in real-world data, e.g. for the recognition of speaker emotion, age, or gender. Such models often rely on statistical methods, and are not necessarily based on any theoretical models.
While both research traditions are obviously valid and can be justified by their different aims, it seems worth asking whether there is anything they can learn from each other. For example: "Can models become more robust by incorporating methods used for dealing with real-world data?"; "Can recognition rates be improved by including ideas from theoretical models?"; "How would a database need to be structured so that it can be used for both, research on model-based synthesis and research on recognition?" etc.
While the workshop will be open to any kind of research on paralinguistic speech, the workshop structure will support the presentation and creation of cross-links in several ways:
- papers with an explicit contribution to cross-linking issues will stand a higher chance to be accepted as oral papers;
- sessions and proceedings will include space for peer comments and answers from authors;
- poster sessions will be organised around cross-cutting issues rather than traditional research fields, where possible.
We therefore encourage prospective participants to place their research into a wider perspective. This can happen in many ways; as illustrations, we outline two possible approaches.
1. In application-oriented research, such as synthesis or recognition, a guiding principle could be the requirements of the "ideal" application: for example, the recognition of finely graded shades of emotions, for all speakers in all situations; or fully natural-sounding synthesis with freely specifiable expressivity; etc. This perspective is likely to highlight the hard problems of today's state of the art, and a cross-cutting perspective may lead to innovative approaches yielding concrete steps to reduce the distance towards the "ideal".
2. A second illustration of attaining a wider perspective would be to attempt to cross-link work in generative modelling (e.g., expressive speech synthesis) and analysis (e.g., recognition of expressivity from speech). Researchers on generation are invited to investigate the relevance of their work for analysis, and vice versa. What methodologies, corpora or descriptive inventories exist that could be shared between analysis and generation, or at least mapped onto each other? If certain parameters have proven to be relevant in one area, to what degree is it possible to transfer them to the other area? Issues of relevance in this area may include, among other things, personalisation, speaker dependency vs. independency, links between voice conversion in synthesis and speaker calibration in (automatic) recognition or (human) perception, etc.
Papers are invited in all areas related to paralinguistic speech, including, but not limited to, the following topics:
- prosody of paralinguistic speech
- voice quality and paralinguistic speech
- synthesis of paralinguistic speech (model-based, data-driven, ...)
- recognition/classification of paralinguistic properties of speech
- analysis of paralinguistic speech (acoustics, physiology, ...)
- assessment and perception of paralinguistic speech
- typology of paralinguistic speech (emotion, expression, attitude, physical states, ...)
While all papers must be related to paralinguistic speech, papers making the link with a related area, e.g. investigating the interaction of the speech signal with the meaning of the verbal content, are explicitly welcome.
1st call for papers 1 December 2006
2nd call for papers 1 February 2007
Deadline for full-paper submission 23 April 2007 (extended deadline!)
Notification of acceptance 1 June 2007
Final version of accepted papers 15 June 2007
Workshop 2-3 August 2007
The workshop will take place at DFKI on the campus of Saarland University, Germany; the International Congress of Phonetic Sciences will take place on the same campus during the following week.
Workshop registration fees: to be confirmed, but approximately 150 EUR.
The workshop will consist of oral and poster presentations. Submitted papers will stand a higher chance of being accepted as oral presentations when the relevance to the workshop theme is evident.
Final submissions should be 6 pages long, and must be in English. Word+Latex+OpenOffice templates will be made available on the workshop website.
Marc Schröder, DFKI GmbH, Saarbrücken, Germany
Anton Batliner, University of Erlangen-Nürnberg, Germany
Christophe d'Alessandro, LIMSI, Paris, France

eNTERFACE summer workshops

Bogazici University,
Istanbul, Turkey.
These 4-week workshops, organized by the SIMILAR European Network of Excellence, "aim at establishing a tradition of collaborative, localized research and development work by gathering, in a single place, a group of senior project leaders, researchers, and (undergraduate) students, working together on a pre-specified list of challenges". The two previous workshops were held in Mons (2005) and Dubrovnik (2006), and ISCA was a sponsor of both.
Prof. Murat Saraclar

SSP 2007 - IEEE Statistical Signal Processing Workshop

The Statistical Signal Processing (SSP) workshop, sponsored by the IEEE Signal Processing Society, brings members of the IEEE Signal Processing Society together with researchers from allied fields such as statistics and bioinformatics. The scope of the workshop includes basic theory, methods and algorithms, and applications of statistics in signal processing.
Theoretical topics:
- Adaptive systems and signal processing
- Monte Carlo methods
- Detection and estimation theory
- Learning theory and pattern recognition
- Multivariate statistical analysis
- System identification and calibration
- Time-frequency and time-scale analysis
Application areas:
- Bioinformatics and genomic signal processing
- Automotive and industrial applications
- Array processing, radar, and sonar
- Communication systems and networks
- Sensor networks
- Information forensics and security
- Biosignal processing and medical imaging
- New methods, directions, and applications
Date and venue
The workshop will be held on August 26-29, 2007, in Madison, Wisconsin, a vibrant city situated on a narrow isthmus between two large lakes. The venue is the spectacular Frank Lloyd Wright-inspired Monona Terrace Convention Center. Plenary lecturers include:
- William Freeman (MIT)
- Emmanuel Candes (Caltech)
- George Papanicolaou (Stanford)
- Nir Friedman (Hebrew University)
- Richard Davidson (Univ. of Wisconsin)
How to submit
Paper submission: Prospective authors are invited to submit extended summaries of not more than three (3) pages, including results, figures, and references. Papers will be accepted only by electronic submission via e-mail, starting March 1, 2007.
Important dates
Submission of 3-page extended summary: April 1, 2007
Notification of acceptance: June 1, 2007
Submission of 5-page camera-ready papers: July 1, 2007
The workshop will include a welcoming reception, banquet, technical poster sessions, several special invited sessions, and several plenary lectures.
Further information available on conference web.

Speech and Audio Processing in Intelligent Environments

Special Session at Interspeech 2007, Antwerp, Belgium
Ambient Intelligence (AmI) describes the vision of technology that is invisible, embedded in our surroundings and present whenever we need it. Interacting with it should be simple and effortless. The systems can think on their own and can make our lives easier with subtle or no direction. Since the early days of this computing and interaction paradigm speech has been considered a major building block of AmI. The purpose of speech and audio processing is twofold:
• Support of explicit interaction: Speech as an input/output modality that facilitates the aforementioned simple and effortless interaction, preferably in cooperation with other modalities like gesture.
• Support of implicit interaction: Speech, and acoustic signals in general, as a source of context information which provide valuable information, such as “who speaks when and where”, to be utilized in systems that are context-aware, personalized, adaptive, or even anticipatory.
The goal of this special session is to give an overview of major achievements, but also to highlight major challenges. Does the state of the art in speech and audio processing meet the high expectations expressed in the scenarios of AmI, and will it ever do so? We would also like to address the perspectives and promising concepts for the future. The session will consist of an introduction in the form of a short tutorial, followed by the presentation of contributed papers, and will conclude with a panel discussion.
Researchers who are interested in contributing to this special session are invited to submit a paper according to the regular submission procedure of INTERSPEECH 2007, and to select “Speech and Audio Processing in Intelligent Environments” as the topic of their first choice. The paper submission deadline is March 23, 2007.
The subjects to be covered include, but are not restricted to:
- Speech and audio processing for context acquisition (e.g. online speaker change detection and tracking, acoustic scene analysis, audio partitioning and labelling)
- Ubiquitous speech recognition (e.g. ASR with distant microphones, distributed speech recognition)
- Context-aware and personalized speech processing (e.g. in spoken dialogue processing, acoustic and language modelling)
- Speech processing for intelligent systems (e.g. descriptions of prototypes, projects)
Session organizers:
Prof. Dr. Reinhold Haeb-Umbach
Department of Communications Engineering
University of Paderborn, Germany
Prof. Dr. Zheng-Hua Tan
Department of Electronic Systems
Aalborg University, Denmark

The 2007 International Workshop on Intelligent Systems and Smart Home (WISH-07)

Conference website
Niagara Falls, Canada, August 28-September 1, 2007
In Conjunction with The 5th International Symposium on Parallel and Distributed Processing and Applications (ISPA-07)
Workshop Overview
Smart Home Environments (SHE) are emerging rapidly as an exciting new paradigm, drawing on ubiquitous, grid, and peer-to-peer computing to provide computing and communication services any time and anywhere. Our workshop is intended to foster the dissemination of state-of-the-art research in the area of SHE, including intelligent systems, security services, business models, and novel applications associated with its utilization. The goal of this workshop is to bring together researchers from academia and industry, as well as practitioners, to share ideas, problems and solutions relating to all aspects of intelligent systems and smart homes.
We invite authors to submit papers on any aspect of intelligent systems and smart home research and practice. All papers will be peer reviewed, and those accepted for the workshop will be included in a proceedings volume published by Springer-Verlag.
Workshop Topics (include but are not limited to the following)
I. Intelligent Systems
- Ubiquitous and artificial intelligence
- Environment sensing / understanding
- Information retrieval and enhancement
- Intelligent data analysis and e-mail processing
- Industrial applications of AI
- Knowledge acquisition, engineering, discovery and representation
- Machine learning and translation
- Mobile / Wearable intelligence
- Natural language processing
- Neural networks and intelligent databases
- Data mining and Semantic web
- Computer-aided education
- Entertainment
- Metrics for evaluating intelligent systems
- Frameworks for integrating AI and data mining
II. Smart Home
- Wireless sensor networks (WSN) / RFID application for SH
- Smart Space (Home, Building, Office) applications and services
- Smart Home network middleware and protocols
- Context Awareness for Smart Home Services
- Multimedia Security and Services in SH
- Security Issues for SHE
- Access control and Privacy Protection in SH
- Forensics and Security Policy in SH
- WSN / RFID Security in SH
- Commercial and industrial system & application for SH
Important Dates
Paper Submission deadline March 31, 2007
Acceptance notification May 21, 2007
Camera-ready due June 01, 2007
Workshop date August 28 - September 1, 2007
* Steering Co-Chairs
Laurence T. Yang, St Francis Xavier University, Canada
Minyi Guo, University of Aizu, Japan
* General Co-chairs
Ching-Hsien Hsu, Chung Hua University, Taiwan
Jong Hyuk Park, Hanwha S&C Co., Ltd., Korea
* Program Co-chairs
Cho-Li Wang, The University of Hong Kong, Hong Kong
Gang Pan, Zhejiang University, China
Proceeding & Special Issue
The workshop proceedings will be published by Lecture Notes in Computer Science (LNCS).
Papers should not exceed 12 pages, in free layout style, and should be submitted through the website. Submission of a paper should be regarded as a commitment that, if the paper is accepted, at least one of the authors will register and attend the conference; otherwise the paper will be removed from the LNCS digital library. A selection of the best papers will be published in special issues of Information Systems Frontiers (ISF) and the International Journal of Smart Home (IJSH), respectively.
For further information regarding the WISH-07 and paper submission, please contact ASWAN '07 Cyber-chair or Prof. Hsu.

Special session at Interspeech 2007: Novel techniques for the NATO non-native military air traffic controller database (nn-matc)

Following a series of special interest sessions and (satellite) workshops, at Lisbon (1995), Leusden (NL, 1999) and Aalborg (2001), the NATO research task group on speech and language technology, RTO IST031-RTG013, organizes a special session at Interspeech 2007. After having studied various aspects of speech in noise, speech under stress, and non-native speech, the research task group has been studying the effects of all of these factors on various speech technologies.
To this end, the task group has collected a corpus of military Air Traffic Control communication in Belgian air space. This speech material consists predominantly of non-native English speech, under varying noise and channel conditions. The database has been annotated at several levels:
* word transcriptions, which allow research to be conducted on automatic speech recognition and named entity extraction;
* speaker turns, identified by call signs, allowing for research in speaker recognition and in clustering and tracking of conversations.
The database consists of 16 hours of training speech, plus one hour of development and evaluation test sets.
The NATO research task group is making this annotated speech database available for speech researchers, who want to develop novel algorithms for this challenging material. These new algorithms could include noise-robust speaker recognition, robust speaker and accent adaptation for ASR, and context driven named entity detection. In order to facilitate a common task, we have written a suggested test and evaluation plan to guide researchers. At the special session we will discuss research results on this common data set.
More information on the special session, the database and the evaluation plan can be found on the web-site
Researchers who are interested in contributing to this special session are invited to submit a paper according to the regular submission procedure of INTERSPEECH 2007, and to select `Novel techniques for the NATO non-native Air Traffic Control database' in the special session field of the paper submission form. The paper submission deadline is March 23, 2007.
Session organizer: David van Leeuwen
TNO Human Factors
P. O. Box 23
3769 ZG Soesterberg
The Netherlands

Structure-Based and Template-Based Automatic Speech Recognition - Comparing parametric and non-parametric approaches

Special Session at INTERSPEECH 2007, Antwerp, Belgium
While hidden Markov modeling (HMM) remains the dominant technology for acoustic modeling in automatic speech recognition today, many of its weaknesses are also well known and have become the focus of intensive research. One prominent weakness of current HMMs is their handicap in representing long-span temporal dependency in the acoustic feature sequence of speech, which is nevertheless an essential property of speech dynamics. The main cause of this handicap is the conditional IID (independent and identically distributed) assumption inherent in the HMM formalism. Furthermore, the standard HMM approach focuses on verbal information, although experiments have shown that non-verbal information also plays an important role in human speech recognition, which the HMM framework has not attempted to address directly. Numerous approaches have been taken over the past dozen years to address these weaknesses of HMMs. They can be broadly classified into the following two categories.
The first, parametric, structure-based approach establishes mathematical models for stochastic trajectories/segments of speech utterances using various forms of parametric characterization, including polynomials, linear dynamic systems, and nonlinear dynamic systems embedding hidden structure of speech dynamics. In this parametric modeling framework, systematic speaker variation can also be satisfactorily handled. The essence of such a hidden-dynamic approach is that it exploits knowledge and mechanisms of human speech production so as to provide the structure of the multi-tiered stochastic process models. A specific layer in this type of models represents long-range temporal dependency in a parametric form.
The second, non-parametric and template-based approach to overcoming the HMM weaknesses involves direct exploitation of speech feature trajectories (i.e., 'template') in the training data without any modeling assumptions. Due to the dramatic increase of speech databases and computer storage capacity available for training, as well as the exponentially expanded computational power, non-parametric methods using the traditional pattern recognition techniques of kNN (k-nearest-neighbor decision rule) and DTW (dynamic time warping) have recently received substantial attention. Such template-based methods have also been called exemplar-based or data-driven techniques in the literature.
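To make the contrast concrete, here is a minimal, purely illustrative sketch (not from the call itself) of the DTW-plus-kNN template matching described above; scalar features stand in for real acoustic feature vectors:

```python
# Illustrative sketch of dynamic time warping (DTW): it aligns two feature
# sequences of different lengths and returns the cumulative distance along
# the cheapest warping path, without any parametric model of the trajectory.
def dtw(template, utterance, dist=lambda a, b: abs(a - b)):
    n, m = len(template), len(utterance)
    INF = float("inf")
    # cost[i][j] = cheapest alignment of template[:i] with utterance[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(template[i - 1], utterance[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch template
                                 cost[i][j - 1],      # stretch utterance
                                 cost[i - 1][j - 1])  # step both
    return cost[n][m]

# A k-nearest-neighbour (kNN) recogniser then labels an utterance with the
# majority class of the templates that warp to it most cheaply.
def knn_label(utterance, labelled_templates, k=1):
    scored = sorted(labelled_templates, key=lambda t: dtw(t[0], utterance))
    votes = [label for _, label in scored[:k]]
    return max(set(votes), key=votes.count)
```

With real speech data the scalars would be frame-level feature vectors and `dist` a vector distance, but the alignment recursion is unchanged.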
The purpose of this special session is to bring together researchers who have special interest in novel techniques that are aimed at overcoming weaknesses of HMMs for acoustic modeling in speech recognition. In particular, we plan to address issues related to the representation and exploitation of long-range temporal dependency in speech feature sequences, the incorporation of fine phonetic detail in speech recognition algorithms and systems, comparisons of pros and cons between the parametric and non-parametric approaches, and the computation resource requirements for the two approaches.
This special session will start with an oral presentation providing an introduction to the topic, a short overview of the issues involved, directions that have already been taken, and possible new approaches. The contributed papers will then be presented, and the session will end with a panel discussion.
Session organizers:
Li Deng
Helmer Strik
Information about this special session can also be found at the Interspeech Website
or at the Special session website

2007 Young Researchers' Roundtable on Spoken Dialog Systems (YRRSDS)

(Interspeech 2007 Satellite Event)
Antonio Roque - PhD student, USC/ICT

Machine Learning for Spoken Dialogue Systems: Special Session at INTERSPEECH 2007, Antwerp, Belgium

Submission deadline: 23rd March
Interspeech 2007 website
During the last decade, research in the field of Spoken Dialogue Systems (SDS) has experienced increasing growth. Yet the design and optimization of SDS does not simply involve combining speech and language processing systems such as Automatic Speech Recognition (ASR), parsers, Natural Language Generation (NLG), and Text-to-Speech (TTS) synthesis. It also requires the development of dialogue strategies taking into account the performances of these subsystems, the nature of the dialogue task (e.g. form filling, tutoring, robot control, or search), and the user's behaviour (e.g. cooperativeness, expertise). Currently, statistical learning techniques are emerging for training and optimizing speech recognition, parsing, and generation in SDS, depending on representations of context. Automatic learning of optimal dialogue strategies is also a leading research topic.
Among machine learning techniques for dialogue strategy optimization, Reinforcement Learning using Markov Decision Processes (MDPs) and Partially Observable MDP (POMDPs) has become a particular focus. One concern for such approaches is the development of appropriate dialogue corpora for training and testing.
Dialogue simulation is often required to expand existing corpora and so spoken dialogue simulation has become a research field in its own right. Other areas of interest are statistical approaches in context-sensitive speech recognition, trainable NLG, and statistical parsing for dialogue.
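As a purely illustrative sketch (not part of the call), the flavour of reinforcement learning of dialogue strategies can be shown with tabular Q-learning on a toy one-slot form-filling MDP; all states, actions, rewards, and probabilities below are invented for illustration:

```python
import random

# Toy form-filling dialogue MDP (states/actions/rewards invented for
# illustration): the system must fill one slot; "ask" usually succeeds,
# while "confirm" before the slot is filled only wastes a turn.
STATES = ["empty", "filled", "done"]
ACTIONS = ["ask", "confirm"]

def step(state, action, rng):
    if state == "empty":
        if action == "ask":
            # the simulated user answers usably 80% of the time
            return ("filled", -1) if rng.random() < 0.8 else ("empty", -1)
        return ("empty", -1)          # confirming an empty slot is useless
    if state == "filled":
        if action == "confirm":
            return ("done", 10)       # task success
        return ("filled", -1)         # re-asking a filled slot wastes a turn
    return ("done", 0)

def q_learn(episodes=5000, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = "empty"
        while s != "done":
            # epsilon-greedy action selection
            a = rng.choice(ACTIONS) if rng.random() < eps else \
                max(ACTIONS, key=lambda x: q[(s, x)])
            s2, r = step(s, a, rng)
            best_next = max(q[(s2, x)] for x in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    return q

q = q_learn()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in ["empty", "filled"]}
```

The learned greedy policy asks first and confirms once the slot is filled; in real systems the hand-coded user simulation above is replaced by a simulator trained on dialogue corpora.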
The purpose of this special session is to offer the opportunity to the international community concerned with these topics to share ideas and have constructive discussions in a single, focussed, special conference session.
Submission instructions
Researchers who are interested in contributing to this special session are invited to submit a paper according to the regular submission procedure of INTERSPEECH 2007, and to select "Machine Learning for Spoken Dialogue Systems" in the special session field of the paper submission form. The paper submission deadline is March 23, 2007.
The subjects to be covered include, but are not restricted to:
* Reinforcement Learning of dialogue strategies
* Partially Observable MDPs in dialogue strategy optimization
* Statistical parsing in dialogue systems
* Machine learning and context-sensitive speech recognition
* Learning and NLG in dialogue
* User simulation techniques for strategy learning and testing
* Corpora and annotation for machine learning approaches to SDS
* Machine learning for multimodal interaction
* Evaluation of statistical approaches in SDS
Session organizers:
Oliver Lemon, Edinburgh University School of Informatics
Olivier Pietquin, SUPELEC - Metz Campus, IMS Research Group, Metz

Speech and language technology for less-resourced languages

Two-hour Special Session at INTERSPEECH 2007, Antwerp, Belgium
Interspeech website
Special Session website
Speech and language technology researchers who work on less-resourced languages often have very limited access to funding, equipment and software.
This makes it all the more important for them to come together to share best practice, in order to avoid a duplication of effort. This special session will therefore be devoted to speech and language technology for less-resourced languages.
In view of the limited resources available to the targeted researchers, there will be a particular emphasis on "free" software, which may be either open-source or closed-source. However, submissions are also invited from those using commercial software.
Topics may include (but are not limited to) the following:
* Examples of systems built using free or purpose-built software (possibly with demonstrations).
* Presentations of bugs in free software, and strategies for dealing with them.
* Presentations of additions and enhancements made to the software by a research group.
* Presentations of linguistic challenges for a particular less-resourced language.
* Descriptions of desired features for possible future implementation.
Researchers who are interested in contributing to this special session are invited to submit either a paper or a demo or both, as follows.
1. Papers can be submitted by proceeding according to the regular submission procedure of Interspeech 2007 and selecting "Speech and language technology for less-resourced languages" as the topic of your first choice. The paper submission deadline is March 23, 2007.
2. We offer a light submission procedure for demos. (Please note: unlike regular papers, texts submitted with a demo will NOT be published in the proceedings, but will be made available for download from the SALTMIL website.) In this case, please submit a short description of the system demonstrated, the demo itself, any materials required for the demo, and references, to the first of the session organisers (see below) before April 27, 2007. Demo submission texts should be formatted in accordance with the Interspeech 2007 author kit, and should be between 1 and 4 pages in length.
Session organisers
Dr Briony Williams
Language Technologies Unit, Canolfan Bedwyr, University of Wales, Bangor, UK Email:
Dr Mikel Forcada
Departament de Llenguatges i Sistemes Informàtics, Universitat d'Alacant, E-03071 Alacant, Spain Email:
Dr Kepa Sarasola
Dept of Computer Languages, Univ of the Basque Country, PK 649 20080 Donostia, Basque Country, Spain Email:
Important dates
Four-page paper deadline: March 23, 2007
Demo submission deadline: April 27, 2007
Notification of acceptance: May 25, 2007
Early registration deadline: June 22, 2007 Main Interspeech conference: August 28-31, 2007


Synthesis of Singing Challenge

Special Session at INTERSPEECH 2007, Antwerp, Belgium
Tuesday afternoon, August 28, 2007
Webpage > Special Sessions > Synthesis of Singing Challenge
Organized by Gerrit Bloothooft, Utrecht University, The Netherlands
Singing is perhaps the most expressive use of the human voice and speech. An excellent singer, whether in classical opera, musical, pop, folk music, or any other style, can express a message and emotion so intensely that it moves and delights a wide audience. Synthesizing singing may therefore be considered the ultimate challenge to our understanding and modeling of the human voice. In this two-hour interactive special session of INTERSPEECH 2007 on synthesized singing, we hope to present an enjoyable demonstration of the current state of the art, and we challenge you to contribute!
The session will be special in many ways:
* Participants have to submit a composition of their own choice, and they have to produce their own version of a compulsory musical score.
* During the special session, each participant will be allowed to demonstrate the free and compulsory composition, with additional explanation.
* Each contribution will be commented on by a panel of synthesis experts and singers, and by the audience.
* Everyone will vote on evaluative statements, if possible using a voting-box system.
* The most preferred system will be allowed to play the demonstration during the closing session of the conference.
If you are interested in joining the challenge, you are invited to submit a paper on your system, including an example composition of your own choice (in .wav format), within the regular submission procedure of INTERSPEECH 2007, selecting "Synthesis of Singing Challenge" as the Special Session. The deadline is March 23, 2007.
We also offer a light submission procedure that will not result in a regular peer-reviewed paper in the Proceedings. In that case you can submit the composition of your own choice in .wav format to the session organizer (see below) before April 27, 2007. See the website for more details.
The composition may have a maximum duration of two minutes; no accompaniment is allowed. There are no restrictions with respect to the synthesis method used, which may range from synthesis of singing by rule, articulatory modelling, sinusoidal modelling, unit selection, to voice conversion (include the original in your two minutes demo as well).
All accepted contributors (notification on May 25) will be required to produce their own version of a musical score published by July 1, 2007. The corresponding sound file should be sent as a .wav file to the session organizer (see below) before August 21, 2007.
Session organiser:
Gerrit Bloothooft
UiL-OTS, Utrecht University, The Netherlands

Workshop on Speech in Mobile and Pervasive Environments

(in conjunction with ACM Mobile HCI '07)
September 9, 2007
Website
Organisers
* Amit A. Nanavati, IBM India Research Laboratory, India.
* Nitendra Rajput, IBM India Research Laboratory, India.
* Alexander I. Rudnicky, Carnegie Mellon University, USA.
* Markku Turunen, University of Tampere, Finland.
Programme Committee
* Abeer Alwan, UCLA, USA.
* Peter Boda, Nokia Research Center, Finland.
* Nick Campbell, ATR, Japan.
* Nobuo Hataoka, Tohoku Inst. of Tech., Japan.
* Matt Jones, Swansea University, UK.
* Gary Marsden, Univ. of Cape Town, South Africa.
* David Pearce, Motorola, UK.
* Shrikanth S. Narayanan, USC, USA.
* Yaxin Zhang, Motorola, China.
Traditionally, voice-based applications have been accessed using unintelligent telephone devices through Voice Browsers that reside on the server. With the proliferation of pervasive devices and the increase in their processing capabilities, client-side speech processing is emerging as a viable alternative. As in SiMPE 2006, we will further explore the various possibilities and issues that arise while enabling speech processing on resource-constrained, possibly mobile devices.
In particular, this year's theme will be SiMPE for developing regions. There are three compelling reasons for this:
(1) The penetration of mobile phones in emerging economies,
(2) The importance of speech for semi-literate and illiterate users, and,
(3) The completely novel HCI issues that arise when the target population is not tech savvy.
The workshop will highlight the many open areas that require research attention, identify key problems that need to be addressed, and also discuss a few approaches for solving some of them --- not only to build the next generation of conversational systems, but also to help create the next generation of IT users, thus extending the benefits of technology to a much wider populace.
Topics of Interest
All areas that enable, optimise or enhance speech in mobile and pervasive environments and devices. Possible areas include, but are not restricted to:
* Speech interfaces/applications for Developing Regions
* Multilingual Speech Recognition
* Robust Speech Recognition in Noisy and Resource-constrained Environments
* Memory/Energy Efficient Algorithms
* Multimodal User Interfaces for Mobile Devices
* Protocols and Standards for Speech Applications
* Distributed Speech Processing
* Mobile Application Adaptation and Learning
* Prototypical System Architectures
* User Modelling
Intended Audience
This burgeoning cross-disciplinary area requires that people from various disciplines -- speech, mobile applications, user interface design, solutions for emerging economies -- meet and discuss the way forward. It would be particularly relevant and interesting to have lively discussions between the two communities -- researchers working on technologies for developing regions and those working on SiMPE areas. We hope that a fruitful collaboration will take place: the former will articulate the needs of the population for the latter to address and jointly solve.
SiMPE 2006 was a common meeting ground to bring together small isolated groups of people working in SiMPE-related areas. To continue exchanges beyond SiMPE 2006, we created the SiMPE wiki, which currently has 27 participants. A follow-up in SiMPE 2007 will further strengthen this community and pave the way for more fruitful exchanges and collaborations. The focus on developing regions and its relevance to SiMPE is timely and compelling.
We invite position papers (up to 8 pages; shorter papers are also welcome). Electronic submission is required. Submissions should be formatted according to the ACM SIG style. All submissions should be in PDF (preferred) or PostScript format. If any of these requirements is a problem for you, please feel free to contact the workshop organisers.
We also welcome participation without paper submission.
Position papers must be submitted via the conference submission web site.
We are in the process of trying to publish the proceedings of the workshop as a special issue. For any comments regarding submissions and participation, contact the workshop organisers or see the website.
Key Dates
* Position Paper Submission Deadline: July 1, 2007 (11:59 PM Singapore Time)
* Notification of Acceptance: July 15, 2007
* Early Registration Deadline: July 31, 2007
* Workshop: 8:45AM -- 5:00PM, September 9, 2007.
* SiMPE Workshop
* ACM Mobile HCI '07


Borovets, Bulgaria
September 26, 2007
Workshop site
RANLP'2007 site
Several initiatives have been launched in the area of Computational Linguistics, Language Resources and Knowledge Representation both at the national and international level aiming at the development of resources and tools. Unfortunately, there are few initiatives that integrate these results within eLearning. The situation is slightly better with respect to the results achieved within Knowledge Representation since ontologies are being developed which describe not only the content of the learning material but crucially also its context and the structure. Furthermore, knowledge representation techniques and natural language processing play an important role in improving the adaptivity of learning environments even though they are not fully exploited yet. On the other hand, eLearning environments constitute valuable scenarios to demonstrate the maturity of computational linguistic methods as well as of natural language technologies and tools. This kind of task-based evaluation of resources, methods and tools is a crucial issue for the further development of language and information technology. The goal of this workshop is to discuss:
*the use of language and knowledge resources and tools in eLearning;
* requirements on natural language resources, standards, and applications originating in eLearning activities and environments;
* the expected added value of natural language resources and technology to learning environments and the learning process;
* strategies and methods for the task based evaluation of Natural Language Processing applications.
The workshop will bring together computational linguists, language resources developers, knowledge engineers, researchers involved in technology-enhanced learning as well as developers of eLearning material, ePublishers and eLearning practitioners. It will provide a forum for interaction among members of different research communities, and a means for attendees to increase their knowledge and understanding of the potential of computational resources in eLearning.
Topics of interest include, but are not limited to:
* ontology modelling in the eLearning domain;
* Natural Language Processing techniques for supplying metadata for learning objects on a (semi)-automatic basis, e.g. for the automatic extraction of key terms and their definitions;
* techniques for summarization of discussion threads and support of discourse coherence in eLearning;
* improvements on (semantic, cross-lingual) search methods in learning environments;
* techniques for matching the semantic representation of learning objects with the user's knowledge in order to support personalized and adaptive learning;
* adaptive information filtering and retrieval (content-based filtering and retrieval, collaborative filtering);
* intelligent tutoring (curriculum sequencing, intelligent solution analysis, problem solving support);
* intelligent collaborative learning (adaptive group formation and peer help, adaptive collaboration).
Submissions by young researchers are especially welcome.
* Format. Authors are invited to submit full papers on original, unpublished work in the topic area of this workshop. Papers should be submitted as a PDF file, formatted according to the RANLP 2007 stylefiles and not exceeding 8 pages. The RANLP 2007 stylefiles are available at:
* Demos. Submissions of demos are also welcome. Papers submitted as demos should not exceed 4 pages and should describe extensively the system to be presented.
* Submission procedure. Submission of papers will be handled using the START system, through the RANLP Conference. Specific submission guidelines will be posted on the workshop site shortly.
* Reviewing. Each submission will be reviewed by at least two members of the Program Committee.
* Accepted papers policy. Accepted papers will be published in the workshop proceedings. By submitting a paper to the workshop the authors agree that, in case the paper is accepted for publication, at least one of the authors will attend the workshop; all workshop participants are expected to pay the RANLP-2007 workshop registration fee.
Paper submission deadline: June 15, 2007
Paper acceptance notification: July 25, 2007
Camera-ready papers due: August 31, 2007
Workshop date: September 26, 2007
Keynote speakers will be announced shortly before the workshop.
Paola Monachesi University of Utrecht, The Netherlands
Lothar Lemnitzer University of Tuebingen, Germany
Cristina Vertan University of Hamburg, Germany
Dr. Cristina Vertan
Natural Language Systems Division
Computer Science Department
University of Hamburg
Vogt-Koelln-Str. 30
22527 Hamburg GERMANY
Tel. 040 428 83 2519
Fax. 040 428 83 2515

International Conference: "Where Do Features Come From? Phonological Primitives in the Brain, the Mouth, and the Ear"

Universite Paris-Sorbonne (1, rue Victor Cousin 75230 Paris cedex)
Deadline: May 6, 2007!
Speech sounds are made up of atomic units termed "distinctive features", "phonological features" or "phonetic features", according to the researcher. These units, which have achieved a notable success in the domain of phonological description, may also be central to the cognitive encoding of speech, which allows the variability of the acoustic signal to be related to a small number of categories relevant for the production and perception of spoken languages. In spite of the fundamental role that features play in current linguistics, current research continues to raise many basic questions concerning their cognitive status, their role in speech production and perception, the relation they have to measurable physical properties in the articulatory and acoustic/auditory domains, and their role in first and second language acquisition. The conference will bring together researchers working in these and related areas in order to explore how features originate and how they are cognitively organized and phonetically implemented. The aim is to assess the progress made and future directions to take in this interdisciplinary enterprise, and to provide researchers and graduate students from diverse backgrounds with a stimulating forum for discussion.
How to submit
Authors are invited to submit an anonymous two-page abstract (in English or French) by April 30, 2007 to Rachid Ridouane, accompanied by a separate page stating the name(s) of the author(s), contact information, and a preference for oral paper vs. poster presentation. Contributions presenting new experimental results are particularly welcome. Notification e-mails will be sent out by June 15, 2007. Publication of selected papers is envisaged.
Conference topics include, but are not limited to:
Phonetic correlates of distinctive features
Acoustic-articulatory modeling of features
Quantal definitions of distinctive features
Role of subglottal and/or side-cavity resonances in defining feature boundaries
Auditory/acoustic cues to acoustic feature correlates
Visual cues to distinctive features
Within- and across-language variability in feature realization
Enhancement of weak feature contrasts
Phonological features and speech motor commands
Features and the mental lexicon
Neurological representation of features
Features in early and later language acquisition
Features in the perception and acquisition of non-native languages
Features in speech disorders

The two-day conference (October 4-5, 2007) will consist of four invited talks, four half-day sessions of oral presentations (30 minutes including discussion), and one or two poster sessions.
Important dates
April 30, 2007 Submission deadline
June 15, 2007 Acceptance notification date
October 4-5, 2007 Conference dates
Rachid Ridouane (Laboratory of Phonetics and Phonology, Paris)
Nick Clements (Laboratory of Phonetics and Phonology, Paris)
Rachid Ridouane
This conference is funded by the French Ministry of Research under the programme "Action Concertee Incitative PROSODIE" (a programme supporting innovation and excellence in the human and social sciences).

3rd Language and Technology Conference (LTC2007): Human Language Technologies as a Challenge for Computer Science and Linguistics

October 5-7, 2007,
Faculty of Mathematics and Computer Science of the Adam Mickiewicz University,
Poznan, Poland,
The conference program will include the following topics:
* electronic language resources and tools
* formalisation of natural languages
* parsing and other forms of NL processing
* computer modelling of language competence
* NL user modelling
* NL understanding by computers
* knowledge representation
* man-machine NL interfaces
* Logic Programming in Natural Language Processing
* speech processing
* NL applications in robotics
* text-based information retrieval and extraction, question answering
* tools and methodologies for developing multilingual systems
* translation enhancement tools
* methodological issues in HLT
* prototype presentations
* intractable language-specific problems in HLT (for languages other than English)
* HLT standards
* HLT as foreign language teaching support
* new challenge: communicative intelligence
* vision papers in the field of HLT
* HLT related policies
This list is not closed and we are open to further proposals.
The Program Committee is also open to suggestions concerning accompanying events (workshops, exhibits, panels, etc). Suggestions, ideas and observations may be addressed directly to the LTC Chair.
Deadline for submission of papers for review - May 20, 2007
Acceptance/Rejection notification - June 15, 2007
Submission of final versions of accepted papers - July 15, 2007
Further details will be available soon. The call for papers will be distributed by mail and published on the conference site. The site currently contains information about LTC’05, including freely downloadable abstracts of the papers presented.
Zygmunt Vetulani
LTC’07 Chair

2007 IEEE International Conference on Signal Processing and Communications, United Arab Emirates

24–27 November 2007
Dubai, United Arab Emirates
The IEEE International Conference on Signal Processing and Communications (ICSPC 2007) will be held in Dubai, United Arab Emirates (UAE) on 24–27 November 2007. ICSPC will be a forum for scientists, engineers, and practitioners throughout the Middle East region and the world to present their latest research results, ideas, developments, and applications in all areas of signal processing and communications. It aims to strengthen relations between industry, research laboratories and universities. ICSPC 2007 is organized by the IEEE UAE Signal Processing and Communications Joint Societies Chapter. The conference will include keynote addresses, tutorials, exhibitions, and special, regular and poster sessions. All papers will be peer reviewed. Accepted papers will be published in the conference proceedings and will be included in IEEE Xplore. Acceptance will be based on quality, relevance and originality.
Topics will include, but are not limited to, the following:
• Digital Signal Processing
• Analog and Mixed Signal Processing
• Audio/Speech Processing and Coding
• Image/Video Processing and Coding
• Watermarking and Information Hiding
• Multimedia Communication
• Signal Processing for Communication
• Communication and Broadband Networks
• Mobile and Wireless Communication
• Optical Communication
• Modulation and Channel Coding
• Computer Networks
• Computational Methods and Optimization
• Neural Systems
• Control Systems
• Cryptography and Security Systems
• Parallel and Distributed Systems
• Industrial and Biomedical Applications
• Signal Processing and Communications Education
Prospective authors are invited to submit full-length (4 pages) paper proposals for review. Proposals for tutorials, special sessions, and exhibitions are also welcome. The submission procedures can be found on the conference web site:
All submissions must be made on-line and must follow the guidelines given on the web site.
ICSPC 2007 Conference Secretariat,
P. O. Box: 573, Sharjah, United Arab Emirates (U.A.E.),
Fax: +971 6 5611789
Honorary Chair
Arif Al-Hammadi, Etisalat University College, UAE
General Chair
Mohammed Al-Mualla, Etisalat University College, UAE
Submission of proposals for tutorials, special sessions, and exhibitions March 5th, 2007
Submission of full-paper proposals April 2nd, 2007
Notification of acceptance June 4th, 2007
Submission of final version of paper October 1st, 2007

5th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications MAVEBA 2007

December 13 - 15, 2007
Conference Hall - Ente Cassa di Risparmio di Firenze
Via F. Portinari 5r, Firenze, Italy
EXTENDED DEADLINE: 15 June 2007 - Submission of extended abstracts (1-2 pages, 1 column) and special session proposals
30 July 2007 - Notification of paper acceptance
30 September 2007 - Final full paper submission (4 pages, 2 columns, PDF format) and early registration
13-15 December 2007 - Conference dates
Dr. Claudia Manfredi - Conference Chair
Dept. of Electronics and Telecommunications
Universita degli Studi di Firenze
Via S. Marta 3
50139 Firenze, Italy
Phone: +39-055-4796410
Fax: +39-055-494569


- XXVIIemes Journees d'Etude sur la Parole (JEP'08)
- 15eme conference sur le Traitement Automatique des Langues Naturelles (TALN'08)
- 10eme Rencontre des Etudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL'08)

Universite d'Avignon et des Pays de Vaucluse
Avignon, 9-13 June 2008
Website
Submission deadline: 11 February 2008
Notification to authors: 28 March 2008
Conference: 9-13 June 2008
Organised by the LIA (Laboratoire Informatique d'Avignon), JEP-TALN-RECITAL'08 brings together the 27th edition of the Journees d'Etude sur la Parole (JEP'08), the 15th edition of the conference on Traitement Automatique des Langues Naturelles (TALN'08), and the 10th edition of the Rencontres des Etudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL'08).
For the third time, after Nancy in 2002 and Fes in 2004, the AFCP (Association Francophone pour la Communication Parlee) and ATALA (Association pour le Traitement Automatique des Langues) are jointly organising their main conferences in order to gather the spoken and written language processing communities in a single venue.
The plenary sessions will be common to the three conferences, as will one thematic oral session.
Calls for papers specifying the topics and the submission procedures for each conference will follow this first announcement.