Contents

1 . Editorial

Dear Members,

The Board has taken an important decision: INTERSPEECH 2011 will take place in Florence, Italy. I am sure that it will be successful and will attract many of you to this wonderful city, the cradle of the European Renaissance.

Meanwhile, life goes on. You have to prepare your trip to Brisbane. But do not forget all the appealing workshops listed below.

I still receive interesting job offers: I draw the attention of our young members to the possibilities of thesis funding and postdoc positions.

We are still working to improve ISCApad with the efficient help of Laurence Liu, a student of Helen Meng's from Hong Kong.

Please pay attention to our section ISCA  News: the association needs your help.

Prof. em. Chris Wellekens

Institut Eurecom

France 

 

Back to Top

2 . ISCA News

2-1 . ISCA Scientific Achievement Medalist 2008

It is with great pleasure that I announce the ISCA Medalist for 2008: Hiroya Fujisaki. Prof. Fujisaki has contributed to the speech research community in so many aspects, in speech analysis, synthesis and prosody, that it would be a very hard task for me to summarize his long list of achievements. He is also the founder of the ICSLP series of conferences which, being now fully integrated as one of ISCA's yearly conferences, will have its 10th anniversary this year.

Back to Top

2-2 . INTERSPEECH 2011 in Florence

ISCA announces with great pleasure that the venue for
Interspeech 2011 will be FLORENCE.

Back to Top

2-3 . Help ISCA serve you better

The ISCA board is always interested in improving its activities and the membership services it provides. To help us with this, could you please send us your ideas/comments/suggestions/impressions? We would be most grateful if you could take a moment to complete the form on the ISCA website: http://www.isca-speech.org/index.php and send us your feedback.

Your message will be sent to the ISCA secretariat: secretariat@isca-speech.org

Please enter ideas/comments/suggestions/impressions you may have on any new (or old) activities and membership services.

Please note: you can send us your comments anonymously, if you so wish.

Eva Hajicova - Membership Services 

 Emmanuelle Foxonet - ISCA Secretariat  for the ISCA board 

Back to Top

3 . SIG's activities

3-1 . SLaTE

The International Speech Communication Association Special Interest Group (ISCA SIG) on

Speech and Language Technology in Education

The special interest group was created in mid-September 2006 at the Interspeech 2006 conference in Pittsburgh. Information about the SIG can be found on its official website.

The next SLaTE ITRW will be in 2009 in England.

OUR STATEMENT OF PURPOSE

The purpose of the International Speech Communication Association (ISCA) Special Interest Group on Speech and Language Technology in Education (SLaTE) shall be to promote interest in the use of speech and natural language processing for education; to provide members of ISCA with a special interest in speech and language technology in education with a means of exchanging news of recent research developments and other matters of interest in Speech and Language Technology in Education; to sponsor meetings and workshops on that subject that appear to be timely and worthwhile, operating within the framework of ISCA's by-laws for SIGs; and to provide and make available resources relevant to speech and language technology in education, including text and speech corpora, analysis tools, analysis and generation software, research papers and generated data.

Back to Top

4 . Future ISCA Conferences and Workshops (ITRW)

4-1 . INTERSPEECH 2008

INTERSPEECH 2008 incorporating SST 08

September 22-26, 2008

Brisbane Convention & Exhibition Centre

Brisbane, Australia

http://www.interspeech2008.org/

 

Interspeech is the world's largest and most comprehensive conference on Speech Science and Speech Technology. We invite original papers in any related area, including (but not limited to):

  • Human Speech Production, Perception and Communication
  • Speech and Language Technology
  • Spoken Language Systems
  • Applications, Resources, Standardisation and Evaluation

In addition, a number of Special Sessions on selected topics have been organised and we invite you to submit for these also (see the website for a complete list).

Interspeech 2008 has two types of submission formats: Full 4-page Papers and Short 1-page Papers. Prospective authors are invited to submit papers in either format via the conference website by 7 April 2008.

Important Dates

Paper Submission: Monday, 7 April 2008, 3pm GMT
Notification of Acceptance/Rejection: Monday, 16 June 2008, 3pm GMT
Early Registration Deadline: Monday, 7 July 2008, 3pm GMT
Tutorial Day: Monday, 22 September 2008
Main Conference: 23-26 September 2008

For more information please visit the website http://www.interspeech2008.org

Chairman: Denis Burnham, MARCS, University of Western Sydney.

Back to Top

4-2 . INTERSPEECH 2009

Brighton, UK,
Conference Website
Chairman: Prof. Roger Moore, University of Sheffield.

Back to Top

4-3 . INTERSPEECH 2010

Chiba, Japan
Conference Website
ISCA is pleased to announce that INTERSPEECH 2010 will take place in Makuhari-Messe, Chiba, Japan, September 26-30, 2010. The event will be chaired by Keikichi Hirose (Univ. Tokyo), and will have as a theme "Towards Spoken Language Processing for All - Regardless of Age, Health Conditions, Native Languages, Environment, etc."

Back to Top

4-4 . ITRW on experimental linguistics

August 2008, Athens, Greece
Website
Prof. Antonis Botinis


Back to Top

4-5 . International Conference on Auditory-Visual Speech Processing AVSP 2008

Dates: 26-29 September 2008

Location: Moreton Island, Queensland, Australia
Website: http://express.hid.ri.cmu.edu/AVSP2008/Main.html

AVSP 2008 will be held as an ISCA Tutorial and Research Workshop at
Tangalooma Wild Dolphin Resort on Moreton Island from 26-29 September
2008. AVSP 2008 is a satellite conference to Interspeech 2008, being
held in Brisbane from 22-26 September 2008. Tangalooma is located a
short distance from Brisbane, so attendance at AVSP 2008 can easily be
combined with participation in Interspeech 2008.

Auditory-visual speech production and perception by humans and machines
is an interdisciplinary and cross-linguistic field which has attracted
speech scientists, cognitive psychologists, phoneticians, computational
engineers, and researchers in language learning studies. Since the
inaugural workshop in Bonas in 1995, Auditory-Visual Speech Processing
workshops have been organised on a regular basis (see an overview at the
AVISA website). In line with previous meetings, this conference will
consist of a mixture of regular presentations (both posters and oral)
and lectures by invited speakers.

Topics include but are not limited to:
- Machine recognition
- Human and machine models of integration
- Multimodal processing of spoken events
- Cross-linguistic studies
- Developmental studies
- Gesture and expression animation
- Modelling of facial gestures
- Speech synthesis
- Prosody
- Neurophysiology and neuro-psychology of audition and vision
- Scene analysis

Paper submission:
Details of the paper submission procedure will be available on the
website in a few weeks' time.

Chairs:
Simon Lucey
Roland Goecke
Patrick Lucey


Back to Top

4-6 . Christian Benoit workshop on Speech and Face to Face Communication

NEW deadline for sending one-page abstracts: June 9th


Ten years after our colleague Christian Benoît passed away, the mark that
he left is still very vivid in the international community. There will
soon be several occasions to honour his memory: during the next
Interspeech conference (Christian was secretary of ESCA, the future
ISCA, for a long time; the association is a French association of the
type described in the 1901 law, and its official headquarters are still
in Grenoble), as well as during the next AVSP workshop (a workshop of
which he was one of the creators). The Christian Benoît Association was
created in 1999 and regularly awards young researchers the "Christian
Benoît prize" to promote their research (the 4th prize was awarded to
the phonetician Susanne Fuchs in 2007). The Christian Benoît Association
(http://www.icp.inpg.fr/ICP/_communication.fr.html#prixcb), along with
ICP, now the Speech and Cognition Department of Gipsa-lab
(http://www.gipsa-lab.inpg.fr), is organizing a workshop/summer school
in Christian Benoît's memory, in the line of his innovative and
enthusiastic research style, aiming to explore the topic of "Speech and
Face to Face Communication" from a pluridisciplinary perspective:
neuroscience, cognitive psychology, phonetics, linguistics and computer
modelling. The workshop "Speech and Face to Face Communication" will be
organized around 11 invited lectures. All researchers in the field are
invited to participate through a call for papers, and students are
encouraged to attend the workshop and present their work.

Website: http://www.icp.inpg.fr/~dohen/face2face/

Deadline for sending one-page abstracts: June 9th (see the Call for
Papers: http://www.icp.inpg.fr/%7Edohen/face2face/CallForPapers.html)

You can subscribe to the Christian Benoît Association by sending 15
euros (active member; benefactors: 45 euros or more) to Pascal Perrier,
secretary of the association: Pascal.Perrier@gipsa-lab.inpg.fr.

Back to Top

4-7 . CfP Second IEEE Spoken Language Technology Workshop Goa

Call for Papers:
Second IEEE Spoken Language Technology Workshop
Goa, India
December 15-18, 2008

The Second IEEE Spoken Language Technology (SLT) workshop will be held from December 15 to December 18, 2008 in Goa, India. The goal of this workshop is to bring the speech processing and natural language processing communities together to share and present recent advances in various areas of spoken language technology, with the expectation that such a confluence of researchers from both communities will foster new ideas, collaborations and new research directions in this area. The SLT 2008 workshop is endorsed by both ISCA and ACL, and eligible participants can apply for ISCA grants (http://www.isca-speech.org/grants.html).

Spoken language technology is a vibrant research area, with the potential for significant impact on government and industrial applications especially with the diversity and challenges offered by the multilingual business climates of today's world.

The workshop solicits papers on all aspects of spoken language technology:

 o Spoken language understanding
 o Spoken document summarization
 o Machine translation for speech
 o Spoken dialog systems
 o Spoken language generation
 o Spoken document retrieval
 o Human computer Interactions (HCI)
 o Speech data mining
 o Information extraction from speech
 o Question answering from speech
 o Multimodal processing
 o Spoken language based assistive technologies
 o Spoken language systems and applications
 o Spoken language databases and standards

In addition, this year's workshop will feature three special sessions:

 1) Challenges in Asian spoken language processing with special emphasis on Indian languages
 2) Mining human-human conversations: A resource for building efficient human-machine dialogs
 3) Spoken Language on the go: Challenges and Opportunities for spoken language processing on mobile devices

Submissions for the Technical Program
-------------------------------------
The workshop program will consist of tutorials, oral and poster presentations, and panel discussions. Attendance will be limited with priority for those who will present technical papers; registration is required of at least one author for each paper. Submissions are encouraged on any of the topics listed above. The style guide, templates, and submission form will follow the IEEE ICASSP style. Three members of the Scientific Committee will review each paper. The workshop proceedings will be published on a CD-ROM.

Important Dates
---------------
Camera-ready paper submission deadline: August 8, 2008
Hotel Reservation and Workshop registration opens: August 8, 2008
Paper Acceptance / Rejection: September 15, 2008
Hotel Reservation and Early Registration closes: October 5, 2008
Workshop: December 15-18, 2008

For more information visit the SLT 2008 website http://slt2008.org or contact the organizing committee at info@slt2008.org if you have any questions.

Back to Top

5 . Books, databases and software

5-1 . Books

La production de la parole
Author: Alain Marchal, Universite d'Aix en Provence, France
Publisher: Hermes Lavoisier
Year: 2007

Speech Enhancement: Theory and Practice
Author: Philipos C. Loizou, University of Texas, Dallas, USA
Publisher: CRC Press
Year: 2007

Speech and Language Engineering
Editor: Martin Rajman
Publisher: EPFL Press, distributed by CRC Press
Year: 2007

Human Communication Disorders / Speech Therapy
This interesting series is listed on the Wiley website.

Incursões em torno do ritmo da fala
Author: Plinio A. Barbosa
Publisher: Pontes Editores (city: Campinas)
Year: 2006 (released 11/24/2006)
(In Portuguese, abstract attached.) Website

Speech Quality of VoIP: Assessment and Prediction
Author: Alexander Raake
Publisher: John Wiley & Sons, UK-Chichester, September 2006
Website

Self-Organization in the Evolution of Speech, Studies in the Evolution of Language
Author: Pierre-Yves Oudeyer
Publisher: Oxford University Press
Website

Speech Recognition Over Digital Channels
Authors: Antonio M. Peinado and Jose C. Segura
Publisher: Wiley, July 2006
Website

Multilingual Speech Processing
Editors: Tanja Schultz and Katrin Kirchhoff
Publisher: Elsevier Academic Press, April 2006
Website

Reconnaissance automatique de la parole: Du signal a l'interpretation
Authors: Jean-Paul Haton, Christophe Cerisara, Dominique Fohr, Yves Laprie, Kamel Smaili
392 pages
Publisher: Dunod

 

Automatic Speech Recognition on Mobile Devices and over Communication Networks
Editors: Zheng-Hua Tan and Børge Lindberg
Publisher: Springer, London, March 2008
Website: http://asr.es.aau.dk/
 
About this book
The remarkable advances in computing and networking have sparked an enormous interest in deploying automatic speech recognition on mobile devices and over communication networks. This trend is accelerating. This book brings together leading academic researchers and industrial practitioners to address the issues in this emerging realm and presents the reader with a comprehensive introduction to the subject of speech recognition in devices and networks. It covers network, distributed and embedded speech recognition systems, which are expected to co-exist in the future. It offers a wide-ranging, unified approach to the topic and its latest development, also covering the most up-to-date standards and several off-the-shelf systems.
 
Latent Semantic Mapping: Principles & Applications
Author: Jerome R. Bellegarda, Apple Inc., USA
Publisher: Morgan & Claypool
Series: Synthesis Lectures on Speech and Audio Processing
Year: 2007
Website: http://www.morganclaypool.com/toc/sap/1/1
 

The Application of Hidden Markov Models in Speech Recognition
By Mark Gales and Steve Young (University of Cambridge)
In Foundations and Trends in Signal Processing (FnTSIG), www.nowpublishers.com/SIG
http://dx.doi.org/10.1561/2000000004
 
 
Proceedings of the IEEE
 
Special Issue on ADVANCES IN MULTIMEDIA INFORMATION RETRIEVAL
 
Volume 96, Number 4, April 2008
 
Guest Editors:
 
Alan Hanjalic, Delft University of Technology, Netherlands
Rainer Lienhart, University of Augsburg, Germany
Wei-Ying Ma, Microsoft Research Asia, China
John R. Smith, IBM Research, USA
 
Through carefully selected, invited papers written by leading authors and research teams, the April 2008 issue of Proceedings of the IEEE (v.96, no.4) highlights successes of multimedia information retrieval research, critically analyzes the achievements made so far and assesses the applicability of multimedia information retrieval results in real-life scenarios. The issue provides insights into the current possibilities for building automated and semi-automated methods as well as algorithms for segmenting, abstracting, indexing, representing, browsing, searching and retrieving multimedia content in various contexts. Additionally, future challenges that are likely to drive the research in the multimedia information retrieval field for years to come are also discussed.
 
 
 Computeranimierte Sprechbewegungen in realen Anwendungen
Authors: Sascha Fagel and Katja Madany
102 pages
Publisher: Berlin Institute of Technology
Year: 2008
Website http://www.ub.tu-berlin.de/index.php?id=1843

Usability of Speech Dialog Systems: Listening to the Target Audience
Series: Signals and Communication Technology
Editor: Thomas Hempel
Year: 2008, X, 175 p., 14 illus., Hardcover
ISBN: 978-3-540-78342-8

Speech and Language Processing, 2nd Edition
Authors: Daniel Jurafsky and James H. Martin
Publisher: Prentice Hall, May 16, 2008 (copyright 2009)
Pages: 1024
ISBN-10: 0-13-187321-0
ISBN-13: 978-0-13-187321-6

An explosion of Web-based language techniques, merging of distinct fields, availability of phone-based dialogue systems, and much more make this an exciting time in speech and language processing. The first of its kind to thoroughly cover language technology – at all levels and with all modern technologies – this book takes an empirical approach to the subject, based on applying statistical and other machine-learning algorithms to large corpora. KEY TOPICS: Builds each chapter around one or more worked examples demonstrating the main idea of the chapter, using the examples to illustrate the relative strengths and weaknesses of various approaches. Adds coverage of statistical sequence labeling, information extraction, question answering and summarization, advanced topics in speech recognition, and speech synthesis. Revises coverage of language modeling, formal grammars, statistical parsing, machine translation, and dialog processing. MARKET: A useful reference for professionals in any of the areas of speech and language processing.

  

 
Back to Top

5-2 . LDC News

In this month's newsletter, the Linguistic Data Consortium (LDC) would like to introduce our new Membership Mailbag series of newsletter articles and announce the availability of three new publications: Chinese Proposition Bank 2.0 (LDC2008T07), Hindi WordNet (LDC2008L02) and West Point Brazilian Portuguese Speech (LDC2008S04).

Membership Mailbag - 'Penn' Treebanks and Recent Directions in English Treebanking
 

The LDC Membership Office responds to over 4000 emailed queries a year, and, over time, we've noticed that some questions tend to crop up with regularity.  To address the questions that you, our data users, have asked, we'd like to introduce our new Membership Mailbag series of newsletter articles.  This month we will look into the differences between the 'Penn' Treebanks and review recent directions in English treebanking.

Treebank-2 and Treebank-3 both contain 1 million words of Wall Street Journal (WSJ) text and a small sample of ATIS-3 data annotated in the Treebank II style, plus a part-of-speech tagged version of the Brown corpus. Treebank-3 is considered a superset of Treebank-2; if you are undecided between the two, in most instances the best choice is Treebank-3. Treebank-3 corrects known technical errors in Treebank-2, and it adds Switchboard data which has been tagged and dysfluency-annotated, as well as a small portion of the Brown corpus parsed in the Treebank II style.

Note, however, that a few items found in Treebank-2 are missing from Treebank-3. Treebank-2 contains the complete parsed Brown corpus in the older Treebank I style; Treebank-3 does not. Also, Treebank-3 does not include the tgrep software for extracting data, but tgrep and a newer version, tgrep2, are freely available online. Finally, Treebank-3 does not contain the raw Wall Street Journal (WSJ) text, but organizations can obtain this by request.
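
For readers new to treebank data, here is a minimal sketch of what a Treebank II-style bracketed parse looks like and how it can be queried in Python with the NLTK toolkit; the sample sentence is illustrative, not drawn from the corpora, and the nltk package is assumed to be installed:

from nltk.tree import Tree

# An illustrative Treebank II-style bracketed parse (not corpus data)
parse = Tree.fromstring(
    "(S (NP-SBJ (DT The) (NN committee)) "
    "(VP (VBD approved) (NP (DT the) (NN proposal))))")

# Part-of-speech pairs, as in the tagged portions of the corpora
print(parse.pos())  # [('The', 'DT'), ('committee', 'NN'), ...]

# Enumerate noun phrases -- the kind of query tgrep/tgrep2 answers
# over a whole corpus
for np in parse.subtrees(lambda t: t.label().startswith("NP")):
    print(np)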

Much recent treebanking has focused on languages other than English, but English treebanking efforts did not come to an end with the release of Treebank-3. Ongoing work uses an updated Treebank II annotation style and consists of two types of annotation: straight treebanking and treebanking in combination with another kind of annotation. Straight treebank annotation can be found in corpora such as English Chinese Translation Treebank v 1.0 and English-Arabic Treebank v 1.0. In these corpora, the Chinese or Arabic source texts have been translated into English, then POS-tagged and treebanked, thus making them suitable for machine translation work as well. Additional translation treebanks are planned for release and will feature cleaner translation and contain substantially more data.

Corpora which combine treebanking with another type of annotation include the English Conversational Telephone Speech Treebank with Structural Metadata, to be released later this year. This treebank is annotated for structural metadata, including fillers, disfluencies and sentence/semantic units, and is also tagged for syntactic structure, so it can be used to evaluate the impact of metadata extraction (MDE) on parsing. While these newer releases are smaller than the Penn Treebanks, the improved Treebank II annotation style has a very high rate of inter-annotator agreement. Additionally, the source texts are more varied in both domain and style than the WSJ texts that constitute the bulk of the Penn Treebank.

Got a question about LDC data? Forward it to ldc@ldc.upenn.edu. The answer may appear in a future Membership Mailbag article.


New Publications


(1) Chinese Proposition Bank 2.0 (CPB2.0) is a continuation of the Chinese Proposition Bank project, which aims to create a corpus of Chinese text annotated with information about basic semantic propositions. Chinese Proposition Bank 1.0 consists of predicate-argument annotation on 250,000 words from Chinese Treebank 5.0. Chinese Proposition Bank 2.0 adds predicate-argument annotation on 500,000 words from Chinese Treebank 6.0. The data sources include newswire from Xinhua News Agency, articles from Sinorama Magazine, news from the website of the Hong Kong Special Administrative Region and transcripts from various Chinese broadcast news programs.

This release contains the predicate-argument annotation of 81,009 verb instances (11,171 unique verbs) and 14,525 noun instances (1,421 unique nouns). The annotation of nouns is limited to nominalizations that have a corresponding verb. The general annotation guidelines and the lexical guidelines (called frame files) for each verbal and nominal predicate are included in this release.  Chinese Proposition Bank 2.0 is distributed via web download.
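
To make the annotation concrete, here is a rough sketch of how a predicate-argument annotation of this kind can be represented in Python; the field names and the English example are purely illustrative and do not reflect the actual CPB file format:

# Purely illustrative representation of one annotated predicate instance;
# not the Chinese Proposition Bank file format.
annotation = {
    "predicate": "award.01",        # verb sense, as defined in a frame file
    "arguments": {
        "ARG0": "the committee",    # the giver
        "ARG1": "the first prize",  # the thing given
        "ARG2": "the student",      # the recipient
    },
}
for role, text in annotation["arguments"].items():
    print(role, "->", text)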

2008 Subscription Members will automatically receive two copies of this corpus on disc. 2008 Standard Members may request a copy as part of their 16 free membership corpora. Nonmembers may license this data for US$850.

*

(2)  Hindi WordNet was developed by researchers at the Center for Indian Language Technology, Computer Science and Engineering Department, IIT Bombay.  Wordnets are systems for analyzing the different lexical and semantic relations between words. Specifically, a wordnet is a word sense network in which words are grouped into semantically equivalent units called synsets. Each synset represents a lexical concept, and synsets are linked to each other by semantic relations (between synsets) and lexical relations (between words). Similar in design to the Princeton Wordnet for English, Hindi Wordnet incorporates additional features to capture the complexities of Hindi. This release of Hindi Wordnet consists of 56,928 unique words and 26,208 synsets.

Additional information about the development of Hindi Wordnet is available at the Hindi WordNet web site.

Hindi WordNet contains nouns, verbs, adjectives and adverbs. Each entry consists of the following elements:

1.      Synset: a set of synonymous words. The words in the synset are arranged according to the frequency of usage.

2.      Gloss: the concept. It consists of two parts:

Text definition: explains the concept denoted by the synset. 

Example sentence: gives the usage of the words in the sentence.

3.      Position in Ontology: An ontology is a hierarchical organization of concepts, or more specifically, a categorization of entities and actions. A separate ontological hierarchy exists for each syntactic category (noun, verb, adjective, adverb). Each synset is mapped to some place in the ontology (a sketch of this entry structure follows below).
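
As a rough illustration, the entry structure described above could be modelled as follows; the field names are hypothetical and do not correspond to the actual release, which ships as a Java application with its own API:

from dataclasses import dataclass

# Hypothetical field names, for illustration only
@dataclass
class Synset:
    words: list        # synonymous words, ordered by frequency of usage
    definition: str    # gloss, part 1: text definition of the concept
    example: str       # gloss, part 2: example sentence showing usage
    category: str      # noun, verb, adjective or adverb
    ontology_node: str # position in the per-category ontology

school = Synset(
    words=["vidyalaya", "pathshala"],  # transliterated, for illustration
    definition="an institution where instruction is given",
    example="The children walk to the vidyalaya every morning.",
    category="noun",
    ontology_node="noun > place > institution",
)
print(school.words[0], "-", school.definition)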

This release of Hindi WordNet is made available as a complete Java application along with an API to facilitate further development.  Hindi WordNet is distributed via web download. 

2008 Subscription Members will automatically receive two copies of this corpus on disc, provided that they have submitted a signed copy of the User License Agreement for Hindi WordNet (LDC2008L02).  2008 Standard Members may request a copy as part of their 16 free membership corpora. Nonmembers may license this data for US$300.

*

(3) West Point Brazilian Portuguese Speech is a database of digital recordings of spoken Brazilian Portuguese designed and collected by staff and faculty of the Department of Foreign Languages (DFL) and Center for Technology Enhanced Language Learning (CTELL) to develop acoustic models for speech recognition systems. The U.S. government uses such systems to provide speech-recognition enhanced language learning course ware to government linguists and students enrolled in various government language programs.

The data in this corpus was collected in March 1999 in Brasilia, Brazil using informants from a Brazilian military academy. The corpus consists of read speech from 60 female and 68 male native and non-native speakers.  The speech was elicited from a prompt script containing 296 sentences and phrases typically used in language learning situations.

The speech was collected using four laptop computers running MS Windows. Three of the computers recorded with a 16 bit data size and sampling rate of 22050 Hz, the other laptop recorded with an 8 bit data size at a sampling rate of 11025 Hz. The recording script presented a visual display of the sentence to be recorded. The informant pressed a key and spoke the sentence. The recording was played back for review, allowing the utterance to be re-recorded. West Point Brazilian Portuguese Speech is distributed on one DVD-ROM.
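
If the recordings are distributed as standard WAV files, the sample rate and bit depth quoted above can be verified with Python's standard library alone (the file name below is hypothetical):

import wave

# Hypothetical file name; any recording in the corpus can be inspected this way
with wave.open("speaker001_utt001.wav", "rb") as w:
    print("channels:    ", w.getnchannels())
    print("sample width:", w.getsampwidth(), "bytes")  # 2 = 16-bit, 1 = 8-bit
    print("sample rate: ", w.getframerate(), "Hz")     # 22050 or 11025 here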

2008 Subscription Members will automatically receive two copies of this corpus. 2008 Standard Members may request a copy as part of their 16 free membership corpora. Nonmembers may license this data for US$500.
Back to Top

5-3 . Question Answering on speech transcripts (QAst)

The QAst organizers are pleased to announce the release of the development dataset for
the CLEF-QA 2008 track "Question Answering on Speech Transcripts" (QAst).
We take this opportunity to launch a first call for participation in
this evaluation exercise.

QAst is a CLEF-QA track that aims at providing an evaluation framework
for QA technology on speech transcripts, both manual and automatic.
A detailed description of this track is available at:
http://www.lsi.upc.edu/~qast

This is the second evaluation for the QAst track. Last year (QAst 2007),
factual questions were generated for two distinct corpora (in English
only). This year, in addition to factual questions, some definition
questions are generated, and five corpora covering three different
languages are used (3 corpora in English, 1 in Spanish and 1 in French).

Important dates:

# 15 June 2008: evaluation set released
# 30 June 2008: submission deadline

The track is organized jointly by the Technical University of
Catalonia (UPC), the Evaluations and Language resources Distribution
Agency (ELDA) and the Laboratoire d'Informatique pour la Mécanique et
les Sciences de l'Ingénieur (LIMSI).

If you are interested in participating please send an email to Jordi
Turmo (turmo_AT_lsi.upc.edu) with "QAst" in the subject line.

Back to Top

5-4 . ELRA- Language Resources Catalogue-Update

ELRA is happy to announce that 1 new Speech Resource, produced within the Technolangue programme, is now available in its catalogue.

ELRA-S0272 MEDIA speech database for French

The MEDIA speech database for French was produced by ELDA within the French national project MEDIA (Automatic evaluation of man-machine dialogue systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). It contains 1,258 transcribed dialogues from 250 adult speakers. The method chosen for the corpus construction process is that of a 'Wizard of Oz' (WoZ) system, which consists of simulating a natural language man-machine dialogue. The scenario was built in the domain of tourism and hotel reservation.

The semantic annotation of the corpus is available in this catalogue and referenced as ELRA-E0024 (MEDIA Evaluation Package).

For more information, see: http://catalog.elra.info/product_info.php?products_id=1057

For more information on the catalogue, please contact Valérie Mapelli: mapelli@elda.org

Visit our on-line catalogue: http://catalog.elra.info
 
Back to Top

5-5 . MusicSpeech group

Music and speech share numerous aspects (linguistic, structural, acoustic, cognitive), in their production as well as in their representation and perception. The purpose of this list is to inform its subscribers of various events dealing with the study of the links between music and speech. It thus intends to connect several communities, allowing each to benefit from a stimulating interaction.

As a member of the speech or music community, you are invited to
subscribe to the musicspeech group. The group will be moderated and
maintained by IRCAM.

Group details:
* Name: musicspeech
* Home page: http://listes.ircam.fr/wws/info/musicspeech
* Email address: musicspeech@ircam.fr

Greg Beller, IRCAM,
moderator, musicspeech list

Back to Top

6 . Jobs openings

Back to Top

6-1 . AT&T Labs Research: Research Staff Positions - Florham Park, NJ

AT&T Labs Research is seeking exceptional candidates for Research Staff positions. AT&T is the premier broadband, IP, entertainment, and wireless communications company in the U.S. and one of the largest in the world. Our researchers are dedicated to solving real problems in speech and language processing, and are involved in inventing, creating and deploying innovative services. We also explore fundamental research problems in these areas. Outstanding Ph.D.-level candidates at all levels of experience are encouraged to apply. Candidates must demonstrate excellence in research, a collaborative spirit, and strong communication and software skills. Areas of particular interest are:

  • Large-vocabulary automatic speech recognition
  • Acoustic and language modeling
  • Robust speech recognition
  • Signal processing
  • Speaker recognition
  • Speech data mining
  • Natural language understanding and dialog
  • Text and web mining
  • Voice and multimodal search

AT&T Companies are Equal Opportunity Employers. All qualified candidates will receive full and fair consideration for employment. More information and application instructions are available on our website at http://www.research.att.com/. Click on "Join us". For more information, contact Mazin Gilbert (mazin at research dot att dot com).

 

Back to Top

6-2 . Summer Intern positions at Motorola Schaumburg Illinois USA

Motorola Labs - Center for Human Interaction Research (CHIR) located in Schaumburg Illinois, USA, is offering summer intern positions in 2008 (12 weeks each).

CHIR's mission:

Our research lab develops technologies that make access to rich communication, media and information services effortless, based on natural, intelligent interaction. Our research aims at systems that adapt automatically and proactively to changing environments, device capabilities, and continually evolving knowledge about the user.

Intern profiles:

1) Acoustic environment/event detection and classification.

The successful candidate will be a PhD student near the end of his/her PhD study, skilled in signal processing and/or pattern recognition, who knows Linux and C/C++ programming. Candidates with knowledge of acoustic environment/event classification are preferred.

2) Speaker adaptation for applications on speech recognition and spoken document retrieval.

The successful candidate must currently be pursuing a Ph.D. degree in EE or CS with a complete understanding of, and hands-on experience in, automatic speech recognition research. Proficiency in a Linux/Unix working environment and in C/C++ programming is required, along with a strong GPA. A strong background in speaker adaptation is highly preferred.

3) Development of voice search-based web applications on a smartphone

We are looking for an intern candidate to help create an "experience" prototype based on our voice search technology. The application will be deployed on a smartphone and demonstrate intuitive and rich interaction with web resources. This intern project is oriented more towards software engineering than research. We target an intern with a master's degree and a strong software engineering background. Mastery of C++ and experience with web programming (AJAX and web services) are required. Development experience on Windows CE/Mobile is desired.

4) Integrated Voice Search Technology For Mobile Devices.

The candidate should be proficient in information retrieval, pattern recognition and speech recognition, and should be able to program in C++ and scripting languages such as Python or Perl in a Linux environment. Knowledge of information retrieval or search engines is also expected.

We offer competitive compensation, fun-to-work environment and Chicago-style pizza.

If you are interested, please send your resume to:

Dusan Macho, CHIR-Motorola Labs

Email: dusan.macho@motorola.com

Tel: +1-847-576-6762

Back to Top

6-3 . Nuance: Software engineer speech dialog tools

In order to strengthen our Embedded ASR Research team, we are looking for a:

SOFTWARE ENGINEER SPEECH DIALOGUE TOOLS

As part of our team, you will be creating solutions for voice user interfaces for embedded applications on mobile and automotive platforms.

OVERVIEW:

- You will work in Nuance's Embedded ASR R&D team, developing technology, tools, and run-time software to enable our customers to develop and test embedded speech applications. Together with our team of speech and language experts, you will work on natural language dialogue systems for our customers in the Automotive and Mobile sector.

- You will work either at Nuance's Office in Aachen, a beautiful, old city right in the heart of Europe with great history and culture, or at Nuance's International Headquarters in Merelbeke, a small town just 5km away from the heart of the vibrant and picturesque city of Ghent, in the Flanders region of Belgium. Both Aachen and Ghent offer some of the most spectacular historic town centers in Europe, and are home to large international universities.

- You will work in an international company and cooperate with people on various locations including in Europe, America and Asia. You may occasionally be asked to travel.

RESPONSIBILITIES:

- You will work on the development of tools and solutions for cutting edge speech and language understanding technologies for automotive and mobile devices.

- You will work on enhancing various aspects of our advanced natural language dialogue system, such as the layer of connected applications, the configuration setup, inter-module communication, etc.

- In particular, you will be responsible for the design, implementation, evaluation, optimization and testing, and documentation of tools such as GUI and XML applications that are used to develop, configure, and fine-tune advanced dialogue systems.

QUALIFICATIONS:

- You have a university degree in computer science, engineering, mathematics, physics, computational linguistics, or a related field.

- You have very strong software and programming skills, especially in C/C++, ideally also for embedded applications.

- You have experience with Python or other scripting languages.

- GUI programming experience is a strong asset.

The following skills are a plus:

- Understanding of communication protocols

- Understanding of databases

- Understanding of computational agents and related frameworks (such as OAA).

- A background in (computational) linguistics, dialogue systems, speech processing, grammars, and parsing techniques, statistics and machine learning, especially as related to natural language processing, dialogue, and representation of information

- You can work both as a team player and as goal-oriented independent software engineer.

- You can work in a multi-national team and communicate effectively with people of different cultures.

- You have a strong desire to make things really work in practice, on hardware platforms with limited memory and processing power.

- You are fluent in English and you can write high quality documentation.

- Knowledge of other languages is a plus.

CONTACT:

Please send your applications, including cover letter, CV, and related documents (maximum 5MB total for all documents, please) to

Deanna Roe                 Deanna.roe@nuance.com

Please make sure to document to us your excellent software engineering skills.

ABOUT US:

Nuance is the leading provider of speech and imaging solutions for businesses and consumers around the world.  Every day, millions of users and thousands of businesses experience Nuance by calling directory assistance, requesting account information, dictating patient records, telling a navigation system their destination, or digitally reproducing documents that can be shared and searched.  With more than 3000 employees worldwide, we are committed to make the user experience more enjoyable by transforming the way people interact with information and how they create, share and use documents. Making each of those experiences productive and compelling is what Nuance is about.

 

Back to Top

6-4 . Nuance: Speech scientist London UK

Nuance is the leading provider of speech and imaging solutions for businesses and consumers around the world. Every day, millions of users and thousands of businesses experience Nuance by calling directory assistance, requesting account information, dictating patient records, telling a navigation system their destination, or digitally reproducing documents that can be shared and searched.  With more than 2000 employees worldwide, we are committed to make the user experience more enjoyable by transforming the way people interact with information and how they create, share and use documents. Making each of those experiences productive and compelling is what Nuance is about.

To strengthen our International Professional Services team, based in London, we are currently looking for a

 

 

                            Speech Scientist, London, UK

Nuance Professional Services (PS) has designed, developed, and optimized thousands of speech systems across dozens of industries, including directory search, call center automation, applications in telecom, finance, airline, healthcare, and other verticals; applications for video games, mobile dictation, enhanced search services, SMS, and in-car navigation.  Nuance PS applications have automated approximately 7 billion phone conversations for some of the world's most respected companies, including British Airways, Vodafone, Amtrak, Bank of America, BellCanada, Citigroup, General Electric, NTT and Verizon.

The PS organization consists of energetic, motivated, and friendly individuals. The Speech Scientists in PS are among the best and brightest, with PhDs from universities such as Cambridge (UK), MIT, McGill, Harvard, Penn, CMU, and Georgia Tech, and having worked at research labs such as Bell Labs, Motorola Labs, and ATR (Japan), culminating in over 300 years of Speech Science experience and covering well over 20 languages.

Come and join Nuance PS and work on the latest technology from one of the prominent speech recognition technology providers, and make a difference in the way the world communicates.

Job Overview

As a Speech Scientist in the Professional Services group, you will work on automated speech recognition applications, covering a broad range of activities in all project phases, including the design, development, and optimization of the system.  You will:

  • Work across application development teams to ensure best possible recognition performance in deployed systems
  • Identify recognition challenges and assess accuracy feasibility during the design phase,
  • Design, develop, and test VoiceXML grammars and create JSPs, Java, and ECMAscript grammars for dynamic contexts
  • Optimize accuracy of applications by analyzing performance and tuning statistical language models, pronunciations, and acoustic models, including identifying areas for improvement by running the recognizer offline
  • Contribute to the generation and presentation of client-facing reports
  • Act as technical lead on more intensive client projects
  • Develop methodologies, scripts, procedures that improve efficiency and quality
  • Develop tools and enhance algorithms that facilitate deployment and tuning of recognition components
  • Act as subject matter domain expert for specific knowledge domains
  • Provide input into the design of future product releases

     Required Skills

  • MS or PhD in Computer Science, Engineering, Computational Linguistics, Physics, Mathematics, or related field (or equivalent)
  • Strong analytical and problem solving skills and ability to troubleshoot issues
  • Good judgment and quick-thinking
  • Strong programming skills, preferably Perl or Python
  • Excellent written and verbal communications skills
  • Ability to scope work taking technical, business and time-frame constraints into consideration
  • Works well in a team and in a fast-paced environment

Beneficial Skills

  • Strong programming skills in either Perl, Python, Java, C/C++, or Matlab
  • Speech recognition knowledge
  • Strong pattern recognition, linguistics, signal processing, or acoustics knowledge
  • Statistical data analysis
  • Experience with XML, VoiceXML, and Wiki
  • Ability to mentor or supervise others
  • Additional language skills, eg French, Dutch, German, Spanish

 

Back to Top

6-5 . Nuance: Research engineer speech engine

In order to strengthen our Embedded ASR Research team, we are looking for a:

RESEARCH ENGINEER SPEECH ENGINE

As part of our team, you will be creating solutions for voice user interfaces for embedded applications on mobile and automotive platforms.

 OVERVIEW:

- You will work in Nuance's Embedded ASR R&D team, developing, improving and maintaining core ASR engine algorithms for our customers in the Automotive and Mobile sector.

- You will work either at Nuance's Office in Aachen, a beautiful, old city right in the heart of Europe with great history and culture, or at Nuance's International Headquarters in Merelbeke, a small town just 5km away from the heart of the vibrant and picturesque city of Ghent, in the Flanders region of Belgium. Both Aachen and Ghent offer some of the most spectacular historic town centers in Europe, and are home to large international universities.

- You will work in an international company and cooperate with people on various locations including in Europe, America and Asia. You may occasionally be asked to travel.

RESPONSIBILITIES:

- You will work on developing, improving and maintaining core ASR engine algorithms for cutting edge speech and natural language understanding technologies for automotive and mobile devices.

- You will work on the design and development of more efficient, flexible ASR search algorithms with high focus on low memory and processor requirements.

QUALIFICATIONS:

- You have a university degree in computer science, engineering, mathematics, physics, computational linguistics, or a related field. PhD is a plus.

- A background in (computational) linguistics, speech processing, ASR search, confidence values, grammars, statistics and machine learning, especially as related to natural language processing.

- You have very strong software and programming skills, especially in C/C++, ideally also for embedded applications.

The following skills are a plus:

- You have experience with Python or other scripting languages.

- Broad knowledge about architectures of embedded platforms and processors.

- Understanding of databases

- You can work both as a team player and as goal-oriented independent software engineer.

- You can work in a multi-national team and communicate effectively with people of different cultures.

- You have a strong desire to make things really work in practice, on hardware platforms with limited memory and processing power.

- You are fluent in English and you can write high quality documentation.

- Knowledge of other languages is a plus.

CONTACT:

Please send your applications, including cover letter, CV, and related documents (maximum 5MB total for all documents, please) to

Deanna Roe                  Deanna.roe@nuance.com

Please make sure to document to us your excellent software engineering skills.

ABOUT US:

Nuance is the leading provider of speech and imaging solutions for businesses and consumers around the world.  Every day, millions of users and thousands of businesses experience Nuance by calling directory assistance, requesting account information, dictating patient records, telling a navigation system their destination, or digitally reproducing documents that can be shared and searched.  With more than 3000 employees worldwide, we are committed to make the user experience more enjoyable by transforming the way people interact with information and how they create, share and use documents. Making each of those experiences productive and compelling is what Nuance is about.

 

Back to Top

6-6 . Nuance: Research engineer speech dialog systems

In order to strengthen our Embedded ASR Research team, we are looking for a:

   RESEARCH ENGINEER SPEECH DIALOGUE SYSTEMS

As part of our team, you will be creating speech technologies for embedded applications varying from simple command and control tasks up to natural language speech dialogues on mobile and automotive platforms.

OVERVIEW:

-You will work in Nuance's Embedded ASR research and production team, creating technology, tools and runtime software to enable our customers to develop embedded speech applications. In our team of speech and language experts, you will work on natural language dialogue systems that define the state of the art.

- You will work at Nuance's International Headquarters in Merelbeke, a small town just 5km away from the heart of the picturesque city of Ghent, in the Flanders region of Belgium. Ghent has one of the most spectacular historic town centers of Europe and is known for its unique vibrant yet cozy charm, and is home to a large international university.

- You will work in an international company and cooperate with people on various locations including in Europe, America, and Asia.  You may occasionally be asked to travel.

RESPONSIBILITIES:

- You will work on the development of cutting edge natural language dialogue and speech recognition technologies for automotive embedded systems and mobile devices.

- You will design, implement, evaluate, optimize, and test new algorithms and tools for our speech recognition systems, both for research prototypes and deployed products, including all aspects of dialogue systems design, such as architecture, natural language understanding, dialogue modeling, statistical framework, and so forth.

- You will help the engine process multi-lingual natural and spontaneous speech in various noise conditions, given the challenging memory and processing power constraints of the embedded world.

QUALIFICATIONS:

- You have a university degree in computer science, (computational) linguistics, engineering, mathematics, physics, or a related field. A graduate degree is an asset.

-You have strong software and programming skills, especially in C/C++, ideally for embedded applications. Knowledge of Python or other scripting languages is a plus.

- You have experience in one or more of the following fields:

     dialogue systems

     applied (computational) linguistics

     natural language understanding

     language generation

     search engines

     speech recognition

     grammars and parsing techniques.

     statistics and machine learning techniques

     XML processing

-You are a team player, willing to take initiative and assume responsibility for your tasks, and are goal-oriented.

-You can work in a multi-national team and communicate effectively with people of different cultures.

-You have a strong desire to make things really work in practice, on hardware platforms with limited memory and processing power.

-You are fluent in English and you can write high quality documentation.

-Knowledge of other languages is a strong asset.

CONTACT:

Please send your applications, including cover letter, CV, and related documents (maximum 5MB total for all documents, please) to

 

Deanna Roe                  Deanna.roe@nuance.com

ABOUT US:

Nuance is the leading provider of speech and imaging solutions for businesses and consumers around the world.  Every day, millions of users and thousands of businesses experience Nuance by calling directory assistance, requesting account information, dictating patient records, telling a navigation system their destination, or digitally reproducing documents that can be shared and searched.  With more than 3000 employees worldwide, we are committed to make the user experience more enjoyable by transforming the way people interact with information and how they create, share and use documents. Making each of those experiences productive and compelling is what Nuance is about.

 

Back to Top

6-7 . Research Position in Speech Processing at Nagoya Institute of Technology,Japan

Nagoya Institute of Technology is seeking a researcher for a post-doctoral position in a new European Commission-funded project EMIME ("Efficient multilingual interaction in mobile environment") involving Nagoya Institute of Technology and five other European partners, starting in March 2008 (see the project summary below). The earliest starting date of the position is March 2008. The initial duration of the contract will be one year, with a possibility for prolongation (year-by-year basis, maximum of three years). The position provides opportunities to collaborate with other researchers in a variety of national and international projects. The competitive salary is calculated according to qualifications based on NIT scales.

The candidate should have a strong background in speech signal processing and some experience with speech synthesis and recognition. Desired skills include familiarity with the latest spectrum of technology, including HTK, HTS, and Festival at the source code level.

For more information, please contact Keiichi Tokuda (http://www.sp.nitech.ac.jp/~tokuda/).

About us

Nagoya Institute of Technology (NIT), founded in 1905, is situated in the world-class manufacturing area of Central Japan (about one hour and 40 minutes from Tokyo, and 36 minutes from Kyoto by Shinkansen). NIT is a highest-level educational institution of technology and is one of the leaders of such institutions in Japan. EMIME will be carried out at the Speech Processing Laboratory (SPL) in the Department of Computer Science and Engineering of NIT. SPL is known for its outstanding, continuous contribution to developing high-performance, high-quality open-source software: the HMM-based Speech Synthesis System "HTS" (http://hts.sp.nitech.ac.jp/), the large vocabulary continuous speech recognition engine "Julius" (http://julius.sourceforge.jp/), and the Speech Signal Processing Toolkit "SPTK" (http://sp-tk.sourceforge.net/). The laboratory is involved in numerous national and international collaborative projects. SPL also has close partnerships with many industrial companies, in order to transfer its research into commercial applications, including Toyota, Nissan, Panasonic, Brother Inc., Funai, Asahi-Kasei, and ATR.

Project summary of EMIME

The EMIME project will help to overcome the language barrier by developing a mobile device that performs personalized speech-to-speech translation, such that a user's spoken input in one language is used to produce spoken output in another language, while continuing to sound like the user's voice. Personalization of systems for cross-lingual spoken communication is an important, but little explored, topic. It is essential for providing more natural interaction and making the computing device a less obtrusive element when assisting human-human interactions.

We will build on recent developments in speech synthesis using hidden Markov models, which is the same technology used for automatic speech recognition. Using a common statistical modeling framework for automatic speech recognition and speech synthesis will enable the use of common techniques for adaptation and multilinguality. Significant progress will be made towards a unified approach for speech recognition and speech synthesis: this is a very powerful concept, and will open up many new areas of research. In this project, we will explore the use of speaker adaptation across languages so that, by performing automatic speech recognition, we can learn the characteristics of an individual speaker, and then use those characteristics when producing output speech in another language.

Our objectives are to:

1. Personalize speech processing systems by learning individual characteristics of a user's speech and reproducing them in synthesized speech.

2. Introduce a cross-lingual capability such that personal characteristics can be reproduced in a second language not spoken by the user.

3. Develop and better understand the mathematical and theoretical relationship between speech recognition and synthesis.

4. Eliminate the need for human intervention in the process of cross-lingual personalization.

5. Evaluate our research against state-of-the-art techniques and in a practical mobile application.

 

Back to Top

6-8 . C/C++ Programmer Munich, Germany

Digital publishing AG is one of Europe's leading producers of interactive software for foreign language training. In our e-learning courses we want to place the emphasis on speaking and spoken language understanding. In order to strengthen our Research & Development Team in Munich, Germany, we are looking for experienced C or C++ programmers with at least 3 years' experience in the design and coding of sophisticated software systems under Windows.

We offer:
- a creative working atmosphere in an international team of software engineers, linguists and editors working on challenging research projects in speech recognition and speech dialogue systems
- participation in all phases of a product life cycle, as we are interested in the fast transfer of research results into products
- the possibility to participate in international scientific conferences
- a permanent job in the center of Munich
- excellent possibilities for development within our fast growing company
- flexible working times, competitive compensation and arguably the best espresso in Munich

We expect:
- several years of practical experience in software development in C or C++ in a commercial or academic environment
- experience with parallel algorithms and thread programming
- experience with object-oriented design of software systems
- good knowledge of English or German

Desirable is:
- experience with optimization of algorithms
- experience in statistical speech or language processing, preferably speech recognition, speech synthesis, speech dialogue systems or chatbots
- experience with Delphi or Turbo Pascal

Interested? We look forward to your application (preferably by e-mail):

digital publishing AG
Freddy Ertl
f.ertl@digitalpublishing.de
Tumblinger Straße 32
D-80337 München, Germany
Back to Top

6-9 . Speech and Natural Language Processing Engineer at M*Modal, Pittsburgh, PA, USA

M*Modal is a fast-moving speech technology company based in Pittsburgh, PA. Our portfolio of conversational speech recognition and natural language understanding technologies is widely recognized as the most advanced in the industry. We are a leading innovator in the field of conversational documentation services (CDS) - where speech recognition and natural language understanding are combined in a unique setup targeted to truly understand conversational speech and turn it directly into actionable and meaningful data. Our proprietary speech understanding technology - operating on M*Modal's computing grid hosted in our national data center - is already redefining the way clinical information is captured in healthcare.


We are seeking an experienced and dedicated speech and natural language processing engineer who wants to push the frontiers of conversational speech understanding. Join our renowned research and development team, and add to our unique blend of scientific and engineering excellence.

Responsibilities:

  • You will be working with other members of the R&D team to continuously improve our speech and natural language understanding technologies.
  • You will participate in designing and implementing algorithms, tools and methodologies in the area of automatic speech recognition and natural language processing/understanding.
  • You will collaborate with other members of the R&D team to identify, analyze and resolve technical issues.

Requirements:

  • Solid background in speech recognition, natural language processing, machine learning and information extraction.
  • 2+ years of experience participating in software development projects
  • Proficient with Java, C++ and scripting (e.g. Python, Perl, ...)
  • Excellent analytical and problem-solving skills
  • Integrate and communicate well in small R&D teams
  • Master's degree in CS or related engineering fields
  • Experience in a healthcare-related field a plus

 

In June 2007 M*Modal moved to a great new office space in the Squirrel Hill area of Pittsburgh. We are excited to be growing and are looking for individuals who have a passion for the work they do and are interested in becoming a member of a dynamic work group of smart, passionate drivers who also know how to have fun.

 

M*Modal offers a top-notch benefits package that includes medical, dental and vision coverage, short-term disability, matching 401K savings plan, holidays, paid-time-off and tuition refund.  If you would like to be considered for this opportunity, please send your resume and cover letter to Mary Ann Gamble at maryann.gamble@mmodal.com

 

Back to Top

6-10 . Senior Research Scientist -- Speech and Natural Language Processing at M*Modal, Pittsburgh, PA, USA

M*Modal is a fast-moving speech technology company based in Pittsburgh, PA. Our portfolio of conversational speech recognition and natural language understanding technologies is widely recognized as the most advanced in the industry. We are a leading innovator in the field of conversational documentation services (CDS) - where speech recognition and natural language understanding are combined in a unique setup targeted to truly understand conversational speech and turn it directly into actionable and meaningful data. Our proprietary speech understanding technology - operating on M*Modal's computing grid hosted in our national data center - is already redefining the way clinical information is captured in healthcare.


We are seeking an experienced and dedicated senior research scientist who wants to push the frontiers of conversational speech understanding. Join our renowned research and development team, and add to our unique blend of scientific and engineering excellence.

Responsibilities:

  • Plan and perform research and development tasks to continuously improve a state-of-the-art speech understanding system
  • Take a leading role in identifying solutions to challenging technical problems
  • Contribute original ideas and turn them into product-grade software implementations
  • Collaborate with other members of the R&D team to identify, analyze and resolve technical issues

Requirements:

  • Solid research & development background with 3+ years of experience in speech recognition research, covering at least two of the following topics: speech processing, acoustic modeling, language modeling, decoding, LVCSR, natural language processing/understanding, speaker verification/identification, audio mining
  • Working knowledge of Machine Learning, Information Extraction and Natural Language Processing algorithms
  • 3+ years of experience participating in large-scale software development projects using C++ and Java.
  • Excellent analytical, problem-solving and communication skills
  • PhD with a focus on speech recognition, or a Master's degree with 3+ years of industry experience working on automatic speech recognition
  • Experience and/or education in medical informatics a plus
  • Working experience in a healthcare related field a plus

 


In June 2007 M*Modal moved to a great new office space in the Squirrel Hill area of Pittsburgh. We are excited to be growing and are looking for individuals who have a passion for the work they do and are interested in becoming a member of a dynamic work group of smart, passionate drivers who also know how to have fun.

 

M*Modal offers a top-notch benefits package that includes medical, dental and vision coverage, short-term disability, matching 401K savings plan, holidays, paid-time-off and tuition refund.  If you would like to be considered for this opportunity, please send your resume and cover letter to Mary Ann Gamble at maryann.gamble@mmodal.com

 

Back to Top

6-11 . Postdoc position at LORIA, Nancy, France

Building an articulatory model from ultrasound, EMA and MRI data

Postdoctoral position

 

 

Research project

An articulatory model comprises both the visible and the internal mobile articulators which are involved in speech articulation (the lower jaw, tongue, lips and velum) as well as the fixed walls (the palate, the rear wall of the pharynx). An articulatory model is dynamic, since the articulators deform during speech production. Such a model is of potential interest in the field of language learning, by providing visual feedback on the articulation produced by the learner, and in many other applications.

Building an articulatory model is difficult because the different articulators have to be detected from specific image modalities: the lips are acquired through video; the tongue shape is acquired through ultrasound imaging with a high frame rate, but these 2D images are very noisy; finally, 3D images of all articulators can be obtained with MRI, but only for sustained sounds (such as vowels) due to the long acquisition time of MRI images.

The subject of this post-doc is to construct a dynamic 3D model of the entire vocal tract by merging the 3D information available in the MRI acquisitions with the temporal 2D information provided by the contours of the tongue visible in the ultrasound or X-ray images.

We are working on the construction of an articulatory model within the European project ASPI (http://aspi.loria.fr/ ).

We have already built an acquisition system which allows us to obtain synchronized data from the ultrasound, MRI, video and EMA modalities.

Only a few complete articulatory models are currently available in the world and a real challenge in the field is to design set-ups and easy-to-use methods for automatically building the model of any speaker from 3D and 2D images. Indeed, the existence of more articulatory models would open new directions of research about speaker variability and speech production.

 

Objectives

The aim of the subject is to build a deformable model of the vocal tract from static 3D MRI images and dynamic 2D sequences. Previous work has been conducted on the modelling of the vocal tract, and especially of the tongue (M. Stone [1], O. Engwall [2]). Unfortunately, substantial human interaction is required to extract tongue contours in the images. In addition, only one image modality is often considered in these works, thus reducing the reliability of the model obtained.

The aim of this work is to provide automatic methods for segmenting features in the images as well as methods for building a parametric model of the 3D vocal tract with these specific aims:

  • The segmentation process is to be guided by prior knowledge of the vocal tract. In particular, shape, topological and regularity constraints must be considered.
  • A parametric model of the vocal tract has to be defined (classical models are linear and built from a principal component analysis; see the sketch after this list). Special emphasis must be put on the problem of matching the various features between the images.
  • Besides classical geometric constraints, both the building and the assessment of the model will be guided by acoustic distances, in order to check the agreement between the sound synthesized from the model and the sound realized by the human speaker.
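To make the classical linear approach concrete, here is a minimal Python/NumPy sketch of a PCA-based parametric contour model, in which deformation modes are extracted from training contours and any contour is encoded with a few parameters; the synthetic data and the number of components are toy assumptions, not the ASPI set-up.

  import numpy as np

  def build_linear_model(contours, n_components=3):
      # Each training contour is a flattened vector of (x, y) points; the
      # model is contour ~= mean + params @ components (PCA).
      mean = contours.mean(axis=0)
      _, s, vt = np.linalg.svd(contours - mean, full_matrices=False)
      components = vt[:n_components]                    # deformation modes
      explained = (s ** 2)[:n_components] / (s ** 2).sum()
      return mean, components, explained

  def encode(contour, mean, components):
      return (contour - mean) @ components.T            # model parameters

  def decode(params, mean, components):
      return mean + params @ components                 # reconstructed contour

  # Demo: 100 synthetic "tongue contours" of 20 (x, y) points each.
  rng = np.random.default_rng(1)
  base = np.linspace(0.0, 1.0, 40)
  data = base + rng.normal(scale=0.1, size=(100, 1)) * np.sin(np.pi * base)
  mean, comps, var = build_linear_model(data, n_components=2)
  params = encode(data[0], mean, comps)
  err = np.abs(decode(params, mean, comps) - data[0]).max()
  print("params:", np.round(params, 3), "max error:", round(float(err), 4))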

 

Skill and profile

The successful candidate must have a solid background in computer vision and in applied mathematics. Information and demonstrations on the research topics addressed by the Magrit team are available at http://magrit.loria.fr/

 

References

[1] M. Stone: Modeling tongue surface contours from Cine-MRI images. Journal of Speech, Language, and Hearing Research, 2001.

[2] P. Badin, G. Bailly, L. Reveret: Three-dimensional linear articulatory modeling of tongue, lips and face based on MRI and video images. Journal of Phonetics, 2002, vol. 30, p. 533-553.

 

Contact

Interested candidates are invited to contact Marie-Odile Berger, berger@loria.fr, +33 3 54 95 85 01

 

Important information

This position is advertised in the framework of the national INRIA campaign for recruiting post-docs. It is a one-year position, renewable, beginning fall 2008. The salary is 2,320€ gross per month.

 

Selection of candidates will be a two step process. A first selection for a candidate will be carried out internally by the Magrit group. The selected candidate application will then be further processed for approval and funding by an INRIA committee.

 

Candidates must have defended their doctoral thesis less than one year ago (after May 2007) or be due to defend it before the end of 2008. If the defence has not yet taken place, candidates must specify the tentative date and jury for the defence.

 

Important - Useful links

Presentation of INRIA postdoctoral positions

To apply (be patient, loading this link takes time...)

 

Back to Top

6-12 . Internships at Motorola Labs Schaumburg

Motorola Labs - Center for Human Interaction Research (CHIR) 
located in Schaumburg Illinois, USA, 
is offering summer intern positions in 2008 (12 weeks each). 
CHIR's mission
 
Our research lab develops technologies that make access to rich communication, media and 
information services effortless, based on natural, intelligent interaction. Our research 
aims at systems that adapt automatically and proactively to changing environments, device 
capabilities and to continually evolving knowledge about the user.
 
Intern profiles
 
1) Acoustic environment/event detection and classification. 
The successful candidate will be a PhD student near the end of his/her PhD studies, skilled 
in signal processing and/or pattern recognition, who knows Linux and C/C++ programming. 
Candidates with knowledge of acoustic environment/event classification are preferred. 
 
2) Speaker adaptation for applications in speech recognition and spoken document retrieval.
The successful candidate must currently be pursuing a Ph.D. degree in EE or CS, with a thorough 
understanding of, and hands-on experience in, automatic speech recognition research. Proficiency 
in a Linux/Unix working environment and in C/C++ programming, and a strong GPA, are expected. 
A strong background in speaker adaptation is highly preferred.
 
3) Development of voice search-based web applications on a smartphone. 
We are looking for an intern candidate to help create an "experience" prototype based on our 
voice search technology. The app will be deployed on a smartphone and demonstrate intuitive and 
rich interaction with web resources. This intern project is oriented more towards software engineering 
than research. We target an intern with a master's degree and a strong software engineering background. 
Mastery of C++ and experience with web programming (AJAX and web services) are required. 
Development experience on Windows CE/Mobile is desired.
 
4) Integrated voice search technology for mobile devices.
The candidate should be proficient in information retrieval, pattern recognition and speech recognition, 
should program in C++ and in scripting languages such as Python or Perl in a Linux environment, 
and should have knowledge of information retrieval or search engines.
 
We offer competitive compensation, a fun work environment and Chicago-style pizza.
 
If you are interested, please send your resume to:
 
Dusan Macho, CHIR-Motorola Labs
Email: dusan [dot] macho [at] motorola [dot] com
Tel: +1-847-576-6762
Back to Top

6-13 . Masters in Human Language Technology

*** Studentships available for 2008/9 ***

                   One-Year Master's Course in HUMAN LANGUAGE TECHNOLOGY
                   Department of Computer Science
                   The University of Sheffield - UK
The Sheffield MSc in Human Language Technology (HLT) has been carefully tailored 
to meet the demand for graduates with the highly-specialised multi-disciplinary skills 
that are required in HLT, both as practitioners in the development of HLT applications 
and as researchers into the advanced capabilities required for next-generation HLT 
systems.  The course provides a balanced programme of instruction across a range 
of relevant disciplines including speech technology, natural language processing and 
dialogue systems.  The programme is taught in a research-led environment.  
This means that you will study the most advanced theories and techniques in the field, 
and have the opportunity to use state-of-the-art software tools.  You will also have 
opportunities to engage in research-level activity through in-depth exploration of 
chosen topics and through your dissertation.  As well as readying yourself for 
employment in the HLT industry, this course is also an excellent introduction to the 
substantial research opportunities for doctoral-level study in HLT.  
***  A number of studentships are available, on a competitive basis, to suitably 
qualified applicants.  These awards pay a stipend in addition to the course fees.  
***  For further details of the course, 
For information on how to apply 


Back to Top

6-14 . PhD positions at Supelec, France

Training Generative Bayesian Networks with Missing Data

Learning generative model parameters with missing data : application to user modelling for spoken dialogue systems optimization.

  

Description : 

Probabilistic models such as Bayesian Networks (BN) are widely used for reasoning under uncertainty in many domains. A BN is a graphical model that captures statistical properties of a data set in a parametric and compact representation. This representation can then be used to perform probabilistic inference about the domain from which the data were drawn. As with any Bayesian method, Bayesian networks allow a priori knowledge to be taken into account so as to enhance the performance of the model or speed up the parameter learning process. They are part of a wider class of models called generative models, because they also allow generating new data having similar statistical properties to those used for training the model. The purpose of this thesis is to develop new training algorithms so as to learn BN parameters from incomplete datasets, that is, datasets where some data are missing. Since the resulting models will be used to expand the training data set with statistically consistent samples, this may influence the parameter learning process.
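The textbook algorithm for this setting is Expectation-Maximization (EM): compute the expected sufficient statistics of the missing values under the current parameters, then re-estimate the parameters from the normalized expected counts. A minimal Python/NumPy sketch for a two-node binary network A -> B with some entries missing is given below; the network and data are purely illustrative, not the user model targeted by the thesis.

  import numpy as np

  def em_two_node_bn(data, n_iter=50):
      # Binary network A -> B; data is (N, 2) with np.nan marking missing
      # entries. Parameters: p_a = P(A=1), p_b[a] = P(B=1 | A=a).
      p_a, p_b = 0.5, np.array([0.5, 0.5])
      for _ in range(n_iter):
          c_a = np.zeros(2)      # expected counts of A = 0 / 1
          c_ab1 = np.zeros(2)    # expected counts of (A = a, B = 1)
          for a, b in data:
              if not np.isnan(a):
                  w = np.array([1.0 - a, a])            # A observed
              elif not np.isnan(b):
                  like = np.array([1 - p_a, p_a]) * (p_b if b == 1 else 1 - p_b)
                  w = like / like.sum()                 # E-step: P(A | B=b)
              else:
                  w = np.array([1 - p_a, p_a])          # both missing: prior
              c_a += w
              c_ab1 += w * b if not np.isnan(b) else w * p_b
          p_a = c_a[1] / len(data)                      # M-step
          p_b = c_ab1 / c_a
      return p_a, p_b

  # Demo: hide 30% of the A values and recover the parameters.
  rng = np.random.default_rng(2)
  A = (rng.random(2000) < 0.3).astype(float)
  B = (rng.random(2000) < np.where(A == 1, 0.9, 0.2)).astype(float)
  data = np.column_stack([A, B])
  data[rng.random(2000) < 0.3, 0] = np.nan
  print(em_two_node_bn(data))                           # ~ (0.3, [0.2, 0.9])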

Application:

This thesis is proposed in the framework of a European project (CLASSiC) aiming at automatically optimising human-machine spoken interactions. Current learning methods applied to such a task require a large amount of spoken dialogue data, which is not easy to gather and, above all, to annotate. It is even more difficult if the spoken dialogue system is still in the design process. A widely adopted solution is to expand the existing datasets using probabilistic generative models that produce new samples of dialogues. Yet the training sets are most often annotated from recorded or transcribed dialogues, without additional information coming from the users. Their actual goal when using the system is often missing and difficult to infer from transcriptions. Moreover, none of the current solutions has been shown to generate realistic dialogues in terms of goal consistency, for instance. The objective will be to train models that treat the user's goal as missing data, so as to generate realistic dialogues.

Context:

The PhD student will participate in a European project (CLASSiC) funded by the FP7 ICT programme of the European Commission. The CLASSiC consortium includes Supélec (French engineering school), the universities of Edinburgh, Cambridge and Geneva, as well as France Télécom (French telecom operator). The selected candidate will be hosted on the Metz campus of Supélec and will join the IMS research group.

Profile

The candidate should hold a Master's or Engineering degree in computer science or signal processing, with knowledge in machine learning and good C++ programming skills. Fluency in English is required; French would be a plus.

Contact: Olivier Pietquin (olivier.pietquin@supelec.fr)

 

 

Bayesian Methods for Generalization in Reinforcement Learning

   

Bayesian methods for generalization and direct policy search in reinforcement learning : application to spoken dialogue systems optimization.

  

Description : 

Reinforcement Learning (RL) is an on-line machine learning paradigm that aims at finding optimal policies for controlling complex stochastic dynamical systems. RL is typically a good candidate to replace heuristically-driven control policies because of its ability to learn continuously from experience so as to maximize a utility function. It has proven its efficiency at finding optimal control policies in the case of discrete systems (discrete state and action spaces, as well as discrete time). Yet most real-world problems are continuous or hybrid in states and actions, or their state space is large enough to be approximated by a continuous space. Designing realistic reinforcement learning algorithms for handling such problems is still an open research problem. Policy generalization by means of supervised learning is promising. Yet the optimal policy, or any related function, cannot be known accurately while learning, and standard off-line regression is therefore not suitable, since new information is gathered while interacting with the system. So a critical issue is to build a generalization method, suitable for policy evaluation, that is able to update its parameters on-line from uncertain observations. In addition, uncertainty should be managed carefully, and thus estimated all along the learning process, so as to avoid generating hazardous policies while exploring the policy space optimally. Bayesian filtering is proposed as a possible framework to tackle this problem because of its inherent suitability for learning under uncertainty. In particular, it is proposed to make use of Bayesian filters to search directly in the policy space.
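As one concrete, purely illustrative instantiation of on-line Bayesian generalization (not the method the thesis is required to develop), recursive Bayesian linear regression keeps a Gaussian posterior over the weights of a linear value function and updates it after each noisy observed return, so that an uncertainty estimate is available at every step of learning; algebraically this is a Kalman filter with a static hidden state. The features and noise levels in this Python/NumPy sketch are toy assumptions.

  import numpy as np

  class BayesianLinearValue:
      # Gaussian posterior N(mu, Sigma) over the weights w of a linear value
      # function V(s) = w . phi(s), updated recursively from noisy returns.
      def __init__(self, dim, prior_var=10.0, noise_var=1.0):
          self.mu = np.zeros(dim)
          self.Sigma = prior_var * np.eye(dim)
          self.noise_var = noise_var

      def update(self, phi, ret):
          S = phi @ self.Sigma @ phi + self.noise_var   # innovation variance
          K = self.Sigma @ phi / S                      # gain
          self.mu = self.mu + K * (ret - phi @ self.mu)
          self.Sigma = self.Sigma - np.outer(K, phi @ self.Sigma)

      def predict(self, phi):
          # Mean value and predictive std: the uncertainty estimate that can
          # keep exploration away from hazardous policies.
          var = phi @ self.Sigma @ phi + self.noise_var
          return phi @ self.mu, np.sqrt(var)

  # Demo: recover a hidden linear value function from noisy returns.
  rng = np.random.default_rng(3)
  w_true = np.array([1.0, -2.0, 0.5])
  model = BayesianLinearValue(dim=3)
  for _ in range(200):
      phi = rng.normal(size=3)
      model.update(phi, phi @ w_true + rng.normal())
  print(np.round(model.mu, 2))                          # ~ w_true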

Application:

This thesis is proposed in the framework of a European project (CLASSiC) aiming at automatically optimising human-machine spoken interactions. Current learning methods applied to such a task require a large amount of spoken dialogue data, which is not easy to gather and, above all, to annotate. It is even more difficult if the spoken dialogue system is still in the design process. Generalizing policies so as to handle interactions that cannot be found in the collected databases is therefore necessary. In addition, call centres are used by millions of people every year. New information will therefore become available after the system has been released and should be used to enhance its performance. This is why on-line learning is crucial.

Context

The PhD student will participate in a European project (CLASSiC) funded by the FP7 ICT programme of the European Commission. The CLASSiC consortium includes Supélec (French engineering school), the universities of Edinburgh, Cambridge and Geneva, as well as France Télécom (French telecom operator). The selected candidate will be hosted on the Metz campus of Supélec and will join the IMS research group.

Profile

The candidate should hold a Master's or Engineering degree in computer science or signal processing, with knowledge in machine learning and good C++ programming skills. Fluency in English is required; French would be a plus.

Contacts:

Hervé Frezza-Buet (herve.frezza-buet@supelec.fr)  
Back to Top

6-15 . Speech Faculty Position at CMU, Pittsburgh, Pennsylvania

Carnegie Mellon University: Language Technologies Institute
Speech Faculty Position

The Language Technologies Institute (LTI), a department in the School of Computer Science at Carnegie Mellon University, invites applications for a tenure-track or research-track faculty position, starting on or around August 2008. We are particularly interested in candidates at the Assistant Professor level, for tenure track or research track, specializing in the area of Speech Recognition. Applicants should have a Ph.D. in Computer Science or a closely related subject.

Preference will be given to applicants with a strong focus on new aspects of speech recognition, such as finite state models, active learning, discriminative training, and adaptation techniques.

The LTI offers two existing speech recognition engines, JANUS and SPHINX, which are integrated into a wide range of speech applications including speech-to-speech translation and spoken dialog systems.

The LTI is the largest department of its kind, with more than 20 faculty and 100 graduate students covering all areas of language technologies, including speech, translation, natural language processing, information retrieval, text mining, dialog, and aspects of computational biology. The LTI is part of Carnegie Mellon's School of Computer Science, which has hundreds of faculty and students in a wide variety of areas, from theoretical computer science and machine learning to robotics, language technologies, and human-computer interaction.

Please follow the instructions for faculty applications to the School of Computer Science, explicitly mentioning LTI, at: http://www.cs.cmu.edu/~scsdean/FacultyPage/scshiringad08.html, and also notify the head of the LTI search committee by email, Alan W Black (awb@cs.cmu.edu) or Tanja Schultz (tanja@cs.cmu.edu), so that we can look out for your application. Electronic submissions are greatly preferred, but if you wish to apply on paper, please send two copies of your application materials to the School of Computer Science:
               
   Language Technologies Faculty Search Committee
   School of Computer Science
   Carnegie Mellon University
   5000 Forbes Avenue
   Pittsburgh, PA 15213-3891

               
Each application should include a curriculum vitae, a statement of research and teaching interests, copies of 1-3 representative papers, and the names and email addresses of three or more individuals whom you have asked to provide letters of reference. Applicants should arrange for reference letters to be sent directly to the Faculty Search Committee (hard copy or email), to arrive before March 31, 2008. Letters will not be requested directly by the Search Committee. All applications should indicate citizenship and, in the case of non-US citizens, describe current visa status.
               
Applications and reference letters may be submitted via email (word or .pdf format) to lti-faculty-search@cs.cmu.edu
 

Back to Top

6-16 . Opened positions at Microsoft: Danish Linguist (M/F)

MLDC – Microsoft Language Development Center, a branch of the Microsoft Product Group that develops Speech Recognition and Synthesis Technologies, situated in Porto Salvo, Portugal (http://www.microsoft.com/portugal/mldc), is seeking a full-time temporary language expert in the Danish language, on a 3-month contract, to work on speech technology related development projects. The successful candidate should meet the following requirements:

• Be a native or near-native Danish speaker

• Have a university degree in Linguistics or a related field (preferably in Danish Linguistics)

• Have an advanced level of English

• Have some experience of working with Speech Technology/Natural Language Processing/Linguistics, either in academia or in industry

• Have some computational ability – no programming is required, but he/she should be comfortable working with MS Windows and MS Office tools

• Have teamwork experience

• Be willing to work in Porto Salvo (near Lisbon) for the duration of the contract

• Be willing to start immediately (April 1, 2008)

To apply, please submit your resume and a brief statement describing your experience and abilities to Daniela Braga: i-dbraga@microsoft.com

We will only consider electronic submissions.

Back to Top

6-17 . Opened positions at Microsoft: Swedish Linguist (M/F)

MLDC – Microsoft Language Development Center, a branch of the Microsoft Product Group that develops Speech Recognition and Synthesis Technologies, situated in Porto Salvo, Portugal (http://www.microsoft.com/portugal/mldc), is seeking a full-time temporary language expert in the Swedish language, on a 1-month contract, to work on speech technology related development projects. The successful candidate should meet the following requirements:

• Be a native or near-native Swedish speaker

• Have a university degree in Linguistics or a related field (preferably in Swedish Linguistics)

• Have an advanced level of English

• Have some experience of working with Speech Technology/Natural Language Processing/Linguistics, either in academia or in industry

• Have some computational ability – no programming is required, but he/she should be comfortable working with MS Windows and MS Office tools

• Have teamwork experience

• Be willing to work in Porto Salvo (near Lisbon) for the duration of the contract

• Be willing to start in May 2008

To apply, please submit your resume and a brief statement describing your experience and abilities to Daniela Braga: i-dbraga@microsoft.com

We will only consider electronic submissions.

Back to Top

6-18 . Opened positions at Microsoft: Dutch Linguist (M/F)

MLDC – Microsoft Language Development Center, a branch of the Microsoft Product Group that develops Speech Recognition and Synthesis Technologies, situated in Porto Salvo, Portugal (http://www.microsoft.com/portugal/mldc), is seeking a full-time temporary language expert in the Dutch language, on a 1-month contract, to work on speech technology related development projects. The successful candidate should meet the following requirements:

• Be a native or near-native Dutch speaker

• Have a university degree in Linguistics or a related field (preferably in Dutch Linguistics)

• Have an advanced level of English

• Have some experience of working with Speech Technology/Natural Language Processing/Linguistics, either in academia or in industry

• Have some computational ability – no programming is required, but he/she should be comfortable working with MS Windows and MS Office tools

• Have teamwork experience

• Be willing to work in Porto Salvo (near Lisbon) for the duration of the contract

• Be willing to start in May 2008

To apply, please submit your resume and a brief statement describing your experience and abilities to Daniela Braga: i-dbraga@microsoft.com

We will only consider electronic submissions.

Back to Top

6-19 . PhD position at Orange Lab

* Position : PhD, 3 years
* Research Area : speech synthesis, prosody modelling
* Location : Orange Labs, Lannion, France
* Start date: immediate.
* Summary:
The emergence of corpus-based technologies has allowed major improvements in Text-to-Speech (TTS) during the last decade. Such systems can produce very natural synthetic sentences, almost indistinguishable from natural speech. Synthetic prompts can now replace human recordings in some commercial applications, like IVR services. However, their use remains delicate due to the lack of prosody control (intonation, rhythm...). The aim of the project is to provide the user with a support tool for easily specifying the prosody of the synthesized speech.
 
The work will focus on characterising the essential prosodic elements needed for expressive speech synthesis, possibly restricted to a specific application domain. The chosen typology will have to match the prosody of the TTS corpora as accurately as possible, through a relevant set of prosodic primitives. The robustness of the typology is critical for automatic annotation of the databases. The work will also address ergonomics (how to offer the user a convenient way to specify prosody) and will be closely related to the signal production techniques (signal processing and/or unit selection).
 
 
* Research Lab:
The PhD will be hosted by the Speech Synthesis team at Orange Labs. Orange Labs develops a state-of-the-art corpus-based speech synthesizer (demonstrator available at http://tts.elibel.tm.fr).
 
 
* Requirements:
The candidate should hold a (research) master's degree in Computer Science or Electrical Engineering, have a strong interest in doing research, excellent writing skills in French or English, and good programming skills. Knowledge of speech processing or automatic classification is a plus.
 
 
* Contacts:
For more information please contact:
- Cedric Boidin, cedric.boidin@orange-ftgroup.com, +33 2 96 05 33 53
- Thierry Moudenc, thierry.moudenc@orange-ftgroup.com, +33 2 96 05 16 59
 
Back to Top

6-20 . Social speech scientist at Wright Patterson AFB, Ohio, USA

Title: Social Scientist, DR-0101-II
 
Salary:   Base salary range for the position will be between $56,948 to
$89,423.  Salary will be supplemented by an additional amount related to
the cost of living in Dayton, Ohio.  
 
The Collaborative Interfaces Branch of the Air Force Research
Laboratory, Human Effectiveness Directorate, located at Wright-Patterson
AFB, OH (just outside Dayton, OH) is seeking to hire a social scientist
(DR-0101-II). The selectee will contribute to all phases of basic
research, applied research, and prototype development projects involving
the application of linguistic and computer science principles to the
technical areas of computational linguistics and natural language
processing with application to speech-to-speech translation, machine
translation of foreign languages, information retrieval, named entity
detection, topic detection, text categorization, text processing, speech
recognition, speech synthesis, and speech processing.  The selectee will
determine how best to accomplish the research objectives, develop and
evaluate alternatives, design and conduct experiments, analyze and
interpret results, publish papers and technical reports, deliver
presentations of research results, monitor in-house and contractual
work efforts, and meet with customers to determine technology needs.
The selectee will develop patents and licensing strategies for
technologies developed in the research program, where appropriate.
 
All applicants will need to be United States citizens. To be considered
qualified, applicants must meet the basic requirements for Social
Scientist positions. These requirements are: a degree in a behavioral or
social science, or in a related discipline appropriate to the position;
OR a combination of education and experience that provided the applicant
with knowledge of one or more of the behavioral or social sciences
equivalent to a major in the field; OR four years of appropriate
experience demonstrating that the applicant has acquired knowledge
of one or more of the behavioral or social sciences equivalent to a
major in the field. More information on these basic requirements can be
found at: http://www.opm.gov/qualifications/SEC-IV/A/GS-PROF.asp .
 
To apply for this position, please go to:
http://jobsearch.usajobs.gov/getjob.asp?JobID=70353344&AVSDM=2008%2D04%2D03+00%3A03%3A01&Logo=0&sort=rv&jbf571=2&FedEmp=Y&vw=d&brd=3876&ss=0&FedPub=Y&rad=10&zip=45433
 
For more information on this position, please contact David Crawford at
(937) 255-1788 or via e-mail at david.crawford3@wpafb.af.mil. All
application packages must be received or postmarked by the close of this
announcement: 22 May 2008.
 
 
Back to Top

6-21 . Professor position at PHELMA, Grenoble INP, France

A Full Professor position (CNU section 61) at the PHELMA school of Grenoble INP is open
for competition, starting in the autumn of 2008. The teaching and research profiles are
described in the attached position description.
   The research profile was defined by the "Parole et Cognition" (Speech and Cognition)
department of GIPSA-Lab. The department's "Machines Parlantes, Agents Conversationnels &
Interaction Face-à-face" (Talking Machines, Conversational Agents & Face-to-Face
Interaction) team is the primary target of the integration project, although the project
may also concern other teams. A description of the research themes of GIPSA-Lab, of the
department and of its teams, as well as the relevant contacts, can be found at
http://www.gipsa-lab.inpg.fr. Please contact the department management for any further
information.
 
   Gerard BAILLY, deputy director of GIPSA-Lab
 
Back to Top

6-22 . POSTDOCTORAL FELLOWSHIP OPENING AT ICSI Berkeley

POSTDOCTORAL FELLOWSHIP OPENING AT ICSI

The International Computer Science Institute (ICSI) invites applications for a Postdoctoral Fellow position in spoken language
processing. The Fellow will be working with Dilek Hakkani-Tur, along with PhD students and international colleagues, in the area of information distillation. Some experience with machine learning for text categorization is required, along with strong capabilities in speech and language processing in general.

ICSI is an independent not-for-profit Institute located a few blocks from the Berkeley campus of the University of California. It is
closely affiliated with the University, and particularly with the Electrical Engineering and Computer Science (EECS) Department. See
http://www.icsi.berkeley.edu to learn more about ICSI.

The ICSI Speech Group has been a source of novel approaches to speech and language processing since 1988. It is primarily known for its work in speech recognition, although it has housed major projects in speaker recognition,
metadata extraction, and language understanding in the last few years. The effort in information distillation will draw upon lessons learned in our previous work for language understanding.

Applications should include a cover letter, vita, and the names of at least 3 references (with both postal and email addresses). Applications should be sent by email to dilek@icsi.berkeley.edu

ICSI is an Affirmative Action/Equal Opportunity Employer. Applications from women and minorities are especially encouraged. Hiring is contingent on eligibility to work in the United States.

Back to Top

6-23 . PhD positions at GIPSA (formerly ICP) Grenoble France

Laboratory: GIPSA-lab, Speech & Cognition Dept.
Address: ENSIEG, Domaine Universitaire - BP46, 38402 Saint Martin d'Hères
Thesis supervisor: Pierre Badin
e-mail address: Pierre.Badin@gipsa-lab.inpg.fr
Co-supervisor(s): Gérard Bailly
Title: Control of talking heads by multimodal inversion – Application to language learning
and rehabilitation
Context and problem:
Speech production necessitates fairly precise control of the various orofacial articulators (jaw, lips, tongue, velum, cheeks, etc.). Regulating these gestures implies that fairly precise feedback about his/her vocal production is available to the speaker. Auditory feedback is essential, and its degradation can cause degradation, if not total loss, of speech production capabilities. In fact, the perception of the acoustic consequences of articulatory gestures can be degraded in different ways: either peripherally, through the degradation, if not the complete loss, of this feedback (deaf and hearing-impaired people, implanted or not), or more centrally, through the loss of sensitivity to phonological contrasts due to phonological deafness (contrasts not exploited in the mother language: e.g. Japanese speakers have extreme difficulty producing the /l/ vs. /r/ contrast, which is not exploited in their mother language).
The aim of this doctoral work is to explore speakers' abilities to exploit a virtual multisensory feedback that complements, if not substitutes for, the failing auditory feedback. The virtual feedback that will be designed and studied in this framework will be provided by a talking head (in 2D or 3D) that reproduces, in an augmented reality mode – in real time or offline – the articulation of a sound for which only the acoustic and/or visual signal is available.
The thesis challenge is to design and assess a robust system that can estimate the articulation from its sensory consequences, and in particular that deals with the normalisation problem (establishing the correspondence between the audiovisual spaces of the talking head and of the speaker), and then to quantify the benefit that a hearing-impaired person or a second-language learner can gain from a restored sensory-motor feedback loop.
 
---------------------------------------------------------------------------------------------------------------------------------
Multimodality for face-to-face interaction between an embodied conversational agent and a human
partner: experimental software platform
Thesis financed by a research grant from the Rhône-Alpes region - 1750€ gross/month
Selected in 2008 by the research Cluster ISLE (http://www.grenoble-universites.fr/isle)
The research work aims at developing multimodal systems enabling an
embodied conversational agent and a human partner to engage in a
situated face-to-face conversation, notably involving objects of the
environment. These interactive multimodal systems involve numerous
software sensors and actuators, such as recognizing/synthesizing speech,
facial expressions, gaze or gestures of the interlocutors. The environment
in which this interaction occurs should also be analyzed, so as to
maintain or attract attention towards objects of interest in the dialog.
Perception-action loops of these multimodal systems should finally take into account the mutual
conditioning of the cognitive states of the interlocutors as well as the psychophysical, linguistic and social
dimensions of these multimodal turns.
In this context, and given the complexity of the signal and information processing to
implement, the objective of this work is first to design and implement a wizard-of-Oz
software platform for exploring the design space, by having a human accomplice simulate
parts of this interactive system while other parts are handled by automatic behavior.
The first objective of the work is to study the impact of this simulated versus
automatic behavior on the interactions, in terms of cognitive load, subject satisfaction
or task performance. The final objective is of course to progressively replace human
intelligence and comprehension of the scene with an autonomous context-sensitive and
context-aware interactive system. The software platform should guarantee real-time
processing of perceived and generated multimodal events and should provide the
wizard-of-Oz operator with adequate and intuitive tools for controlling the simulated
part of the system's behavior.
This thesis will be conducted in the framework of the OpenInterface european project (FP6-IST-35182 on
multimodal interaction) and the ANR project Amorces (human-robot collaboration for manipulating
objects).
Expected results
Experimental:
• Prototype of the Wizard-of-Oz platform
• Recordings of multimodal conversations between an embodied conversational agent and a human
partner using the prototype
Theoretical:
• Taxonomy of Wizard-of-Oz platforms
• Design of real-time Wizard-of-Oz platforms
• Highly modular software model of multimodal systems
• Multi-layered model of face-to-face conversation
Keywords
Interaction model, multimodality, multimodal dialog, interaction engineering, software architecture,
Wizard-of-Oz platform
Thesis proposed by
Gérard BAILLY, GIPSA-Lab, MPACIF team Gerard.Bailly@gipsa-lab.inpg.fr
Laurence NIGAY, LIG, IIHM team Laurence.Nigay@imag.fr
Doctoral program: EEATS GRENOBLE – FRANCE http://www.edeeats.inpg.fr/
Back to Top

6-24 . PhD in speech signal processing at Infineon Sophia Antipolis

Open position: PhD in speech signal processing

 

Title: Solutions for non-linear acoustic echo.

 

Background:

Acoustic echo is an annoying disturbance due to the sound feedback between the loudspeaker and the microphone of a terminal. Acoustic echo cancellers and residual echo suppression are widely used to reduce the echo signal. The performance of existing echo reduction systems strongly relies on the assumption that the echo path between the transducers is linear. However, today's competitive consumer audio market may favour sacrificing linear performance for the integration of low-cost analogue components. The assumption of linearity no longer holds, due to the nonlinear distortions introduced by the loudspeakers and by the small housing in which the transducers are placed.

 

Task:

The PhD thesis will involve research on non-linear systems applied to acoustic echo reduction. The foreseen tasks deal first with proper modelling of mobile phone transducers exhibiting non-linearity, to get a better understanding of the environment in which echo reduction operates. Using this model as a basis, a study of the performance of linear systems will give a good understanding of the problems caused by non-linearity. In a further step, the PhD student will develop and test non-linear algorithms for echo cancellation in a non-linear environment.
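As a point of reference for the linear case whose limits the thesis will study, the classical normalized LMS (NLMS) adaptive echo canceller fits in a few lines; the filter length, step size and synthetic echo path in this Python/NumPy sketch are arbitrary illustrative choices.

  import numpy as np

  def nlms_echo_canceller(far_end, mic, n_taps=128, mu=0.5, eps=1e-6):
      # Adaptive linear canceller: estimate the echo path as an FIR filter w,
      # subtract the predicted echo from the microphone signal, return the
      # residual. Non-linear loudspeaker distortion violates the linear-path
      # assumption this update relies on.
      w = np.zeros(n_taps)
      x_buf = np.zeros(n_taps)
      out = np.zeros(len(mic))
      for n in range(len(mic)):
          x_buf = np.roll(x_buf, 1)
          x_buf[0] = far_end[n]
          e = mic[n] - w @ x_buf                        # residual error
          w += mu * e * x_buf / (x_buf @ x_buf + eps)   # normalized update
          out[n] = e
      return out

  # Demo: synthetic linear echo path plus a little near-end noise.
  rng = np.random.default_rng(4)
  x = rng.normal(size=8000)                             # far-end signal
  h = rng.normal(size=64) * np.exp(-np.arange(64) / 10.0)
  mic = np.convolve(x, h)[:8000] + 0.01 * rng.normal(size=8000)
  res = nlms_echo_canceller(x, mic)
  print("echo power before/after:", np.var(mic), np.var(res[2000:]))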

About the Company:

The Sophia-Antipolis site is one of the main Infineon Technologies research and development centers worldwide. Located in the high-tech valley of Sophia-Antipolis, near Nice in the south of France, it hosts a team of 140 experienced research and development engineers specialized in Mobile Solutions, Embedded SRAM, and Design-Flow Software. The PhD will take place within the Mobile Solutions group, which is responsible for specifying and designing baseband integrated circuits for cellular phones. The team is specialized in innovative algorithm development, especially in audio, system specification and validation, circuit design and embedded software. Its work makes a significant contribution to the Infineon Technologies wireless chipset portfolio.

Required skills:

- Master's degree

- Strong background in signal processing.

- Background in speech signal or non-linear system processing is a plus.

- Programming: Matlab, C.

- Knowledge of C fixed-point / DSP implementation is a plus.

- Language: English

Length of the PhD: 3 years

Place: Infineon Technologies France, Sophia-Antipolis

Contact:

Christophe Beaugeant

Phone: +33 (0)4 92 38 36 30

E-mail: christophe.beaugeant@infineon.com

Back to Top

6-25 . PhD position at Institut Eurecom Sophia Antipolis France

Institut Eurécom, Sophia Antipolis, France
Doctoral Position
 
Title: Speaker Diarisation for Internet-based Content Processing
 
Department: Multimedia Communications
URL: http://www.eurecom.fr/research/
Start Date: Immediate vacancy
Duration: 3 years
 
Description: Also known as the “who spoke when?” task, speaker diarization aims to detect the 
number of speakers within an audio document and to identify when each speaker is active. Speaker
diarization is an important problem with applications in speaker indexing, document retrieval,
rich transcription, speech and speaker recognition/biometrics and video conferencing, among
others. Research to date has focused on narrow application domains, namely telephone
speech, broadcast news and meeting recordings. In line with recent shifts in the field, this
research project will explore exciting new applications of speaker diarization in the area of
Internet-based content processing, especially user-generated content. The diversity of such
content presents a number of new challenges. Some areas in which the candidate will be
expected to work involve speech enhancement / noise compensation, beam-forming, speech
activity detection, channel compensation and statistical speaker modelling. The successful
candidate will have the opportunity for international travel and to become involved in
national and European projects and internationally competitive speaker diarization trials.
This position offers a unique opportunity to develop broad knowledge in cutting edge speech
and audio processing research.
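By way of illustration (not necessarily the approach this project will adopt), a common building block of diarization systems is speaker-change detection with the Bayesian Information Criterion (BIC): a window of features modelled by one Gaussian is compared against its two halves modelled by separate Gaussians. A minimal Python/NumPy sketch with toy data:

  import numpy as np

  def bic_change_score(X, Y, lam=1.0):
      # Delta-BIC for "X and Y come from two different Gaussians" versus
      # "one Gaussian"; a positive score suggests a speaker change.
      Z = np.vstack([X, Y])
      n, d = Z.shape
      logdet = lambda M: np.linalg.slogdet(M + 1e-6 * np.eye(d))[1]
      score = 0.5 * (n * logdet(np.cov(Z.T))
                     - len(X) * logdet(np.cov(X.T))
                     - len(Y) * logdet(np.cov(Y.T)))
      penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
      return score - penalty

  # Demo with two synthetic "speakers" of different feature statistics.
  rng = np.random.default_rng(5)
  spk1 = rng.normal(0.0, 1.0, size=(300, 12))   # 12-dim MFCC-like frames
  spk2 = rng.normal(0.8, 1.5, size=(300, 12))
  print(bic_change_score(spk1[:150], spk1[150:]))   # same speaker: negative
  print(bic_change_score(spk1, spk2))               # change: large positive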
 
Requirements: The successful candidate will have a Master’s degree in engineering, mathematics,
computing, physics or a related relevant discipline. You will have strong mathematical,
programming and communication skills and be highly motivated to undertake challenging
research. Good English language speaking and writing skills are essential.
 
Applications: Please send to the address below (i) a one page statement of research interests and
motivation, (ii) your CV and (iii) three letters of reference (2 academic, 1 personal).
Contact: Nicholas Evans
Postal Address: 2229 Route des Crêtes BP 193, F-06904 Sophia Antipolis cedex, France
Email address: nicholas.evans@eurecom.fr
Web address: http://www.eurecom.fr/main/institute/job.en.htm
Phone: +33/0 4 93 00 81 14
Fax: +33/0 4 93 00 82 00
 
Institut Eurécom is located in Sophia Antipolis, a vibrant science park on the French Riviera. It 
is in close proximity to a large number of research units of leading multi-national corporations 
in the telecommunications, semiconductor and biotechnology sectors, as well as other outstanding 
research and teaching institutions. A freethinking, multinational population and the unique 
geographic location provide a quality of life without equal.
 
Institut Eurécom, 2229 Route des Crêtes BP 193, F-06904 Sophia Antipolis cedex, France
www.eurecom.fr
 
Back to Top

6-26 . Two PhD positions at the University of Karlsruhe, Germany

At the Institut für Theoretische Informatik, Lehrstuhl Prof. Waibel, Universität Karlsruhe (TH), a

Ph.D. position

in the field of

 

Software System Integration of Automatic Speech Recognition and Machine Translation for Speech based Multimedia Indexing

 

has to be filled immediately with a salary according to TV-L, E13.

 

The responsibilities include the integration, fusion and development of core technologies in the areas of automatic speech recognition and simultaneous machine translation, in the context of speech-based indexing of multimedia documents, within application-targeted research projects in the area of multimodal human-machine interaction. Set in a framework of internationally and industry funded research programs, the successful candidate is expected to contribute to showcases for the state of the art of modern recognition and translation systems.

 

We are an internationally renowned research group with an excellent infrastructure. Examples of our projects for improving Human-Machine and Human-to-Human interaction are: JANUS - one of the first speech translation systems proposed, simultaneous translation of lectures, portable speech translators, meeting browser and lecture tracker.

 

Within the framework of the International Center for Advanced Communication Technology (interACT), our institute operates in two locations, Universität Karlsruhe (TH), Germany and at Carnegie Mellon University, Pittsburgh, USA.  International joint and collaborative research at and between our centers is common and encouraged, and offers great international exposure and activity. 

 

Applicants are expected to have:

  • an excellent university degree (M.S., Diploma or Ph.D.) in Computer Science, Electrical Engineering, Mathematics, or related fields
  • excellent programming skills 
  • advanced knowledge in at least one of the fields of Machine Learning, Pattern Recognition, Statistics, or System Integration

 

For candidates with Bachelor or Master’s degrees, the position offers the opportunity to work toward a Ph.D. degree.

 

In line with the university's policy of equal opportunities, applications from qualified women are particularly encouraged. Handicapped applicants will be preferred in case of the same qualification.

 

Questions may be directed to: Sebastian Stüker, Tel. +49 721 608 6284, E-Mail: stueker@ira.uka.de,  http://isl.ira.uka.de

 

The application should be sent to Professor Waibel, Institut für Theoretische Informatik, Universität Karlsruhe (TH), Adenauerring 4, 76131 Karlsruhe, Germany

 

----------------------------------------------------------------------------------------------------------------------------------

 

 

 

At the Institut für Theoretische Informatik, Lehrstuhl Prof. Waibel, Universität Karlsruhe (TH), a

 

 

Ph.D. position

in the field of

Multimodal Dialog Systems

 

is to be filled immediately with a salary according to TV-L, E13.

 

The responsibilities include basic research in the area of multimodal dialog systems, especially multimodal human-robot interaction and learning robots, within application-targeted research projects in the area of multimodal human-machine interaction. Set in a framework of internationally and industry funded research programs, the successful candidate(s) are expected to contribute to the state of the art of modern spoken dialog systems, improving natural interaction with robots.

 

We are an internationally renowned research group with an excellent infrastructure. Current research projects for improving Human-Machine and Human-to-Human interaction focus on dialog management for Human-Robot interaction.

 

Within the framework of the International Center for Advanced Communication Technology (interACT), our institute operates in two locations, Universität Karlsruhe (TH), Germany and at Carnegie Mellon University, Pittsburgh, USA.  International joint and collaborative research at and between our centers is common and encouraged, and offers great international exposure and activity. 

 

Applicants are expected to have:

  • an excellent university degree (M.S., Diploma or Ph.D.) in Computer Science, Computational Linguistics, or related fields
  • excellent programming skills 
  • advanced knowledge in at least one of the fields of Speech and Language Processing, Pattern Recognition, or Machine Learning

 

For candidates with Bachelor or Master’s degrees, the position offers the opportunity to work toward a Ph.D. degree.

 

In line with the university's policy of equal opportunities, applications from qualified women are particularly encouraged. Handicapped applicants will be preferred in case of the same qualification.

 

Questions may be directed to: Hartwig Holzapfel, Tel. +49 721 608 4057, E-Mail: hartwig@ira.uka.de,  http://isl.ira.uka.de

 

The application should be sent to Professor Waibel, Institut für Theoretische Informatik, Universität Karlsruhe (TH), Adenauerring 4, 76131 Karlsruhe, Germany

Back to Top

6-27 . Job opening at TFH Berlin University of Applied Sciences, Department of Computer Sciences and Media, Germany

Job opening at TFH Berlin University of Applied Sciences, Department of Computer Sciences and Media, Germany: Post-graduate position (part-time) for a computer scientist or engineer with a background in ASR and/or TTS in a three-year project in Computer-Aided Language Learning funded by the German Ministry of Education and Research. Start: 1 July 2008. The task will be the development and evaluation of a software system for teaching Mandarin pronunciation to Germans, as well as administrative duties with the funding body. Knowledge of E-Learning applications, German and/or Mandarin is welcome; good English skills are mandatory. Candidates will have the opportunity to pursue a PhD degree and should preferably be EU citizens. The position is paid according to BAT 2a/2 (German pay scale for federal employees), about €28,000/year depending on age and marital status. Please direct further enquiries to Prof. Dr. Hansjörg Mixdorff at mixdorff@tfh-berlin.de.

Back to Top

6-28 . Doctoral research grant offer - 2008 academic year (GIPSA-Lab, Grenoble, France)

Doctoral Research Grant Offer – 2008 Academic Year

The sensorimotor maps of speech: neuroanatomical correlates of the perception and production systems for the vowels and consonants of French.

Marc Sato, CNRS Research Scientist (Chargé de Recherche)

Jean-Luc Schwartz, CNRS Research Director (Directeur de Recherche)

GIPSA-Lab, UMR CNRS 5216, "Parole et Cognition" (Speech and Cognition) Department, "Perception, Multimodality, Development" team, Grenoble, France (http://gipsa-lab.inpg.fr).

For the start of the 2008 academic year, we are offering a doctoral research grant within the doctoral school "Engineering for Health, Cognition and Environment" (EDISCE – ED216, http://www-sante.ujf-grenoble.fr/edisce/), accredited by the Pierre Mendès France and Joseph Fourier universities and the Institut National Polytechnique de Grenoble.

Within the theoretical framework of a possible functional coupling between the speech perception and production systems, this research project aims to test the existence of specific functional dynamic connectivity between sensory and motor regions during the perception and production of French vowels and consonants. To this end, a series of functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) experiments should provide a precise spatial and temporal description of the brain activity involved in the production and perception of French phonemes, as well as of the dynamic connectivity between these regions. The experiments should allow a deeper understanding of the processes by which verbal representations are analyzed and constructed, by demonstrating the co-structuring and interdependence of sensory and motor regions.

 

In addition to an in-depth review of the multidisciplinary literature in phonetics and phonology, neuropsycholinguistics and cognitive neuroscience, the project will include the design of fMRI and EEG experimental protocols, the running of subjects and data collection, and finally the statistical analysis of the data and their interpretation.

 

The candidate will preferably hold a research master's degree (M2R) in neuroscience, cognitive science, cognitive psychology or neuropsychology. The candidate should be familiar with standard behavioral experimentation as well as with the statistical tests and analyses used in cognitive psychology. Proficiency in English and prior experience with fMRI and/or EEG techniques are desirable.

 

This thesis work will take place within a Grenoble research project on the implementation of new techniques for the analysis and nonlinear modelling of measures of brain connectivity. It will also rely on national and international collaborations, notably with the Centre for Research on Language, Mind and Brain at McGill University (Canada). Funding is for a period of three years, starting in October 2008.

 

Contact for this announcement:

 

Marc Sato

GIPSA-Lab, UMR CNRS 5216

Université Stendhal

BP 25 - 38040 Grenoble cedex 9

Tel: (+33) (0)476 827 784

Fax: (+33) (0)476 824 335

E-mail: marc.sato [at] gipsa-lab.inpg.fr

 

The deadline for sending a detailed CV is 18 June. Please also send, as soon as possible, your final M2R grades and ranking, and optionally a letter of recommendation.

 

Back to Top

6-29 . Theses at the MITT doctoral school, Universite Paul Sabatier Toulouse III (translated from French)


 

PhD thesis: Automatic characterization and identification of dialects (MITT doctoral school, Université Paul Sabatier Toulouse III)

Deadline: 10 June 2008

Contacts: jerome.farinas@irit.fr

http://www.irit.fr/-Equipe-SAMoVA-

 

DESCRIPTION:

Research in automatic speech processing is increasingly concerned with processing large data collections of spontaneous, conversational speech. Performance depends on all the sources of variability in speech. One of these sources is the speaker's dialect, which induces variability in phonetic pronunciation as well as in word morphology and prosody. We propose a research topic on the automatic dialectal characterization of speakers, aimed at guiding the adaptation of speech recognition systems: selecting acoustic and prosodic models suited to the speaker's dialect should improve performance under speaker-independent recognition conditions. Such a system should build on recent advances in language identification, both in acoustic modelling through the exploration of phone lattices and in fine-grained modelling of micro- and macro-prosody. The databases available within the project on the phonology of contemporary French (PFC, http://www.projet-pfc.net/) provide a wide range of data on pronunciation variation. The final system will be evaluated in the international language recognition campaigns organized by NIST, which now take dialectal variation into account (Mandarin, English, Spanish and Hindi): http://www.nist.gov/speech/tests/lre/.
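To make the phone-lattice idea above concrete, here is a minimal, purely illustrative sketch of phonotactic dialect identification in the PRLM spirit (phone recognition followed by language modelling): a phone recognizer turns each utterance into a phone string, and one smoothed bigram model per dialect scores it. The dialect labels and phone strings below are invented, and this is not code from the thesis project.

  # Toy phonotactic dialect identification (PRLM-style): one smoothed
  # phone-bigram model per dialect; utterances are classified by the
  # log-likelihood of their phone string under each model.
  import math
  from collections import defaultdict

  def train_bigrams(phone_strings):
      counts = defaultdict(lambda: defaultdict(float))
      for phones in phone_strings:
          seq = ["<s>"] + phones + ["</s>"]
          for a, b in zip(seq, seq[1:]):
              counts[a][b] += 1.0
      return counts

  def log_likelihood(counts, phones, alpha=0.5, vocab=50):
      # add-alpha smoothing so unseen bigrams keep non-zero probability
      seq = ["<s>"] + phones + ["</s>"]
      lp = 0.0
      for a, b in zip(seq, seq[1:]):
          total = sum(counts[a].values())
          lp += math.log((counts[a][b] + alpha) / (total + alpha * vocab))
      return lp

  def classify(models, phones):
      return max(models, key=lambda d: log_likelihood(models[d], phones))

  # Invented phone transcriptions for two hypothetical dialects:
  train = {"dialect_A": [["b", "o~", "Z", "u", "R"], ["p", "E", "R"]],
           "dialect_B": [["b", "O", "n", "Z", "u", "x"], ["p", "e", "R"]]}
  models = {d: train_bigrams(ss) for d, ss in train.items()}
  print(classify(models, ["b", "o~", "Z", "u", "R"]))  # -> dialect_A

In a real system the phone strings would come from a phone recognizer or lattice, and n-gram order, smoothing and the prosodic modelling mentioned above would matter far more than in this toy.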


REQUIRED KNOWLEDGE AND SKILLS

 

 

    • skills in computer science (in particular automatic speech processing)

    • skills in linguistics (phonology, prosody)

CONTACT

Funding will be awarded to the best-ranked PhD candidates of the doctoral school, so please contact us before 10 June in order to be included in this ranking.

Back to Top

6-30 . Cambridge University Research Position in Speech processing

Cambridge University: Research Position in Speech Synthesis and Recognition / Machine Translation


A position exists for a Research Associate to work on the EMIME ("Efficient multilingual interaction in mobile environment") project. This project is funded by the European Commission within the FP7 programme. The project aims to develop a mobile device that performs personalized speech-to-speech translation such that a user's spoken input in one language is used to produce spoken output in another language, while continuing to sound like the user's voice. We will build on recent developments in speech synthesis using hidden Markov models, which is the same technology used for automatic speech recognition. Using a common statistical modelling framework for automatic speech recognition and speech synthesis will enable the use of common techniques for adaptation and multilinguality. The project objectives are to

1. Personalise speech processing systems by learning individual characteristics of a user's speech and reproducing them in synthesised speech.
2. Introduce a cross-lingual capability such that personal characteristics can be reproduced in a second language not spoken by the user.
3. Develop and better understand the mathematical and theoretical relationship between speech recognition and synthesis.
4. Eliminate the need for human intervention in the process of cross-lingual personalisation.
5. Evaluate our research against state-of-the-art techniques and in a practical mobile application.
See the EMIME website for more information: http://www.emime.org/
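As a rough illustration of this shared statistical framework (our sketch, not a formulation taken from the project documents): with a single set of HMM acoustic models $\lambda$, recognition and synthesis are two directions of inference through the same model,

\[ \hat{W} = \arg\max_{W} \; p(O \mid W, \lambda)\, P(W) \qquad \text{(recognition: words from observed speech } O\text{)} \]

\[ \hat{O} = \arg\max_{O} \; p(O \mid W, \lambda) \qquad \text{(HMM synthesis: generate parameters } O \text{ for given words } W\text{)} \]

so a speaker adaptation transform estimated from a user's speech (for example an MLLR-style linear transform of the Gaussian means of $\lambda$) can, in principle, also be applied when generating speech, which is the intuition behind objectives 1 and 2 above.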

This is an opportunity to work in a research group with a world-leading reputation in speech recognition and statistical machine translation research. There are excellent opportunities for publications, travel and conference visits. The group has outstanding research facilities. For suitably qualified candidates there may also be the chance to contribute to the MPhil in Computer Speech, Text and Internet Technology (http://mi.eng.cam.ac.uk/cstit/).

The successful candidate must have a very good first degree in a relevant discipline and preferably have a higher degree as well as experience in acoustic modeling for speech synthesis and/or recognition. Expertise in one or more of the following technical areas is also a distinct advantage:
- speech recognition with the HTK toolkit (http://htk.eng.cam.ac.uk)
- speech synthesis with the HTS HMM-based Speech Synthesis System (http://hts.sp.nitech.ac.jp)
- weighted finite state transducers for speech and language processing
The project focus is acoustic modeling but experience in statistical machine translation is also an advantage.

The cover sheet for applications (form PD18) is available from http://www.admin.cam.ac.uk/offices/personnel/forms/pd18/. Parts I and III only should be sent, with a letter and CV, to Dr Bill Byrne, Department of Engineering, Trumpington Street, Cambridge, CB2 1PZ (Fax +44 (0)1223 332662, email wjb31@cam.ac.uk).
Quote Reference: NA03547, Closing Date: 30 June 2008

The University values diversity and is committed to equality of opportunity.

Back to Top

6-31 . Head of NLP at Voxid UK

Head of NLP :

 

We are now looking for a very experienced Computational Linguist to lead our efforts in the natural language processing area. This is a hugely challenging, but also a very rewarding role; the opportunities for applying linguistic techniques are virtually limitless and even small improvements in the algorithms for detecting and correcting potential conversion errors translate into serious cost savings for the company. This is a senior position, leading a team and having the autonomy to build a strategic way forward for this department.

 

Experience Needed:

 - Grammars and parsing for spontaneous speech
 - Statistical methods
 - At least basic programming ability (shell scripts, Perl, awk)
 - Spell-checkers, grammar checkers, auto-correcting tools and predictive typing
 - Experience with Automatic Speech Recognition technology
 - Probabilistic language modelling
 - Phonetics
 - Multi-lingual experience

Contact: info@voxid.co.uk

Back to Top

7 . Journals

7-1 . Papers accepted for FUTURE PUBLICATION in Speech Communication

Full text available on http://www.sciencedirect.com/ for Speech Communication subscribers and subscribing institutions. Titles and abstracts of all volumes, including forthcoming papers, are freely accessible to all: click on "Articles in press" and then "Selected papers".

Back to Top

7-2 . Journal of Multimedia User Interfaces

The development of Multimodal User Interfaces relies on systemic research involving signal processing, pattern analysis, machine intelligence and human-computer interaction. This journal is a response to the need for a common forum bringing these research communities together. Topics of interest include, but are not restricted to:

  • Fusion & Fission,
  • Plasticity of Multimodal interfaces,
  • Medical applications,
  • Edutainment applications,
  • New modalities and modalities conversion,
  • Usability,
  • Multimodality for biometry and security,
  • Multimodal conversational systems.

The journal is open to three types of contributions:

  • Articles: containing original contributions accessible to the whole research community of Multimodal Interfaces. Contributions containing verifiable results and/or open-source demonstrators are strongly encouraged.
  • Tutorials: disseminating established results across disciplines related to multimodal user interfaces.
  • Letters: presenting practical achievements / prototypes and new technology components.

JMUI is a Springer-Verlag publication from 2008.

The submission procedure and the publication schedule are described at:

www.jmui.org

The page of the journal at springer is:

http://www.springer.com/east/home?SGWID=5-102-70-173760003-0&changeHeader=true

More information:

Imre Váradi (varadi@tele.ucl.ac.be)

 

Back to Top

7-3 . CURRENT RESEARCH IN PHONOLOGY AND PHONETICS: INTERFACES WITH NATURAL LANGUAGE PROCESSING

A SPECIAL ISSUE OF THE JOURNAL TAL (Traitement Automatique des Langues)

Guest Editors: Bernard Laks and Noël Nguyen


There are long-established connections between research on the sound shape of language and natural language processing (NLP), for which one of the main driving forces has been the design of automatic speech synthesis and recognition systems. Over the last few years, these connections have been made yet stronger, under the influence of several factors. A first line of convergence relates to the shared collection and exploitation of the considerable resources that are now available to us in the domain of spoken language. These resources have come to play a major role both for phonologists and phoneticians, who endeavor to subject their theoretical hypotheses to empirical tests using large speech corpora, and for NLP specialists, whose interest in spoken language is increasing. While these resources were first based on audio recordings of read speech, they have been progressively extended to bi- or multimodal data and to spontaneous speech in conversational interaction. Such changes are raising theoretical and methodological issues that both phonologists/phoneticians and NLP specialists have begun to address.

Research on spoken language has thus led to the generalized utilization of a large set of tools and methods for automatic data processing and analysis: grapheme-to-phoneme converters, text-to-speech aligners, automatic segmentation of the speech signal into units of various sizes (from acoustic events to conversational turns), morpho-syntactic tagging, etc. Large-scale corpus studies in phonology and phonetics make an ever increasing use of tools that were originally developed by NLP researchers, and which range from electronic dictionaries to full-fledged automatic speech recognition systems. NLP researchers and phonologists/phoneticians also have jointly contributed to developing multi-level speech annotation systems from articulatory/acoustic events to the pragmatic level via prosody and syntax.
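To give a concrete flavour of the first tool in this list, a grapheme-to-phoneme converter maps spelling to phone strings. The sketch below is a deliberately naive, hypothetical rule set of our own (real converters use much richer context-dependent rules or statistical models):

  # Toy rule-based grapheme-to-phoneme converter with a handful of
  # invented French-flavoured rules; the first matching rule applies.
  RULES = [("eau", "o"), ("ou", "u"), ("ch", "S"), ("qu", "k")]

  def g2p(word):
      phones, i = [], 0
      while i < len(word):
          for graph, phone in RULES:
              if word.startswith(graph, i):
                  phones.append(phone)
                  i += len(graph)
                  break
          else:
              phones.append(word[i])  # default: letter maps to itself
              i += 1
      return phones

  print(g2p("chateau"))  # ['S', 'a', 't', 'o']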

In this scientific context, which very much fosters the establishment of cross-disciplinary bridges around spoken language, the knowledge and resources accumulated by phonologists and phoneticians are now being put to use by NLP researchers, whether this is to build up lexical databases from speech corpora, to develop automatic speech recognition systems able to deal with regional variations in the sound pattern of a language, or to design talking-face synthesis systems in man-machine communication.

LIST OF TOPICS

The goal of this special issue will be to offer an overview of the interfaces that are being developed between phonology, phonetics, and NLP. Contributions are therefore invited on the following topics:

. Joint contributions of speech databases to NLP and phonology/phonetics

. Automatic procedures for the large-scale processing of multi-modal databases

. Multi-level annotation systems

. Research in phonology/phonetics and speech and language technologies: synthesis, automatic recognition

. Text-to-speech systems

. NLP and modelisation in phonology/phonetics

Papers may be submitted in English (for non native speakers of French only) or French and will relate to studies conducted on French, English, or other languages. They must conform to the TAL guidelines for authors available at http://www.atala.org/rubrique.php3?id_rubrique=1.

DEADLINES

. 11 February 2008: Reception of contributions
. 11 April 2008: Notification of pre-selection / rejection
. 11 May 2008: Reception of pre-selected articles
. 16 June 2008: Notification of final acceptance
. 30 June 2008: Reception of accepted articles' final versions

This special issue of Traitement Automatique des Langues will appear in autumn 2008.

THE JOURNAL

TAL (Traitement Automatique des Langues / Natural Language Processing, http://www.atala.org/rubrique.php3?id_rubrique=1) is a forty-year-old international journal published by ATALA (French Association for Natural Language Processing) with the support of CNRS (French National Center for Scientific Research). It has moved to an electronic mode of publication, with printing on demand. This in no way affects its reviewing and selection process.

SCIENTIFIC COMMITTEE

. Martine Adda-Decker, LIMSI, Orsay
. Roxane Bertrand, LPL, CNRS & Université de Provence
. Philippe Blache, LPL, CNRS & Université de Provence
. Cédric Gendrot, LPP, CNRS & Université de Paris III
. John Goldsmith, University of Chicago
. Guillaume Gravier, Irisa, CNRS/INRIA & Université de Rennes I
. Jonathan Harrington, IPS, University of Munich
. Bernard Laks, MoDyCo, CNRS & Université de Paris X
. Lori Lamel, LIMSI, Orsay
. Noël Nguyen, LPL, CNRS & Université de Provence
. François Pellegrino, DDL, CNRS & Université de Lyon II
. François Poiré, University of Western Ontario
. Yvan Rose, Memorial University of Newfoundland
. Tobias Scheer, BCL, CNRS & Université de Nice
. Atanas Tchobanov, MoDyCo, CNRS & Université de Paris X
. Jacqueline Vaissière, LPP, CNRS & Université de Paris III
. Nathalie Vallée, DPC-GIPSA, CNRS & Université de Grenoble III

Back to Top

7-4 . IEEE Signal Processing Magazine: Special Issue on Digital Forensics

Guest Editors:
Edward Delp, Purdue University (ace@ecn.purdue.edu)
Nasir Memon, Polytechnic University (memon@poly.edu)
Min Wu, University of Maryland, (minwu@eng.umd.edu)

We find ourselves today in a "digital world" where most information
is created, captured, transmitted, stored, and processed in digital 
form. Although representing information in digital form has many 
compelling technical and economic advantages, it has led to new 
issues and significant challenges when performing forensics analysis 
of digital evidence.  There has been a slowly growing body of 
scientific techniques for recovering evidence from digital data.  
These techniques have come to be loosely coupled under the umbrella 
of "Digital Forensics." Digital Forensics can be defined as "The 
collection of scientific techniques for the preservation, collection, 
validation, identification, analysis, interpretation, documentation 
and presentation of digital evidence derived from digital sources for 
the purpose of facilitating or furthering the reconstruction of 
events, usually of a criminal nature."

This call for papers invites tutorial articles covering all aspects 
of digital forensics with an emphasis on forensic methodologies and 
techniques that employ signal processing and information theoretic 
analysis. Thus, focused tutorial and survey contributions are 
solicited from topics, including but not limited to, the following:

 . Computer Forensics - File system and memory analysis. File carving.
 . Media source identification - camera, printer, scanner, microphone
identification.
 . Differentiating synthetic and sensor media, for example camera vs.
computer graphics images.
 . Detecting and localizing media tampering and processing.
 . Voiceprint analysis and speaker identification for forensics.
 . Speech transcription for forensics. Analysis of deceptive speech.
 . Acoustic processing for forensic analysis - e.g. acoustical gunshot
analysis, accident reconstruction.
 . Forensic musicology and copyright infringement detection.
 . Enhancement and recognition techniques from surveillance video/images.
Image matching techniques for auto-matic visual evidence
extraction/recognition.
 . Steganalysis - Detection of hidden data in images, audio, video. 
Steganalysis techniques for natural language steganography. Detection of covert
channels.
 . Data Mining techniques for large scale forensics.
 . Privacy and social issues related to forensics.
 . Anti-forensics. Robustness of media forensics methods against counter
measures.
 . Case studies and trend reports.

White paper submission: Prospective authors should submit white papers to the web-based submission system at http://www.ee.columbia.edu/spm/ according to the timetable given below. White papers, limited to 3 single-column double-spaced pages, should summarize the motivation, the significance of the topic, a brief history, and an outline of the content. In all cases, prospective contributors should make sure to emphasize the signal processing in their submission.

Schedule
 . White Paper Due: April 7, 2008
 . Notification of White paper Review Results: April 30, 2008
 . Full Paper Submission: July 15, 2008
 . Acceptance Notification: October 15, 2008
 . Final Manuscript Due: November 15, 2008
 . Publication Date: March 2009.


Back to Top

7-5 . Special Issue on Integration of Context and Content for Multimedia Management

IEEE Transactions on Multimedia            
 Special Issue on Integration of Context and Content for Multimedia Management
=====================================================================

Guest Editors:

Alan Hanjalic, Delft University of Technology, The Netherlands
Alejandro Jaimes, IDIAP Research Institute, Switzerland
Jiebo Luo, Kodak Research Laboratories, USA
        Qi Tian, University of Texas at San Antonio, USA

---------------------------------------------------
URL: http://www.cs.utsa.edu/~qitian/cfp-TMM-SI.htm
---------------------------------------------------
Important dates:

Manuscript Submission Deadline:       April 1, 2008
        Notification of Acceptance/Rejection: July 1, 2008
        Final Manuscript Due to IEEE:         September 1, 2008
        Expected Publication Date:            January 2009

---------------------
Submission Procedure
---------------------
Submissions should follow the guidelines set out by IEEE Transactions on Multimedia.
Prospective authors should submit high quality, original manuscripts that have not
appeared, nor are under consideration, in any other journals.

-------
Summary
-------
Lower cost hardware and growing communications infrastructure (e.g., Web, cell phones,
etc.) have led to an explosion in the availability of ubiquitous devices to produce,
store, view and exchange multimedia (images, videos, music, text). Almost everyone is
a producer and a consumer of multimedia in a world in which, for the first time,
a tremendous amount of contextual information is being automatically recorded by the
various devices we use (e.g., cell ID for the mobile phone location, GPS integrated in
a digital camera, camera parameters, time information, and identity of the producer).

In recent years, researchers have started making progress in effectively integrating
context and content for multimedia mining and management. Integration of content and
context is crucial to human-human communication and human understanding of multimedia:
without context it is difficult for a human to recognize various objects, and we
become easily confused if the audio-visual signals we perceive are mismatched. For the
same reasons, integration of content and context is likely to enable  (semi)automatic
content analysis and indexing methods to become more powerful in managing multimedia
data. It can help narrow part of the semantic and sensory gap that is difficult or
even impossible to bridge using approaches that do not explicitly consider context for
(semi)automatic content-based analysis and indexing.

The goal of this special issue is to collect cutting-edge research work in integrating
content and context to make multimedia content management more effective. The special
issue will unravel the problems generally underlying these integration efforts,
elaborate on the true potential of contextual information to enrich the content
management tools and algorithms, discuss the dilemma of generic versus narrow-scope
solutions that may result from "too much" contextual information, and provide us
vision and insight from leading experts and practitioners on how to best approach the
integration of context and content. The special issue will also present the state of
the art in context and content-based models, algorithms, and applications for
multimedia management.

-----
Scope
-----

The scope of this special issue is to cover all aspects of context and content for
multimedia management.

Topics of interest include (but are not limited to):
        - Contextual metadata extraction
        - Models for temporal context, spatial context, imaging context (e.g., camera
          metadata), social and cultural context and so on
- Web context for online multimedia annotation, browsing, sharing and reuse
- Context tagging systems, e.g., geotagging, voice annotation
- Context-aware inference algorithms
        - Context-aware multi-modal fusion systems (text, document, image, video,
          metadata, etc.)
- Models for combining contextual and content information
        - Context-aware interfaces
- Context-aware collaboration
- Social networks in multimedia indexing
- Novel methods to support and enhance social interaction, including
          innovative ideas integrating context in social, affective computing, and
          experience capture.
- Applications in security, biometrics, medicine, education, personal
          media management, and the arts, among others
- Context-aware mobile media technology and applications
- Context for browsing and navigating large media collections
- Tools for culture-specific content creation, management, and analysis

------------
Organization
------------
Next to the standard open call for papers, we will also invite a limited number of
papers, which will be written by prominent authors and authorities in the field
covered by this Special Issue. While the papers collected through the open call are
expected to sample the research efforts currently invested within the community on
effectively combining contextual and content information for optimal analysis,
indexing and retrieval of multimedia data, the invited papers will be selected to
highlight the main problems and approaches generally underlying these efforts.

All papers will be reviewed by at least 3 independent reviewers. Invited papers will
be solicited first through white papers to ensure the quality and relevance to the
special issue. The accepted invited papers will be reviewed by the guest editors and
are expected to account for about one fourth of the papers in the special issue.

---------
Contacts
---------
Please address all correspondences regarding this special issue to the Guest Editors
Dr. Alan Hanjalic (A.Hanjalic@ewi.tudelft.nl), Dr. Alejandro Jaimes
(alex.jaimes@idiap.ch), Dr. Jiebo Luo (jiebo.luo@kodak.com), and Dr. Qi Tian
(qitian@cs.utsa.edu).
-------------------------------------------------------------------------------------

Guest Editors:
Alan Hanjalic, Alejandro Jaimes, Jiebo Luo, and Qi Tian


Back to Top

7-6 . CfP Speech Communication: Special Issue On Spoken Language Technology for Education

*CALL FOR PAPERS*

Special Issue of Speech Communication

on

*Spoken Language Technology for Education*



*Guest-editors:*

Maxine Eskenazi, Associate Teaching Professor, Carnegie Mellon University

Abeer Alwan, Professor, University of California at Los Angeles

Helmer Strik, Assistant Professor, University of Nijmegen

 

Language technologies have evolved to the stage where they are reliable
enough, if their strong and weak points are properly dealt with, to be
used for education. The creation of an application for education
presents several challenges: making the language technology sufficiently
reliable (and thus advancing our knowledge in the language
technologies), creating an application that actually enables students to
learn, and engaging the student. Papers in this special issue should
deal with several of these issues. Although language learning is the
primary target of research at present, papers on the use of language
technologies for other education applications are encouraged. The scope
of acceptable topic interests includes but is not limited to:

 

- Use of speech technology for education

- Use of spoken language dialogue for education

- Applications using speech and natural language processing for education

- Intelligent tutoring systems using speech and natural language

- Pedagogical issues in using speech and natural language technologies
for education

- Assessment of tutoring software

- Assessment of student performance

 

*Tentative schedule for paper submissions, review, and revision:*

Deadline for submissions: June 1, 2008.

Deadline for decisions and feedback from reviewers and editors: August
31, 2008.

Deadline for revisions of papers: November 30, 2008.

 

*Submission instructions:*

Authors should consult the "Guide for Authors", available online, at
http://www.elsevier.com/locate/specom for information about the
preparation of their manuscripts. Authors, please submit your paper via
_http://ees.elsevier.com/specom_, choosing *Spoken Language Tech. *as
the Article Type, and  Dr. Gauvain as the handling E-i-C.

Back to Top

7-7 . CfP Special Issue on Processing Morphologically Rich Languages IEEE Trans ASL

Call for Papers for a Special Issue on
                Processing Morphologically Rich Languages 
          IEEE Transactions on Audio, Speech and Language Processing
 
This is a call for papers for a special issue on Processing Morphologically
Rich Languages, to be published in early 2009 in the IEEE Transactions on 
Audio, Speech and Language Processing. 
 
Morphologically-rich languages like Arabic, Turkish, Finnish, Korean, etc.,
present significant challenges for speech processing, natural language 
processing (NLP), as well as speech and text translation. These languages are 
characterized by highly productive morphological processes (inflection, 
agglutination, compounding) that may produce a very large number of word 
forms for a given root form.  Modeling each form as a separate word leads 
to a number of problems for speech and NLP applications, including: 1) large 
vocabulary growth, 2) poor language model (LM) probability estimation, 
3) higher out-of-vocabulary (OOV) rate, 4) inflection gap for machine 
translation:  multiple different forms of  the same underlying baseform 
are often treated as unrelated items, with negative effects on word alignment 
and translation accuracy.  
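As a toy illustration of points 1 and 3 (ours, not the editors'): a word-level vocabulary cannot cover an unseen inflected form, while a small sub-word inventory can. The morph inventory below is hand-picked for the example; "evlerimizden" is a Turkish form meaning roughly "from our houses".

  # Toy illustration: the full form "evlerimizden" is OOV for a
  # word-level vocabulary, but is covered by a few sub-word units.
  morphs = ["ev", "ler", "imiz", "den", "de", "im"]  # hand-picked morphs

  def segment(word, inventory):
      """Greedy longest-match segmentation into known morphs."""
      out, i = [], 0
      while i < len(word):
          match = next((m for m in sorted(inventory, key=len, reverse=True)
                        if word.startswith(m, i)), None)
          if match is None:
              return None  # word cannot be covered by the inventory
          out.append(match)
          i += len(match)
      return out

  word_vocab = {"ev", "evler", "evde"}      # full forms seen in training
  print("evlerimizden" in word_vocab)       # False: OOV as a word
  print(segment("evlerimizden", morphs))    # ['ev', 'ler', 'imiz', 'den']

Language models over such units see far fewer distinct tokens, which directly attacks the vocabulary-growth, estimation and OOV problems listed above; data-driven segmenters (Morfessor-like methods) learn the inventory instead of hand-picking it.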
 
Large-scale speech and language processing systems require advanced modeling 
techniques to address these problems. Morphology also plays an important 
role in addressing specific issues of “under-studied” languages such as data 
sparsity, coverage and robust modeling. We invite papers describing 
previously unpublished work in the following broad areas: using morphology for speech recognition and understanding, speech and text translation, speech synthesis, information extraction and retrieval, as well as summarization. Specific topics of interest include:
- methods addressing data sparseness issue for morphologically rich 
  languages with application to speech recognition, text and speech 
  translation, information extraction and retrieval, speech   
  synthesis, and summarization
- automatic decomposition of complex word forms into smaller units 
- methods for optimizing the selection of units at different levels of 
  processing
- pronunciation modeling for morphologically-rich languages
- language modeling for morphologically-rich languages
- morphologically-rich languages in speech synthesis
- novel probability estimation techniques that avoid data sparseness 
  problems
- creating data resources and annotation tools for morphologically-rich 
  languages
 
Submission procedure:  Prospective authors should prepare manuscripts 
according to the information available at 
http://www.signalprocessingsociety.org/periodicals/journals/taslp-author-information/. 
Note that all rules will apply with regard to submission lengths, 
mandatory overlength page charges, and color charges. Manuscripts should 
be submitted electronically through the online IEEE manuscript submission 
system at http://sps-ieee.manuscriptcentral.com/. When selecting a 
manuscript type, authors must click on "Special Issue of TASLP on 
Processing Morphologically Rich Languages". 
 
Important Dates:
Submission deadline:  August 1, 2008               
Notification of acceptance: December 31, 2008
Final manuscript due:  January 15, 2009    
Tentative publication date: March 2009
 
Editors
Ruhi Sarikaya (IBM T.J. Watson Research Center) sarikaya@us.ibm.com
Katrin Kirchhoff (University of Washington) katrin@ee.washington.edu
Tanja Schultz (University of Karlsruhe) tanja@ira.uka.de
Dilek Hakkani-Tur (ICSI) dilek@icsi.berkeley.edu
Back to Top

8 . Forthcoming events supported (but not organized) by ISCA

8-1 . SIGDIAL 2008 9th SIGdial Workshop on Discourse and Dialogue

SIGDIAL 2008 9th SIGdial Workshop on Discourse and Dialogue
COLUMBUS, OHIO; June 19-20 2008 (with ACL/HLT 2008)
   
http://www.sigdial.org/workshops/workshop9



Continuing with a series of successful workshops in Antwerp, Sydney,
Lisbon, Boston, Sapporo, Philadelphia, Aalborg, and Hong Kong, this
workshop spans the ACL and ISCA SIGdial interest area of discourse and
dialogue. This series provides a regular forum for the presentation of
research in this area to both the larger SIGdial community as well as
researchers outside this community. The workshop is organized by
SIGdial, which is sponsored jointly by ACL and ISCA. SIGdial 2008 will
be a workshop of ACL/HLT 2008.


TOPICS OF INTEREST

We welcome formal, corpus-based, implementation or analytical work on
discourse and dialogue including but not restricted to the following
three themes:

1. Discourse Processing and Dialogue Systems

Discourse semantic and pragmatic issues in NLP applications such as
text summarization, question answering, information retrieval
including topics like:

- Discourse structure, temporal structure, information structure
- Discourse markers, cues and particles and their use
- (Co-)Reference and anaphora resolution, metonymy and bridging
  resolution
- Subjectivity, opinions and semantic orientation

Spoken, multi-modal, and text/web based dialogue systems including
topics such as:

- Dialogue management models;
- Speech and gesture, text and graphics integration;
- Strategies for preventing, detecting or handling miscommunication
  (repair and correction types, clarification and under-specificity,
  grounding and feedback strategies);
- Utilizing prosodic information for understanding and for
  disambiguation;


2. Corpora, Tools and Methodology

Corpus-based work on discourse and spoken, text-based and multi-modal
dialogue including its support, in particular:

- Annotation tools and coding schemes;
- Data resources for discourse and dialogue studies;
- Corpus-based techniques and analysis (including machine learning);
- Evaluation of systems and components, including methodology, metrics
  and case studies;


3. Pragmatic and/or Semantic Modeling

The pragmatics and/or semantics of discourse and dialogue (i.e. beyond
a single sentence) including the following issues:

- The semantics/pragmatics of dialogue acts (including those which are
  less studied in the semantics/pragmatics framework);
- Models of discourse/dialogue structure and their relation to
  referential and relational structure;
- Prosody in discourse and dialogue;
- Models of presupposition and accommodation; operational models of
  conversational implicature.


SUBMISSIONS

The program committee welcomes the submission of long papers for full
plenary presentation as well as short papers and demonstrations. Short
papers and demo descriptions will be featured in short plenary
presentations, followed by posters and demonstrations.

- Long papers must be no longer than 8 pages, including title,
  examples, references, etc. In addition to this, two additional pages
  are allowed as an appendix which may include extended example
  discourses or dialogues, algorithms, graphical representations, etc.
- Short papers and demo descriptions should aim to be 4 pages or less
  (including title, examples, references, etc.).

Please use the official ACL style files:
    http://www.ling.ohio-state.edu/~djh/acl08/stylefiles.html

Submission/Reviewing will be managed by the EasyChair system. Link to
follow.

Papers that have been or will be submitted to other meetings or
publications must provide this information (see submission
format). SIGdial 2008 cannot accept for publication or presentation
work that will be (or has been) published elsewhere. Any questions
regarding submissions can be sent to the co-Chairs.

Authors are encouraged to make illustrative materials available, on
the web or otherwise. For example, excerpts of recorded conversations,
recordings of human-computer dialogues, interfaces to working systems,
etc.


IMPORTANT DATES

Submission        Mar 14 2008
Notification      Apr 27 2008
Camera-Ready   May 16 2008
Workshop          June 19-20 2008


WEBSITES

Workshop website: http://www.sigdial.org/workshops/workshop9
Submission link: To be announced
SIGdial organization website: http://www.sigdial.org/
CO-LOCATION ACL/HLT 2008 website: http://www.acl2008.org/


CONTACT

For any questions, please contact the co-Chairs at:
Beth Ann Hockey  bahockey@ucsc.edu
David Schlangen  das@ling.uni-potsdam.de

Back to Top

8-2 . LIPS 2008 Visual Speech Synthesis Challenge

LIPS 2008 is the first visual speech synthesis challenge. It will be held as a special session at INTERSPEECH 2008 in Brisbane, Australia (http://www.interspeech2008.org). The aim of this challenge is to stimulate discussion about subjective quality assessment of synthesised visual speech, with a view to developing standardised evaluation procedures.

In association with this challenge, a training corpus of audiovisual speech with accompanying phoneme labels and timings will be provided to all entrants, who should then train their systems using this data. (As this is the first year the challenge will run, and to promote wider participation, entrants are free to use a pre-trained model.)

Prior to the session, a set of test sentences (provided as audio, video and phonetic labels) must be synthesised on-site in a supervised room. A series of double-blind subjective tests will then be conducted to compare each competing system against all others. The overall winner will be announced and presented with their prize at the closing ceremony of the conference.

All entrants will submit a 4/6 (TBC) page paper describing their system to INTERSPEECH, indicating that the paper is addressed to the LIPS special session. A special edition of the EURASIP Journal on Audio, Speech, and Music Processing in conjunction with the challenge is also scheduled.

To receive updated information as it becomes available, you can join the mailing list by visiting https://mail.icp.inpg.fr/mailman/listinfo/lips_challenge. Further details will be mailed to you in due course.

Please invite colleagues to join, and circulate this email widely to your academic and industrial partners. Besides a large participation of research groups in audiovisual speech synthesis and talking faces, we particularly welcome participation from the computer game industry.

Please confirm your willingness to participate in the challenge, submit a paper describing your work and join us in Brisbane by sending an email to sascha.fagel@tu-berlin.de, b.theobald@uea.ac.uk, gerard.bailly@gipsa-lab.inpg.fr

Organising Committee

Sascha Fagel, University of Technology, Berlin - Germany

Barry-John Theobald, University of East Anglia, Norwich - UK

Gerard Bailly, GIPSA-Lab, Dpt. Speech & Cognition, Grenoble - France

 

Back to Top

8-3 . Human-Machine Comparisons of consonant recognition in quiet and noise

Consonant Challenge:
  Human-machine comparisons of consonant recognition in quiet and noise

                   Interspeech, 22-26 September 2008
                         Brisbane, Australia

* Update:
All information concerning the native listener experiments and baseline 
recognisers
including their results can now be found and downloaded from the Consonant 
Challenge website:
http://www.odettes.dds.nl/challenge_IS08/

* Deadline for submissions:
The deadline for full paper submission (4 pages) is April 7th, 2008; paper submission guidelines are available on the conference website. Paper submission is done exclusively via the Interspeech 2008 conference website. Participants of this Challenge are asked to indicate the correct Special Session during submission. More information on the Interspeech conference can be found here: http://www.interspeech2008.org/

* Topic of the Consonant Challenge:
Listeners outperform automatic speech recognition systems at every level 
of speech
recognition, including the very basic level of consonant recognition. What 
is not clear is
where the human advantage originates. Does the fault lie in the acoustic 
representations of
speech or in the recogniser architecture, or in a lack of compatibility 
between the two?
There have been relatively few studies comparing human and automatic 
speech recognition on
the same task, and, of these, overall identification performance is the 
dominant metric.
However, there are many insights which might be gained by carrying out a 
far more detailed
comparison.

The purpose of this Special Session is to make focused human-computer 
comparisons on a task
involving consonant identification in noise, with all participants using 
the same training
and test data. Training and test data and native listener and baseline 
recogniser results
will be provided by the organisers, but participants are encouraged to 
also contribute
listener responses.

* Call for papers:
Contributions are sought in (but not limited to) the following areas:

- Psychological models of human consonant recognition
- Comparisons of front-end ASR representations
- Comparisons of back-end recognisers
- Exemplar vs statistical recognition strategies
- Native/Non-native listener/model comparisons

* Organisers:
Odette Scharenborg (Radboud University Nijmegen, The Netherlands -- 
O.Scharenborg@let.ru.nl)
Martin Cooke (University of Sheffield, UK -- M.Cooke@dcs.shef.ac.uk)

Back to Top

9 . Future Speech Science and Technology Events

9-1 . Call for participation INFILE@CLEF2008 Evaluation

Call for participation

            INFILE@CLEF2008

     Information, Filtering, Evaluation
    
          http://www.infile.org


INFILE welcomes the participation of any institution in its first evaluation campaign. Participation is free of charge, and participants can keep and use the development and evaluation data for free after the evaluations, for research and development purposes.

INFILE (INformation, Filtrage, Evaluation) is a cross-language adaptive
filtering evaluation campaign jointly organized by CEA,
Université de Lille 3 and ELDA. It is organized as a pilot track in CLEF
2008 and is supported by NIST TREC.

INFILE extends the last filtering track of TREC 2002 in the following ways:
-INFILE is crosslingual (English, French and Arabic); a corpus of 
100,000 comparable news-wire stories from Agence France Presse (AFP)
 for each language is used for the evaluations.

-Evaluation will be performed through automatic interrogation of the test systems with simulated user feedback. Each system will be able to use the feedback at any time to increase performance.

The participant systems will have to provide a Boolean decision for each
document according to each filtering  profile. A curve of the evolution
of efficiency will be computed.
Although cross-lingual systems are encouraged, the campaign is also open
to monolingual systems.


Tasks and languages
-------------------
Two tasks and three languages are considered. The first task is
Information Filtering on general news and events. For this task,
participants will have to classify each transmitted news-wire into
zero, one or more different profiles. 30 general profiles will be made
available in 3 languages (Arabic, English and French).

The second task is Information Filtering on science and technology
domain. Participants will have to associate to each news-wire zero, one
or more science and technology profiles.
A total of 20 profiles will be available in 3 languages (Arabic, English
and French).

For each task, participants are free to register to monolingual
filtering (e.g. information filtering using profiles and news-wires in
the same language) or to crosslingual filtering (e.g. information
filtering according to profiles in one language and news-wires in
another language).

Corpus
------
The corpus consists of 300,000 news-wires in Arabic, English and French
from the news agency Agence France Presse covering the 2004-2006 period.
The news-wires are related to general news and events information and
are comparable between Arabic, English and French.


Protocol description
---------------------
General information about the domain of profiles is given to each
participant. 15 days afterwards, 50 profiles are given to participants
(30 general
profiles and 20 profiles related to science and technology). Profiles
are composed of a list of keywords (simple and complex noun phrases) and
up to 3
documents illustrating each profile.
 
Then, news-wires are transmitted by the organizer to an automated 
interface of each participating system. The interface returns a
 Boolean response for each profile. After reception of this response,
and if requested by the participant, the organizer sends a feedback
 consisting of expected profile assignments for each document 
submitted. Participants may adapt their system at any time using this
feedback.
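
Schematically, a participating system is therefore a decision loop with optional online adaptation. The sketch below is purely illustrative (we assume a callback-style interface; the real exchange format is defined by the organizers), with an invented word-overlap score and threshold update:

  # Illustrative INFILE-style adaptive filtering loop: one Boolean
  # decision per (document, profile), with threshold adaptation from
  # the organizer's simulated feedback when it is available.
  def score(doc, profile):
      # invented relevance score: word overlap with the profile keywords
      p = profile.split()
      return len(set(doc.split()) & set(p)) / len(p)

  def run_filter(stream, profiles, request_feedback, lr=0.05):
      thresholds = {p: 0.5 for p in profiles}
      decisions_log = []
      for doc in stream:
          decisions = {p: score(doc, p) >= thresholds[p] for p in profiles}
          decisions_log.append(decisions)    # Boolean answers to submit
          truth = request_feedback(doc)      # set of true profiles, or None
          if truth is not None:              # adapt at any time
              for p in profiles:
                  if decisions[p] and p not in truth:
                      thresholds[p] += lr    # was too permissive
                  elif not decisions[p] and p in truth:
                      thresholds[p] -= lr    # was too strict
      return decisions_log

  profiles = ["oil prices", "speech technology"]
  docs = ["opec raises oil output prices", "new speech recognizer released"]
  print(run_filter(docs, profiles, request_feedback=lambda d: None))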


Important Dates
---------------
    Registration Opens - Feb 11th, 2008
    Dry Run - June 2nd to June 14th, 2008
    Evaluation Run June 30th - July 19th
    Release of Human Assessments and Individual Results - August 4th, 2008
    Submission of Paper for Working Notes - 15 August 2008
    Workshop - 17-19 September 2008 CLEF Workshop

Contact
-------
    info@infile.org
    http://www.infile.org

Back to Top

9-2 . AERFAISS'08 Bilbao

AERFAISS'08 SECOND ANNOUNCEMENT
 
Second AERFAI Summer School on
"NEW TRENDS IN PATTERN RECOGNITION FOR LANGUAGE TECHNOLOGIES"
Bilbao (Spain), June 23-28, 2008
url: http://www.ehu.es/aerfaiss08
e-mail: aerfaiss08@ehu.es
 
We are pleased to announce the second AERFAI Summer School on "New
Trends in Pattern Recognition for Language Technologies". This summer
school is the successor to the well-received school held in 2006 and
organized by AERFAI, the Spanish section of the International
Association for Pattern Recognition (IAPR). AERFAISS'08 will tackle
cutting-edge technologies related to pattern recognition, with
language technologies as the main focus. To ensure a high ratio
between tutors and students, the school will be limited to 50
participants. We find this a good opportunity for young researchers
to exchange their ideas with senior lecturers, getting to know the
state-of-the-art toolkits on language technologies and the trends for
future developments in this field.
 
FINAL PROGRAM
 
AERFAISS'08 consists of tutorials, laboratory practice sessions and open discussion forums, totalling 36 hours distributed over one week.
 
TUTORIALS
 
- "Statistical models and algorithms for speech and language
  technologies" by Dr. Hermann Ney (RWTH Aachen, Germany)
 
- "Multi-modal interaction involving speech and language technologies"
  by Dr. Alex Waibel (Carnegie Mellon University, USA)
 
- "Applying unsupervised learning in creating language models for
  information retrieval and machine translation" by Dr. Timo Honkela
  (Helsinki University of Technology, Finland)
 
- "Speech production models for automatic speech recognition" by
  Dr. Richard Rose (McGill University, Canada)
 
- "Phrase-based statistical translation models" by Dr. Philipp Koehn
  (University of Edinburgh, United Kingdom)
 
- "Computer assisted transcription of speech and text images" by
  Dr. Enrique Vidal (Instituto Tecnológico de Informática, Technical
  University of Valencia, Spain)
 
- "Interactive machine translation" by Dr. Francisco Casacuberta
  (Instituto Tecnológico de Informática, Technical University of
  Valencia, Spain)
 
PRACTICES
 
Given by local teachers and PhD students working on PR & NLP, following the general directives of the lecturers:
 
 - "Using statistical language processing toolkits for computer assisted
   transcription" by Dr. Alejandro Héctor Toselli (Instituto Tecnológico
   de Informática, Technical University of Valencia, Spain)
 
_ "Applying Morphessor for unsupervised morphology induction" by Víctor
  Guijarrubia (university of the Basque Country, Spain)
 
- "Training translation models and decoding with Moses" by Germán Sanchís
  (Instituto Tecnológico de Informática, Technical University of
  Valencia, Spain)
 
VENUE
 
The AERFAI Summer School 2008 will take place in:
 
  Department of Electricity and Electronics
  Faculty of Science and Technology
  University of the Basque Country
  48940 Leioa
  Campus of Biscay
 
Biscay is an area located in the north of Spain, on the Atlantic coast. Bilbao, the capital of Biscay, is a well-connected city, easily reached by plane, bus or train. Apart from the Guggenheim museum, the Euskalduna conference centre (awarded world's best by the IACC), the underground designed by Norman Foster, Bizkaia Bridge (a UNESCO World Heritage site) and the summer jazz festivals, one can enjoy the well-known Basque gastronomy, so deeply rooted in Bilbao. To find out more about Bilbao, see the town hall's information website: http://www.bilbao.net
 
REGISTRATION
 
Fill in the application form from the web page and fax it, along with the bank receipt of payment, to the attention of the Euskoiker Foundation. Fax: +34 (9)44153905 (attach both the registration form and the proof of payment).
 
Registration Fee:
 
Includes admission to all sessions, support material for both lectures
and practices, coffee breaks and wi-fi access to the internet.
 
- Early registration: from 1st January 2008 to 31st May 2008, for
  either full time students or AERFAI members: 325€
- Late registration: from 1st June 2008 to 15th June 2008, for either
  full time students or AERFAI members: 350€
- Others: 375€
 
CONTACT
 
For any questions, please contact the organizing committee; e-mail:
aerfaiss08@ehu.es
 
--
AERFAI governing board and Summer School Organization
 
Dr. M. Inés Torres
University of the Basque Country
Spain
 
Dr. José Miguel Benedí
ITI, Technical University of Valencia
Spain
Back to Top

9-3 . Speech production workshop: Paris

CALL FOR PARTICIPATION

The Phonetics & Phonology Laboratories, UMR-7018 CNRS-Univ. Paris III, will hold a one-day meeting, the Speech Production Workshop: Instrumentation-based Approach, on the day just after ACOUSTICS ’08, at ILPGA, Paris 5e.

 

This workshop is free of charge. If you would like to join us, please send us the following information by replying to this email, so that we will know the approximate number of participants.

 

 ----%---

 I will attend the workshop.

 Name:

 Affiliation:

 Email:

 ---%---

 

SPEECH PRODUCTION WORKSHOP: INSTRUMENTATION-BASED APPROACH

 

Date: 5th JULY, 2008 (Saturday)

Location: ILPGA 19 rue des Bernardins 75005 Paris

http://lpp.univ-paris3.fr/workshop.html

 

This workshop aims to present recent progress in instrumentation-based approaches to speech production research, in order to advance our knowledge of key technologies and recent outcomes. Eleven invited speakers will present their recent developments in four sessions.

 

[Voice Production]

J. Ohala      (to be announced)

  D. Demolin    Subglottal air pressure measurement

  H. Imagawa   High-speed digital imaging of vocal fold vibration

[Articulation 1]

  A. Marchal    Electropalatography, new development

 M-O. Berger   Multimodality acquisition of articulatory data

 P. Hoole       3D electromagnetic articulography, recent progress

[Articulation 2]

  R. Sock        Digital X-ray cinematography and database

  M. Stone       Magnetic resonance imaging (MRI), new techniques

  S. Masaki      Magnetic resonance imaging (MRI), motion imaging

[Demonstration]

  K. Honda      Noninvasive photoglottographic technique

  H. Takemoto   MRI-based vocal-tract solid models

 

Organizers

   Jacqueline Vaissière (LPP, UMR-7018 CNRS-Univ. Paris III, ATR-CIS)

   Kiyoshi Honda (LPP, UMR-7018 CNRS-Univ. Paris III)

   Shinji Maeda (LPP, UMR-7018 CNRS-Univ. Paris III, & ENST)


Back to Top

9-4 . Seminars of the Speech and Cognition Department of GIPSA (formerly ICP Grenoble) (translated from French)

Monday 7 July, 1:30 pm - External seminar
==============================================
Jorge Lucero
Department of Mathematics
University of Brasilia
Brazil

Title and abstract to be announced

Meeting room of the Speech and Cognition Department (B314)
3rd floor, Building B, ENSIEG
961 rue de la Houille Blanche
Domaine Universitaire

 

Back to Top

9-5 . 6th Intl Conference on Content-based Multimedia Indexing CBMI '08

Sixth International Conference on Content-Based Multimedia Indexing (CBMI'08)  
                               http://cbmi08.qmul.net/                        
                              18-20th June, 2008, London, UK    
CBMI is the main international forum for the presentation and discussion of the latest 
technological advances, industrial needs and product  developments in multimedia indexing, 
search, retrieval, navigation and  browsing. Following the five successful previous events 
(Toulouse 1999,  Brescia 2001, Rennes 2003, Riga 2005, and Bordeaux 2007), CBMI'08 
will be  hosted by Queen Mary, University of London in the vibrant city of London.  
The focus of CBMI'08 is the integration of what could be regarded as  unrelated disciplines 
including image processing, information retrieval,  human computer interaction and 
semantic web technology with industrial  trends and commercial product development. 
The technical program of  CBMI'08 will include presentation of invited plenary talks, 
special  sessions as well as regular sessions with contributed research papers.   
Topics of interest include, but are not limited to: 
* Content-based browsing, indexing and retrieval of images, video and audio   
* Matching and similarity search   
* Multi-modal and cross-modal indexing   
* Content-based search   
* Multimedia data mining   
* Summarisation and browsing of multimedia content    
* Semantic web technology   
* Semantic inference   
* Semantic mapping and ontologies   
* Identification and tracking of semantic regions in scenes    
* Presentation and visualization tools   
* Graphical user interfaces for navigation and browsing   
* Personalization and content adaptation   
* User modelling, interaction and relevance feed-back    
* Metadata generation and coding   
* Large scale multimedia database management    
* Applications of multimedia information retrieval   
* Analysis and social content applications   
* Evaluation metrics 
Submission  
Prospective authors are invited to submit papers using the on-line system  at the 
conference website http://cbmi08.qmul.net/. Accepted papers will  be published in the 
Conference Proceedings. Extended and improved  versions of CBMI papers will be 
reviewed and considered for publication  in Special Issues of IET Image Processing 
(formerly IEE Proceedings  Vision, Image and Signal Processing) and EURASIP journal 
on Image and  Video Processing.  
 Important Dates: 
Submission of full papers (to be received by): 5th February, 2008 
Notification of acceptance:                    20th March, 2008 
Submission of camera-ready papers:             10th April, 2008 
Conference:                                    18-20th June, 2008   
Organisation Committee:
General Chairs:       Ebroul Izquierdo, Queen Mary, University of London, UK
Technical Co-Chairs:  Jenny Benois-Pineau, University of Bordeaux, France
                      Arjen P. de Vries, Centrum voor Wiskunde en Informatica, NL
                      Alberto Del Bimbo, Università degli Studi di Firenze, Italy
                      Bernard Merialdo, Institut Eurecom, France
EU Commission:        Roberto Cencioni (Head of Unit INFSO E2), European Commission
                      Luis Rodriguez Rosello (Head of Unit INFSO D2), European Commission
Special Session Chairs:
                      Stefanos Kollias, National Technical University of Athens, Greece
                      Gael Richard, GET-Telecom Paris, France
Contacts:
Ebroul Izquierdo          ebroul.izquierdo@elec.qmul.ac.uk
Giuseppe Passino          giuseppe.passino@elec.qmul.ac.uk
Qianni Zhang              qianni.zhang@elec.qmul.ac.uk


Back to Top

9-6 . IIS2008 Workshop on Spoken Language Understanding and Dialog Systems

Zakopane, Poland    18 June 2008                  
 http://nlp.ipipan.waw.pl/IIS2008/luna.html                 
The workshop is organized by members of the IST LUNA project (http://www.ist-luna.eu/) and aims to provide an opportunity to share ideas on problems related to communication with computer systems in natural language and on dialogue systems.
SCOPE
The main area of interest of the workshop is human-computer interaction in natural language and includes, among others:
 - spontaneous speech recognition,
 - preparation of speech corpora,
 - transcription problems in spoken corpora,
 - parsing problems in spoken texts,
 - semantic interpretation of text,
 - knowledge representation in relation to dialogue systems,
 - dialogue models,
 - spoken language understanding.
SUBMISSIONS  
The organizers invite long (10 pages) and short (5 pages) papers. The papers will be refereed on the basis of long abstracts (4 pages) by an international committee. The final papers are to be prepared using LaTeX. The conference proceedings will be distributed at the conference in paper and electronic form, and will be available on-line after the conference.
IMPORTANT DATES
Submission deadline (abstracts):        31 January 2008
Notification of acceptance:             29 February 2008
Full papers, camera-ready version due:  31 March 2008
Workshop:                               18 June 2008
ORGANISERS         
Malgorzata Marciniak mm@ipipan.waw.pl         
Agnieszka Mykowiecka  agn@ipipan.waw.pl
Krzysztof Marasek kmarasek@pjwstk.edu.pl 


Back to Top

9-7 . HLT Workshop on Mobile Language Technology (ACL-08)

ACL-08: HLT Workshop on

Mobile Language Processing

http://www.mobileNLPworskshop.org

 

Columbus, Ohio, United States

June 19th or June 20th, 2008

** Paper submission deadline: March 7, 2008 **

************************************************************************

 


 

Mobile devices, such as ultra-mobile PCs, personal digital assistants, and smart phones, have many unique characteristics that make them both highly desirable and difficult to use. On the positive side, they are small, convenient, personalizable, and provide anytime-anywhere communication capability. Conversely, they have limited input and output capabilities, limited bandwidth, limited memory, and restricted processing power.

 

The purpose of this workshop is to provide a forum for discussing the challenges in natural and spoken language processing and systems research that are unique to this domain. We argue that mobile devices not only provide an ideal opportunity for language processing applications but also offer new challenges for NLP and spoken language understanding research.

For instance, mobile devices are beginning to integrate sensors (most commonly for location detection through GPS, the Global Positioning System) that can be exploited by context/location-aware NLP systems; another interesting research direction is the use of information from multiple devices for "distributed" language modeling and inference. To give some concrete examples, knowledge of the web queries made from nearby devices, from a specific location or in a specific 'context' can be combined for various applications and could potentially improve information retrieval results. Learned language models could be transferred from device to device, propagating and updating the language models continuously and in a decentralized manner.
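As a toy illustration of the decentralized scenario sketched above (this is our own sketch, not a system from the workshop: the Device class, the merge_from down-weighting rule and the add-alpha smoothing are all illustrative assumptions), each device could keep simple n-gram counts and fold a peer's counts into its own, so the shared model keeps adapting without any central server:

from collections import Counter
from itertools import tee

def bigrams(tokens):
    # Yield consecutive (w1, w2) pairs from a token sequence.
    a, b = tee(tokens)
    next(b, None)
    return zip(a, b)

class Device:
    # Hypothetical mobile device holding a tiny count-based language model.
    def __init__(self):
        self.unigrams = Counter()
        self.bigrams = Counter()

    def observe(self, sentence):
        # Update local counts from text typed or spoken on this device.
        tokens = sentence.lower().split()
        self.unigrams.update(tokens)
        self.bigrams.update(bigrams(tokens))

    def merge_from(self, peer, weight=0.5):
        # Fold a peer's counts into ours, down-weighted so local usage
        # still dominates -- one simple decentralized update rule.
        for gram, count in peer.unigrams.items():
            self.unigrams[gram] += weight * count
        for gram, count in peer.bigrams.items():
            self.bigrams[gram] += weight * count

    def probability(self, w1, w2, alpha=1.0):
        # Add-alpha smoothed bigram probability P(w2 | w1).
        vocab = max(len(self.unigrams), 1)
        return (self.bigrams[(w1, w2)] + alpha) / (self.unigrams[w1] + alpha * vocab)

phone_a, phone_b = Device(), Device()
phone_a.observe("navigate to the station")
phone_b.observe("navigate to the museum")
phone_a.merge_from(phone_b)            # peer-to-peer model propagation
print(phone_a.probability("to", "the"))

In a real deployment the same pattern would apply to smoothed back-off or class-based models; the point is only that the update is a local merge of counts rather than a round trip to a server.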

 

Processing and memory limitations incurred by executing NLP and speech recognition on small devices need to be addressed. Some applications and practical considerations may require a client/server or distributed architecture: what are the implications for language processing systems in using such architectures?

 

The limited input and output channels force typing on increasingly small keyboards, which is quite difficult; reading on small displays is similarly challenging. Speech interfaces for dictation or for understanding navigation commands, and/or language models for typing suggestions, would enhance the input channel, while NLP systems for text classification, summarization and information extraction would be helpful for the output channel. Speech interfaces, language generation and dialog systems would provide a natural way to interact with mobile devices.

Furthermore, the growing market of cell phones in developing regions can be used for delivering applications in the areas of health, education and economic growth to rural communities. Some of the challenges in this area are limited literacy, the many languages and dialects spoken, and the limited networking infrastructure.

We solicit papers on topics including, but not limited to the following:

·       Special challenges of NLP for mobile devices

·       Applications of NLP for mobile devices

·       NLP enhanced by sensor data

·       Distributed NLP

·       Speech and multimodal interfaces

·       Machine translation

·       Language model sharing

·       Applications for the developing regions

 

The goal of this one day workshop is to provide a forum to allow both industrial and academic researchers to share their experiences and visions, to present results, compare systems, exchange ideas and formulate common goals.

"Keynote Speaker: Dr. Lisa Stifelman, Principal User Experience Manager at Tellme/Microsoft. The title of her talk is soon to be announced."

- Paper submission deadline: March 7, 2008
- Notification of acceptance: April 8, 2008
- Camera-ready copy: April 18, 2008


Organizing committee:

Rosario, Barbara        Intel Research    

Paek, Tim               Microsoft Research    

 

Contact

For questions about the workshop, please contact Barbara Rosario (barbara.rosario@intel.com).

 We accept position papers (2 pages), short research or demo papers (4 pages), and regular papers (8 content pages with 1 extra page for references). Papers must be submitted through the submission system at
https://www.softconf.com/acl08/ACL08-WS07/submit.html

Please use the LaTeX or Microsoft Word style files available at http://ling.osu.edu/acl08/stylefiles.html.

 


Back to Top

9-8 . 4TH TUTORIAL AND RESEARCH WORKSHOP PIT08

Following previous successful workshops between 1999 and 2006, the

4TH TUTORIAL AND RESEARCH WORKSHOP
PERCEPTION AND INTERACTIVE TECHNOLOGIES FOR SPEECH-BASED SYSTEMS
(PIT08)

will be held at the Kloster Irsee in southern Germany from June 16 to
June 18, 2008.

Please follow this link to visit our workshop website
http://it.e-technik.uni-ulm.de/World/Research.DS/irsee-workshops/pit08/introduction.html

Submissions will be short/demo or full papers of 4-10 pages.

Important dates:
**February 22, 2008: Deadline for Long, Short and Demo Papers**
March 15, 2008: Author notification
April 1, 2008: Deadline for final submission of accepted paper
April 18, 2008: Deadline for advance registration
June 7, 2008: Final programme available on the web

The workshop will be technically co-sponsored by the IEEE Signal
Processing Society. It is envisioned to publish the proceedings in the
LNCS/LNAI Series by Springer.

We welcome you to the workshop.

Elisabeth André, Laila Dybkjaer, Wolfgang Minker, Heiko Neumann,
Michael Weber, Roberto Pieraccini

PIT'08 Organising Committee


Wolfgang Minker
University of Ulm
Department of Information Technology
Albert-Einstein-Allee 43
D-89081 Ulm
Phone: +49 731 502 6254/-6251
Fax:   +49 691 330 3925516
http://it.e-technik.uni-ulm.de/World/Research.DS/

Back to Top

9-9 . YRR-2008 Young Researchers' Roundtable

Young Researchers' Roundtable on Spoken Dialog Systems
June 21st, 2008, Columbus, Ohio,

Second Call for Participation

The Young Researchers' Roundtable on Spoken Dialog Systems is an annual
workshop designed for students, post docs, and junior researchers
working in research related to spoken dialogue systems in both academia
and industry. The roundtable provides an open forum where participants
can discuss their research interests, current work and future plans. The
workshop is meant to provide an interdisciplinary forum for creative
thinking about current issues in spoken dialogue systems research, and
help create a stronger international network of young researchers
working in the field. The workshop is co-located with ACL 2008  and
will occur the day after the 9th SIGdial workshop.

Workshop Format

Workshop events will include small informal discussion groups, a larger
Question & Answers style discussion with senior researchers from
academia and industry, and an opt-in demo presentation session. There
will also be time for participants to have informal discussions over
coffee with senior researchers on potential career opportunities. The
small discussion groups are intended to allow participants to exchange
ideas on key research topics, and identify issues that are likely to be
important in the coming years. The results of each discussion group will
then be presented and discussed in plenary sessions. The topics for
discussion are still open and will be determined by participant
submissions and finalized online before the workshop. Potential
participants should submit a short paper, as described below in the
submission process, to be accepted to the workshop.

In addition to the traditional one day event, a half day extension on
the topic of  "Frameworks and Grand Challenges for Dialog System
Evaluation" is under consideration for the morning of June 22nd, 2008.
The aim of this extra extension is to provide an opportunity for dialog
systems researchers to discuss issues of evaluation, and hopefully
determine an agenda for a future evaluation event or framework.
Organization of this extended event will depend on interest; we
therefore, as described below, invite potential participants to indicate
their interest with their YRR08 submission.

Submission Process

We invite participation from students, post docs, and junior researchers
in academia or industry who are currently working in spoken dialog
systems research. We also invite participation from those who are
working in related fields such as linguistics, psychology, or speech
processing, as applied to spoken dialogue systems.  Please note that by
'young researchers' the workshop's organizers mean to target students
and researchers in the field who are at a relatively early stage of
their careers, and in no way mean to imply that participants must meet
certain age restrictions.

Potential participants should submit a 2-page position paper, suggest
topics for discussion, and indicate whether they would be interested in
attending the extended session on Sunday morning. A template and specific
submission instructions will be available on http://www.yrrsds.org/ on
March 1.  Submissions will be accepted on a rolling basis from that day
until the maximum number of participants for the workshop (50) is
reached, or until the submission deadline (May 10th, 2008) is reached.
Proceedings from previous years' workshops are also available on our web
site. Specific questions can be directed to the organizing committee at
yrr08-organizers__AT_googlegroups_DOT_com

Important Dates

Submissions open:       March 1st, 2008
Submissions deadline:   May 10th, 2008
Notification:           May 20th, 2008
Registration begins:    to be announced
Workshop:               June 21st, 2008

Organizing Committee

Hua Ai, Intelligent Systems Program, University of Pittsburgh, USA
Carlos Gómez Gallo, Department of Computer Science, University of
Rochester, USA
Robert J. Ross, Department of Computer Science, University of Bremen,
Germany
Sabrina Wilske, Department of Computational Linguistics, Saarland
University, Germany
Andi Winterboer, Institute for Communicating and Collaborative Systems,
University of Edinburgh, UK
Craig Wootton, University of Ulster, Belfast, Northern Ireland

Local Organization

Tim Weale, Department of Computer Science and Engineering, The Ohio
State University, USA

Back to Top

9-10 . SIGIR 2008 workshop: Searching Spontaneous Conversational Speech

24 July 2008, Singapore
http://ilps.science.uva.nl/SSCS2008
Large scale use and commercial application of retrieval technology for spoken
content requires a sustained effort focused on merging speech recognition,
audio analysis and information retrieval into a concerted discipline with a
common vision. After the success of the 2007 Search in Spontaneous
Conversational Speech workshop (SSCS 2007; cf.
http://hmi.ewi.utwente.nl/sscs), a second workshop, SSCS 2008, will be held in
conjunction with ACM SIGIR 2008 in order to further the cross-pollination
between the speech research community and the information retrieval
community. The workshop addresses application domains including:
conversational broadcast, podcasts, meetings, lectures, discussions, debates,
interviews and cultural heritage archives.
 
We welcome contributions on a range of cross-cutting issues, including:
 
-Representation of spoken content for optimal search (e.g., LVCSR, word
 lattice search, STD on phone lattice) 
-Exploitation of evidence beyond word-level (e.g., emotional state, speaker
 characteristics, topic shifts, audio events)
 
-Application of text IR techniques to the speech domain
-Speech mining in multimedia data
-Multimodality (integrating features from associated non-speech content)
-Search effectiveness (e.g., evidence combination, expansion)
-Access to large scale collections
-Evaluation resources and benchmarking activities
-Multi-/cross-lingual retrieval
-Cross-media mining (e.g., coupling images or text fragments to speech)
-Interaction design and system development (e.g., query formulation, result
 presentation, search strategies)
-Spoken audio visualization (e.g. results lists, individual results)
-Spoken query search
 
Contributions for oral presentations (8 pages), poster presentations (2 pages),
demonstration descriptions (2 pages) and position papers for the selection of
panel members (2 pages) will be accepted. Further information including
submission guidelines can be found on the workshop website
http://ilps.science.uva.nl/SSCS2008
 
 
Important Dates
16 May: Paper submission deadline
6 Jun: Notification of acceptance
20 Jun: Camera-ready papers due
24 July: SSCS 2008 Workshop at SIGIR 2008
 
Organizers
Joachim Koehler, Fraunhofer IAIS, Germany
Martha Larson, University of Amsterdam, The Netherlands
Franciska de Jong, University of Twente, The Netherlands
Roeland Ordelman, University of Twente, The Netherlands
Wessel Kraaij, TNO, The Netherlands
Back to Top

9-11 . Summer school New Trends in Pattern Recognition for Language Bilbao Spain

Second announcement of the AERFAI (Spanish IAPR chapter) Summer School
"New Trends in Pattern Recognition for Language Technologies",
to be held in Bilbao on June 23rd-28th.

url: http://www.ehu.es/aerfaiss08

Back to Top

9-12 . 2nd Workshop on Analytics for Noisy Unstructured Text Data

SIGIR-08 Workshop

2nd Workshop on Analytics for Noisy Unstructured Text Data
24 July 2008 , Singapore

http://and2008workshop.googlepages.com/

Call for Papers


Workshop Description and Objectives
Noise is an unavoidable fact of life. It can manifest itself at the
earliest stages of processing in the form of degraded inputs that
our systems must be prepared to handle. People are adept at pattern
recognition tasks involving typeset or handwritten documents or
recorded speech; machines are less so. From the perspective of
down-stream processes that take as their inputs the outputs of
recognition systems, including document analysis and OCR, noise can
be viewed as the errors made by earlier stages of processing, which
are rarely perfect and sometimes quite brittle.
Noisy unstructured text data is also found in informal settings such
as online chat, SMS, email, message board and newsgroup postings,
blogs, wikis and web pages. In addition to the aforementioned
recognition errors, such text may contain spelling errors,
abbreviations, non-standard terminology, missing punctuation,
misleading case information, as well as false starts, repetitions,
and pause-filling sounds such as “um” and “uh” in the case of speech.
By its very nature, noisy text warrants moving beyond traditional
text analytics techniques. Noise introduces challenges that need
special handling, either through new methods or improved versions of
existing ones. We invite you to submit your own unique perspectives
on this important topic.
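To make the point concrete, here is a toy normalization pass of the kind such noisy SMS/chat-style input may need before standard IR or NLP tools are applied (our own illustration, not a system discussed at the workshop; the abbreviation table and rules are invented examples):

import re

# Hypothetical lookup table for common SMS abbreviations.
ABBREVIATIONS = {"u": "you", "r": "are", "pls": "please", "2moro": "tomorrow"}

def normalize(noisy):
    text = noisy.lower()
    text = re.sub(r"(.)\1{2,}", r"\1", text)     # "sooooo" -> "so"
    text = re.sub(r"[^a-z0-9\s']", " ", text)    # drop stray punctuation/emoticons
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)

print(normalize("Pls call me 2moro!!!  u r late :-)"))
# -> "please call me tomorrow you are late"

Real systems would of course learn such mappings from data and handle genuine ambiguity; the sketch only shows why a dedicated normalization step sits in front of conventional text analytics.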

Topics of Interest (not limited to)
Information Retrieval and Information Extraction on noisy texts
IR-related tasks (classification, clustering, genre recognition,
document summarization, keyword search) on noisy texts
Formal models for noise, characterization and classification of noise
Treatment of noisy data in special application fields
        - Historical Texts
        - Multilingual Texts
        - Blogs
        - Chat logs/SMS
        - Social Network Analysis
        - Patent Search
        - Optical Character Recognition
        - Automated Speech Recognition
        - Machine Translation
Data sets, benchmarks and evaluation techniques for analysis of noisy
texts

Participation
We hope that the workshop will allow researchers working in areas
related to unstructured data analytics, Natural Language Processing,
Information Extraction, Information Retrieval, etc., to focus on the
needs of users extracting useful information from noisy text. The
target audience is a mixture of academia and industry researchers
working with noisy text. We believe this work is of direct relevance
to domains such as call centers, the world-wide web, and government
organizations that need to analyze huge amounts of noisy data.

Important Dates
Paper Submission: May 16th, 2008
Notification of Acceptance: Jun 6th, 2008
Camera-Ready papers due: Jun 20th, 2008
Workshop at SIGIR 2008: Jul 24th, 2008

Submission Requirements
We invite papers up to 8 pages in length in the style specified at
http://and2008workshop.googlepages.com/submission There will also be a
Best Student Paper Award. Papers with a student as the primary
author/presenter will be eligible for this award.

Publication
We are currently in negotiation with a leading publisher for the
proceedings to be available onsite. We have also received tentative
approval for a special issue of a journal for post-workshop
publication of selected papers.

Workshop Chairs
  Daniel Lopresti
Lehigh University
  Shourya Roy
IBM Research, India Research Lab
   Klaus U Schulz
University of Munich
  L. Venkata Subramaniam
IBM Research, India Research Lab

Workshop contacts
* L. V. Subramaniam lvsubram@in.ibm.com
* Shourya Roy rshourya@in.ibm.com


Please visit the workshop website
***** http://and2008workshop.googlepages.com/  *****
for information about participation and submitting papers.

For general information, please visit the SIGIR website
***** http://www.sigir2008.org/ *****

Back to Top

9-13 . eNTERFACE 2008 Orsay Paris

eNTERFACE'08, the next international summer workshop on multimodal
interfaces, will take place at LIMSI, in Orsay (near Paris), France,
during four weeks, August 4th-29th, 2008.

http://entreface08.limsi.fr

Please consider proposing projects and participating in the workshop
(see the Call for Projects proposal on the web site).

eNTERFACE'08 is the next in a series of successful workshops initiated
by SIMILAR, the European Network of Excellence (NoE) on Multimodal
Interfaces. eNTERFACE'08 will follow the fruitful path opened by
eNTERFACE'05 in Mons, Belgium, continued by eNTERFACE'06 in Dubrovnik,
Croatia, and eNTERFACE'07 in Istanbul, Turkey. SIMILAR came to an end
in 2007, and the eNTERFACE workshops (http://www.enterface.org) are
now under the aegis of the OpenInterface Foundation
(http://www.openinterface.org).

eNTERFACE'08 Important Dates

. December 17th, 2007: Reception of the complete project proposal in
the format provided by the Author's kit
. January 10th, 2008: Notification of project acceptance
. February 1st, 2008: Publication of the Call for Participation
. August 4th - August 29th, 2008: eNTERFACE'08 Workshop

Christophe d'Alessandro
CNRS-LIMSI, BP 133 - F91403 Orsay France
tel +33 (0) 1 69 85 81 13 / Fax -- 80 88

 

Back to Top

9-14 . 2nd IEEE Intl Conference on Semantic Computing

IEEE ICSC2008

Second IEEE International Conference on Semantic Computing


August 4th-7th,  2008
Santa Clara, CA, USA   http://icsc.eecs.uci.edu/

The field of Semantic Computing (SC) brings together those disciplines concerned with connecting the (often vaguely formulated) intentions of humans with computational content. This connection can go both ways: retrieving, using and manipulating existing content according to the user's goals ("do what the user means"); and creating, rearranging, and managing content that matches the author's intentions ("do what the author means").

 

The content addressed in SC includes, but is not limited to, structured and semi-structured data, multimedia data, text, programs, services and, even, network behaviour. This connection between content and the user is made via (1) Semantic Analysis, which analyzes content with the goal of converting it to meaning (semantics); (2) Semantic Integration, which integrates content and semantics from multiple sources; (3) Semantic Applications, which utilize content and semantics to solve problems; and (4) Semantic Interfaces, which attempt to interpret users' intentions expressed in natural language or other communicative forms.

 

Example areas of SC include (but, again, are not limited to) the following:

ANALYSIS AND UNDERSTANDING OF CONTENT

  • Natural-language processing
  • Image and video analysis
  • Audio and speech analysis
  • Analysis of structured and semi-structured data
  • Analysis of behavior of software, services, and networks

INTEGRATION OF MULTIPLE SEMANTIC REPRESENTATIONS

  • Database schema integration
  • Ontology integration
  • Interoperability and Service Integration

SEMANTIC INTERFACES

  • Natural-Language Interface
  • Multimodal Interfaces

APPLICATIONS

  • Semantic Web and other search technologies
  • Question answering
  • Semantic Web services
  • Multimedia databases
  • Engineering of software, services, and networks based on natural-language specifications
  • Context-aware networks of sensors, devices, and/or applications

The Second IEEE International Conference on Semantic Computing (ICSC2008) builds on the success of ICSC2007 as an international interdisciplinary forum for researchers and practitioners to present research that advances the state of the art and practice of Semantic Computing, as well as to identify emerging research topics and define the future of Semantic Computing. The conference particularly welcomes interdisciplinary research that facilitates the ultimate success of Semantic Computing.

 

The event is located in Santa Clara, California, the heart of Silicon Valley. The technical program of ICSC2008 includes tutorials, workshops, invited talks, paper presentations, panel discussions, demo sessions, and an industry track. Submissions of high-quality papers  describing  mature results or on-going work are invited.

 

In addition to Technical Papers, the conference will feature

 * Tutorials  * Workshops * Demo Sessions * Special Sessions * Panels   * Industry Track

 

SUBMISSIONS

Authors are invited to submit an 8-page technical paper manuscript in double-column IEEE format following the guidelines available on the ICSC2008 web page under "submissions".

 

The Conference Proceedings will be published by the IEEE Computer Society Press. Distinguished quality papers presented at the conference will be selected for publication in internationally renowned journals.

Back to Top

9-15 . EUSIPCO-2008 - 16th European Signal Processing Conference - Lausanne Switzerland

EUSIPCO-2008 - 16th European Signal Processing Conference - August 25-29, 2008, Lausanne, Switzerland

- http://www.eusipco2008.org/

DEADLINE FOR SUBMISSION: February 8, 2008

 

The 2008 European Signal Processing Conference (EUSIPCO-2008) is the sixteenth in a series of conferences promoted by EURASIP, the European Association for Signal Processing (www.eurasip.org). This edition will take place in Lausanne, Switzerland, organized by the Swiss Federal Institute of Technology, Lausanne (EPFL).

 

EUSIPCO-2008 will focus on the key aspects of signal processing theory and applications. Exploration of new avenues and methodologies of signal processing will also be encouraged. Accepted papers will be published in the Proceedings of EUSIPCO-2008. Acceptance will be based on quality, relevance and originality. Proposals for tutorials are also invited.

 

*** This year will feature some exciting events and novelties: ***

 

- We are preparing a very attractive tutorial program and for the first time, access to the tutorials will be free to all registered participants! Some famous speakers have already been confirmed, but we also hereby call for new proposals for tutorials.

- We will also have top plenary speakers, including Stéphane Mallat (Polytechnique, France), Jeffrey A. Fessler (The University of Michigan, Ann Arbor, Michigan, USA), Phil Woodland (Cambridge, UK) and Bernhard Schölkopf (Max Planck Institute, Tübingen, Germany).

- The Conference will include 12 very interesting special sessions on some of the hottest topics in signal processing. See http://www.eusipco2008.org/11.html for the complete list of those special sessions.

- The list of 22 area chairs has been confirmed: see details at http://www.eusipco2008.org/7.html

- The social program will also be very exciting, with a welcome reception at the fantastic Olympic Museum in Lausanne, facing Lake Geneva and the Alps (http://www.olympic.org/uk/passion/museum/index_uk.asp), and with the conference banquet starting with a cruise on Lake Geneva aboard a historic boat, followed by a dinner at the Casino of Montreux (http://www.casinodemontreux.ch/).

Therefore I invite you to submit your work to EUSIPCO-2008 by the deadline and to attend the Conference in August in Lausanne.


 

 

IMPORTANT DATES:

Submission deadline of full papers (5 pages A4): February 8, 2008

Submission deadline of proposals for tutorials: February 8, 2008

Notification of Acceptance: April 30, 2008

Conference: August 25-29, 2008

 

 

More details on how to submit papers and proposals for tutorials can be found on the conference web site http://www.eusipco2008.org/

Back to Top

9-16 . 5th Joint Workshop on Machine Learning and Multimodal Interaction MLMI 2008

8-10 September 2008
                     Utrecht, The Netherlands

                       http://www.mlmi.info/


The fifth MLMI workshop will be held in Utrecht, The Netherlands,
following successful workshops in Martigny (2004), Edinburgh (2005),
Washington (2006) and Brno (2007).  MLMI brings together researchers
from the different communities working on the common theme of advanced
machine learning algorithms applied to multimodal human-human and
human-computer interaction.  The motivation for creating this joint
multi-disciplinary workshop arose from the actual needs of several large
collaborative projects, in Europe and the United States.


* Important dates

Submission of papers/posters: Monday, 31 March 2008
Acceptance notifications: Monday, 12 May 2008
Camera-ready versions of papers: Monday, 16 June 2008
Workshop: 8-10 September 2008


* Workshop topics

MLMI 2008 will feature talks (including a number of invited speakers),
posters and demonstrations.  Prospective authors are invited to submit
proposals in the following areas of interest, related to machine
learning and multimodal interaction:
 - human-human communication modeling
 - audio-visual perception of humans
 - human-computer interaction modeling
 - speech processing
 - image and video processing
 - multimodal processing, fusion and fission
 - multimodal discourse and dialogue modeling
 - multimodal indexing, structuring and summarization
 - annotation and browsing of multimodal data
 - machine learning algorithms and their applications to the topics above


* Satellite events

MLMI'08 will feature special sessions and satellite events, as during
the previous editions of MLMI (see http://www.mlmi.info/ for examples).  To
propose special sessions or satellite events, please contact the special
session chair.

MLMI 2008 is broadly colocated with a number of events in related
domains: Mobile HCI 2008, 2-5 September, in Amsterdam; FG 2008, 17-19
September, in Amsterdam; and ECML 2008, 15-19 September, in Antwerp.


* Guidelines for submission

The workshop proceedings will be published in Springer's Lecture Notes
in Computer Science series (pending approval).  The first four editions
of MLMI were published as LNCS 3361, 3869, 4299, and 4892.  However,
unlike previous MLMIs, the proceedings of MLMI 2008 will be printed
before the workshop and will already be available onsite to MLMI 2008
participants.

Submissions are invited either as long papers (12 pages) or as short
papers (6 pages), and may include a demonstration proposal.  Upon
acceptance of a paper, the Program Committee will also assign to it a
presentation format, oral or poster, taking into account: (a) the most
suitable format given the content of the paper; (b) the length of the
paper (long papers are more likely to be presented orally); (c) the
preferences expressed by the authors.

Please submit PDF files using the submission website at
http://groups.inf.ed.ac.uk/mlmi08/, following the Springer LNCS format
for proceedings and other multiauthor volumes
(http://www.springer.com/east/home/computer/lncs?SGWID=5-164-7-72376-0).
 Camera-ready versions of accepted papers, both long and short, are
required to follow these guidelines and to take into account the
reviewers' comments.  Authors of accepted short papers are encouraged to
turn them into long papers for the proceedings.


* Venue

Utrecht is the fourth largest city in the Netherlands, with historic
roots back to the Roman Empire.  Utrecht hosts one of the bigger
universities in the country, and with its historic centre and its many
students it provides an excellent atmosphere for social activities
inside and outside the workshop community.  Utrecht is centrally
located in the Netherlands, and has direct train connections to the
major cities and Schiphol International Airport.

TNO, the organizer of MLMI 2008, is a not-for-profit research
organization.  TNO's speech technology research is carried out at TNO
Human Factors in Soesterberg, with research areas in ASR, speaker and
language recognition, and word and event spotting.

The workshop will be held in "Ottone", a beautiful old building near the
"Singel", the canal which encircles the city center.  The conference
hall combines a spacious setting with a warm and friendly ambiance.


* Organizing Committee

David van Leeuwen, TNO (Organization Chair)
Anton Nijholt, University of Twente (Special Sessions Chair)
Andrei Popescu-Belis, IDIAP Research Institute (Programme Co-chair)
Rainer Stiefelhagen, University of Karlsruhe (Programme Co-chair)


Back to Top

9-17 . TSD 2008 11th Int. Conf. on Text, Speech and Dialogue

Eleventh International Conference on TEXT, SPEECH and DIALOGUE (TSD 2008)

Brno, Czech Republic, 8-12 September 2008

http://www.tsdconference.org/

The conference is organized by the Faculty of Informatics, Masaryk
University, Brno, and the Faculty of Applied Sciences, University of
West Bohemia, Pilsen. The conference is supported by the International
Speech Communication Association.

Venue: Brno, Czech Republic

TSD SERIES

The TSD series has evolved as a prime forum for interaction between
researchers in both spoken and written language processing from the
former Eastern Bloc countries and their Western colleagues. Proceedings
of TSD form a book published by Springer-Verlag in their Lecture Notes
in Artificial Intelligence (LNAI) series.

 

TOPICS

Topics of the conference will include (but are not limited to):
- text corpora and tagging
- transcription problems in spoken corpora
- sense disambiguation
- links between text and speech oriented systems
- parsing issues
- parsing problems in spoken texts
- multi-lingual issues
- multi-lingual dialogue systems
- information retrieval and information extraction
- text/topic summarization
- machine translation
- semantic networks and ontologies
- semantic web
- speech modeling
- speech segmentation
- speech recognition
- search in speech for IR and IE
- text-to-speech synthesis
- dialogue systems
- development of dialogue strategies
- prosody in dialogues
- emotions and personality modeling
- user modeling
- knowledge representation in relation to dialogue systems
- assistive technologies based on speech and dialogue
- applied systems and software
- facial animation
- visual speech synthesis

Papers on processing of languages other than English are strongly
encouraged.

 

PROGRAM COMMITTEE

Frederick Jelinek, USA (general chair)
Hynek Hermansky, Switzerland (executive chair)

FORMAT OF THE CONFERENCE

The conference program will include presentations of invited papers,
oral presentations, and poster/demonstration sessions. Papers will be
presented in plenary or topic-oriented sessions. Social events,
including a trip in the vicinity of Brno, will allow for additional
informal interactions.

CONFERENCE PROGRAM

The conference program will include oral presentations and
poster/demonstration sessions with sufficient time for discussion of
the issues raised.

IMPORTANT DATES

March 15 2008 ............ Submission of abstracts
March 22 2008 ............ Submission of full papers
May 15 2008 .............. Notification of acceptance
May 31 2008 .............. Final papers (camera-ready) and registration
July 23 2008 ............. Submission of demonstration abstracts
July 30 2008 ............. Notification of acceptance for demonstrations
                           sent to the authors
September 8-12 2008 ...... Conference date

The contributions to the conference will be published in proceedings
that will be made available to participants at the time of the
conference.

 

OFFICIAL LANGUAGE

The official language of the conference is English.

ADDRESS

All correspondence regarding the conference should be addressed to:

Dana Hlavackova, TSD 2008
Faculty of Informatics, Masaryk University
Botanicka 68a, 602 00 Brno, Czech Republic
phone: +420-5-49 49 33 29
fax: +420-5-49 49 18 20
email: tsd2008@tsdconference.org

LOCATION

Brno is the second largest city in the Czech Republic, with a
population of almost 400,000, and is the country's judiciary and
trade-fair center. Brno is the capital of Moravia, in the south-east
part of the Czech Republic. It has been a Royal City since 1347, and
with its six universities it forms the cultural center of the region.
Brno can be reached easily by direct flights from London, Moscow,
Barcelona and Prague, and by trains or buses from Prague (200 km) or
Vienna (130 km).

Back to Top

9-18 . Third Workshop on Speech in Mobile and Pervasive Environments

Call for papers
      Third Workshop on Speech in Mobile and Pervasive Environments
                  (in conjunction with ACM MobileHCI '08)
                        Amsterdam, The Netherlands
                             September 2, 2008
                      http://research.ihost.com/SiMPE



In the past, voice-based applications have been accessed using
unintelligent telephone devices through voice browsers that reside on
the server. With the proliferation of pervasive devices and the
increase in their processing capabilities, client-side speech
processing has been emerging as a viable alternative. In SiMPE 2008,
the third in the series, we will continue to explore the various
possibilities and issues that arise while enabling speech processing
on resource-constrained, possibly mobile devices.



Topics of Interest:

All areas that enable, optimise or enhance Speech in mobile and pervasive
environments and devices. Possible areas include, but are not restricted
to:
      * Robust Speech Recognition in Noisy and Resource-constrained
Environments
      * Memory/Energy Efficient Algorithms
      * Multimodal User Interfaces for Mobile Devices
      * Protocols and Standards for Speech Applications
      * Distributed Speech Processing
      * Mobile Application Adaptation and Learning
      * Prototypical System Architectures
      * User Modelling
      * HCI issues in SiMPE applications
      * Design and cultural issues in SiMPE
      * Speech interfaces/applications for Developing Regions



Submissions:

We seek original, unpublished papers in the following three categories:
(a) position papers that describe novel ideas that can lead to
interesting research directions, (b) early results or work-in-progress
that shows significant promise, or (c) full papers. Papers should be
4-8 pages in length in the MobileHCI publication format. The LaTeX and
Microsoft Word templates are available at the workshop website. All
submissions should be in PDF format and should be submitted
electronically through the workshop submission web site,
http://www.easychair.org/conferences/?conf=simpe08. Since the
submission deadlines are dependent on the MobileHCI conference, we will
not be able to grant any extensions under any circumstances.

For any comments regarding submissions and participation, contact:
simpe08@easychair.org



Key Dates:

 * Paper Submission Deadline: May 05, 2008 (11:59 PM CET)
 * Notification of Acceptance: May 19, 2008
 * Early Registration Deadline: June 02, 2008
 * Workshop: September 02, 2008.



Organising Committee:

Amit A. Nanavati, IBM India Research Laboratory.
Nitendra Rajput, IBM India Research Laboratory.
Alexander I. Rudnicky, Carnegie Mellon University.
Markku Turunen, University of Tampere, Finland.



Programme Committee:

Lou Boves, University of Nijmegen, The Netherlands
Matt Jones, Swansea University, UK
Yoon Kim, Novauris Technologies, USA
Lars Bo Larsen, Aalborg University, Denmark
Gary Marsden, University of Cape Town, South Africa
Michael McTear, University of Ulster, Ireland
Shrikanth S. Narayanan, University of Southern California, USA
Tim Paek, Microsoft, USA
David Pearce, Motorola, UK.
Mike Phillips, Vlingo, USA
Markku Turunen, University of Tampere, Finland
Yaxin Zhang, Motorola, China
(More to be updated)



Websites:

 * SiMPE Workshop: http://research.ihost.com/SiMPE
 * ACM MobileHCI '08: http://www.mobilehci.org/
 * SiMPE 2007: http://research.ihost.com/SiMPE/2007
 * SiMPE 2006: http://research.ihost.com/SiMPE/2006

Back to Top

9-19 . 50th International Symposium ELMAR-2008

10-13 September 2008, Zadar, Croatia

http://www.elmar-zadar.org/

TECHNICAL CO-SPONSORS

IEEE Region 8
EURASIP - European Assoc. Signal, Speech and Image Processing
IEEE Croatia Section
IEEE Croatia Section Chapter of the Signal Processing Society
IEEE Croatia Section Joint Chapter of the AP/MTT Societies

TOPICS

--> Image and Video Processing
--> Multimedia Communications
--> Speech and Audio Processing
--> Wireless Communications
--> Telecommunications
--> Antennas and Propagation
--> e-Learning and m-Learning
--> Navigation Systems
--> Ship Electronic Systems
--> Power Electronics and Automation
--> Naval Architecture
--> Sea Ecology
--> Special Session Proposals - a special session consists of 5-6
papers which should present a unifying theme from a diversity of
viewpoints; the deadline for proposals is February 04, 2008.

KEYNOTE TALKS

* Professor Sanjit K. Mitra, University of Southern California,
Los Angeles, California, USA:
Image Processing using Quadratic Volterra Filters

* Univ.Prof.Dr.techn. Markus Rupp, Vienna University of Technology,
AUSTRIA:
Testbeds and Rapid Prototyping in Wireless Systems

* Professor Paul Cross, University College London, UK:
GNSS Data Modeling: The Key to Increasing Safety and Legally Critical
Applications of GNSS

* Dr.-Ing. Malte Kob, RWTH Aachen University, GERMANY:
The Role of Resonators in the Generation of Voice Signals

SPECIAL SESSIONS

SS1: "VISNET II - Networked Audiovisual Systems"
Organizer: Dr. Marta Mrak, I-lab, Centre for Communication Systems
Research, University of Surrey, UNITED KINGDOM
Contact: http://www.ee.surrey.ac.uk/CCSR/profiles?s_id=3937

SS2: "Computer Vision in Art"
Organizers: Asst.Prof. Peter Peer and Dr. Borut Batagelj, University of
Ljubljana, Faculty of Computer and Information Science, Computer Vision
Laboratory, SLOVENIA
Contact: http://www.lrv.fri.uni-lj.si/~peterp/ or
http://www.fri.uni-lj.si/en/personnel/298/oseba.html

SUBMISSION

Papers accepted by two reviewers will be published in the symposium
proceedings, available at the symposium and abstracted/indexed in the
INSPEC and IEEE Xplore databases. More info is available here:
http://www.elmar-zadar.org/

IMPORTANT: Web-based (online) submission of papers in PDF format is
required for all authors. No e-mail, fax, or postal submissions will
be accepted. Authors should prepare their papers according to the
ELMAR-2008 paper sample, convert them to PDF based on IEEE
requirements, and submit them using the web-based submission system
by March 03, 2008.

SCHEDULE OF IMPORTANT DATES

Deadline for submission of full papers: March 03, 2008
Notification of acceptance mailed out by: April 21, 2008
Submission of (final) camera-ready papers: May 05, 2008
Preliminary program available online by: May 12, 2008
Registration forms and payment deadline: May 19, 2008
Accommodation deadline: June 02, 2008

GENERAL CO-CHAIRS

Ive Mustac, Tankerska plovidba, Zadar, Croatia
Branka Zovko-Cihlar, University of Zagreb, Croatia

PROGRAM CHAIR

Mislav Grgic, University of Zagreb, Croatia

CONTACT INFORMATION

Assoc.Prof. Mislav Grgic, Ph.D.
FER, Unska 3/XII
HR-10000 Zagreb
CROATIA
Telephone: + 385 1 6129 851
Fax: + 385 1 6129 568
E-mail: elmar2008 (_) fer.hr

For further information please visit:

http://www.elmar-zadar.org/

Back to Top

9-20 . Dynamique de la nasalité (Dynamics of Nasality), Ile de Porquerolles

GIPSA-Lab and LPP are organizing a multidisciplinary CNRS thematic school on the theme of nasality:

"Dynamique de la nasalité" (Dynamics of Nasality)

15-19 September 2008, on the island of Porquerolles.

Nasality and nasalization processes in languages are phenomena studied in several disciplinary fields, some close and some distant, which do not necessarily meet (dialectology, linguistic typology, acoustic and articulatory phonetics, phonology, morphophonology, historical linguistics, signal processing...). The disciplines involved in the study of these processes are research areas that each have their own networks of scientific dissemination, with their own publications, conferences and other scientific events, and very often even separate laboratory structures. These distinct spaces for producing and disseminating knowledge make exchanges between researchers of these communities difficult. Yet specialists in these disciplines agree that nasalization processes are complex dynamic phenomena whose linguistic, physical and physiological mechanisms have not yet been clearly elucidated. The objectives of this school are to support and encourage the convergence of these related disciplines, to bring together scientific communities that each evolve within their own specific networks, to learn each other's terminologies and formalisms, to exchange tools and data, to stimulate interaction, and to foster transdisciplinarity in research projects. The school will also be a means of fostering the creation of networks for exchange and collaboration, as well as transdisciplinary research projects.

Intended audience: doctoral students, engineers, researchers and faculty members in the fields of speech and language

Capacity: 50 participants

More information on the website:
http://lpp.univ-paris3.fr/productions/conferences/2008/Dynamique_nasalite/

 

Back to Top

9-21 . 2008 International Workshop on Multimedia Signal Processing

October 8-10, 2008 
Shangri-la Hotel Cairns, Queensland, Australia 
http://www.mmsp08.org/  
MMSP-08 Call for Papers

MMSP-08 is the tenth international workshop on multimedia signal
processing. The workshop is organized by the Multimedia Signal
Processing Technical Committee of the IEEE Signal Processing Society.
A new theme of this workshop is Bio-Inspired Multimedia Signal
Processing in Life Science Research. The main goal of MMSP-2008 is to
further the scientific research within the broad field of multimedia
signal processing and its interaction with other new emerging areas
such as life science. The workshop will focus on major trends and
challenges in this area, including brainstorming a roadmap for the
success of future research and application.

The MMSP-08 workshop features:
* A Student Paper Contest with awards sponsored by Canon. To enter the
contest a paper submission must have a student as the first author.
* A Best Paper award for the oral presentation sessions, sponsored by
Microsoft.
* A Best Poster presentation award, sponsored by National ICT
Australia (NICTA).
* A new session on Bio-Inspired Multimedia Signal Processing.

SCOPE

Papers are solicited in, but not limited to, the following general
areas:
* Bio-inspired multimedia signal processing
* Multimedia processing techniques inspired by the study of
signals/images derived from medical, biomedical and other life science
disciplines, with applications to multimedia signal processing
* Fusion mechanisms of multimodal signals in the human information
processing system and applications to multimodal multimedia data
fusion/integration
* Comparison between bio-inspired methods and conventional methods
* Hybrid multimedia processing technology and systems incorporating
bio-inspired and conventional methods
* Joint audio/visual processing, pattern recognition, sensor fusion,
medical imaging, 2-D and 3-D graphics/geometry coding and animation,
pre/post-processing of digital video, joint source/channel coding,
data streaming, speech/audio, image/video coding and processing
* Multimedia databases (content analysis, representation, indexing,
recognition and retrieval)
* Human-machine interfaces and interaction using multiple modalities
* Multimedia security (data hiding, authentication, and access control)
* Multimedia networking (priority-based QoS control and scheduling,
traffic engineering, soft IP multicast support, home networking
technologies, position-aware computing, wireless communications)
* Multimedia systems design, implementation and application (design,
distributed multimedia systems, real-time and non-real-time systems;
implementation; multimedia hardware and software)
* Standards

SCHEDULE

* Special Sessions (contact the respective chair): March 8, 2008
* Papers (full paper, 4-6 pages, to be received by): April 18, 2008
* Notification of acceptance by: June 18, 2008
* Camera-ready paper submission by: July 18, 2008
 
General Co-Chairs
Prof. David Feng, University of Sydney, Australia, and Hong Kong
Polytechnic University  feng@it.usyd.edu.au
Prof. Thomas Sikora, Technical University Berlin, Germany  sikora@nue.tu-berlin.de
Prof. W.C. Siu, Hong Kong Polytechnic University  enwcsiu@polyu.edu.hk

Technical Program Co-Chairs
Dr. Jian Zhang, National ICT Australia  jian.zhang@nicta.com.au
Prof. Ling Guan, Ryerson University, Canada  lguan@ee.ryerson.ca
Prof. Jean-Luc Dugelay, Institut Eurecom, Sophia Antipolis, France  Jean-Luc.Dugelay@eurecom.fr

Special Session Co-Chairs
Prof. Wenjun Zeng, University of Missouri, USA  zengw@missouri.edu
Prof. Pascal Frossard, EPFL, Switzerland  pascal.frossard@epfl.ch
Back to Top

9-22 . 4th IBM Watson Emerging leaders in Multimedia at IBM Watson.

The IBM Watson “Emerging Leaders in Multimedia” workshop series is an annual event organized to recognize outstanding student researchers in the multimedia area. We are currently inviting student applications for the fourth workshop in this series. This is a two-day event that will be held on October 16 and 17, 2008 at the IBM T. J. Watson Research Center in Hawthorne, New York. The workshop will consist of student research presentations, demonstrations of multimedia projects currently underway at IBM, and several interactive sessions among students and researchers on open and emerging problems in the field and exciting directions for future research. Please visit the following website http://domino.research.ibm.com/comm/research.nsf/pages/r.multimedia.workshop2008.html for more information.

We plan to invite 8 exceptional graduate students working in these areas to visit our labs (expenses covered by IBM), present their research, and learn about state-of-the-art industrial media research at this workshop. We encourage mid- to senior-level PhD students from CS, EE, ECE, and all other relevant disciplines to apply. The application package should include a short (2-3 paragraph) abstract that describes the student's current research, an up-to-date resume with a list of publications, and a letter of support from the student's thesis advisor. Additional supporting material is optional.
Please submit your applications by August 24, 2008 to Gayathri Shaikh (g3@us.ibm.com) or Ying Li (yingli@us.ibm.com).




Back to Top

9-23 . 2008 IEEE Intl Workshop on MACHINE LEARNING FOR SIGNAL PROCESSING

2008 IEEE International Workshop on MACHINE LEARNING FOR SIGNAL PROCESSING
(Formerly the IEEE Workshop on Neural Networks for Signal Processing)

October 16-19, 2008 Cancun, Mexico
Fiesta Americana Condesa Cancun, www.fiestamericana.com

Deadlines:
Submission of full paper:                     May 5, 2008
Notification of acceptance:                     June 16, 2008
Camera-ready paper and author registration:     June 23, 2008
Advance registration before:                    July 1, 2008

http://mlsp2008.conwiz.dk/

The workshop will feature keynote addresses, technical presentations, special
sessions and tutorials organized in two themes that will be included in the
registration. Tutorials will take place on the afternoon of 16 October, and
the workshop will begin on 17 October. The two themes for MLSP 2008 are
Cognitive Sensing and Kernel Methods for Nonlinear Signal Processing. Papers
are solicited for, but not limited to, the following areas:

Algorithms and Architectures:
Artificial neural networks, kernel methods, committee models, Gaussian
processes, independent component analysis, advanced (adaptive, nonlinear)
signal processing, (hidden) Markov models, Bayesian modeling, parameter
estimation, generalization, optimization, design algorithms.

Applications:
Speech processing, image processing (computer vision, OCR) medical imaging,
multimodal interactions, multi-channel processing, intelligent multimedia and
web processing, robotics, sonar and radar, biomedical engineering, financial
analysis, time series prediction, blind source separation, data fusion, data
mining, adaptive filtering, communications, sensors, system identification,
and other signal processing and pattern recognition applications.

Implementations:
Parallel and distributed implementation, hardware design, and other general
implementation technologies.

For the fourth consecutive year, a Data Analysis and Signal Processing
Competition is being organized in conjunction with the workshop. The goal of
the competition is to advance the current state-of-the-art in theoretical and
practical aspects of signal processing domains. The problems are selected to
reflect current trends, evaluate existing approaches on common benchmarks, and
identify critical new areas of research. Previous competitions produced novel
and effective approaches to challenging problems, advancing the mission of the
MLSP community. A description of the competition, the submissions, and the
results, will be included in a paper which will be published in the
proceedings. Winners will be announced and awards given at the workshop.

Selected papers from MLSP 2008 will be considered for a special issue of The
Journal of Signal Processing Systems for Signal, Image, and Video Technology,
to appear in 2009. The MLSP technical committee may invite one or more winners
of the data analysis and signal processing competition to submit a paper
describing their methodology to the special issue.

Paper Submission Procedure
Prospective authors are invited to submit a double column paper of up to six
pages using the electronic submission procedure at http://mlsp2008.conwiz.dk.
Accepted papers will be published on a CDROM to be distributed at the
workshop.

MLSP 2008 webpage: http://mlsp2008.conwiz.dk/

MLSP 2008 ORGANIZING COMMITTEE:

General Chair
Jose Principe

Program Chair
Deniz Erdogmus

Technical Chair
Tulay Adali

Publicity Chairs
Ignacio Santamaria
Marc Van Hulle

Publication Chair
Jan Larsen

Data Competition
Ken Hild
Vince Calhoun

Local Arrangements
Juan Azuela

Back to Top

9-24 . V Jornadas en Tecnologia de Habla and Evaluation campaigns Bilbao Spain

VJTH’2008 – CALL FOR PAPERS

5th Workshop on Speech Technology                      V Jornadas en Tecnología del Habla

November 12-14, 2008, Bilbao, Spain

http://jth2008.ehu.es

Organized by the Aholab-Signal Processing Laboratory of the Dept. of Electronics and Telecommunications of the University of the Basque Country (UPV/EHU) and supported by the Spanish Thematic Network on Speech Technologies and ISCA.

The “V Jornadas en Tecnología del Habla” (http://jth2008.ehu.es) will be held on November 12-14, 2008 in Bilbao, Spain. Previous workshops were held in Sevilla (2000), Granada (2002), Valencia (2004) and Zaragoza (2006). The aim of the workshop is to present and discuss the wide range of speech technologies and applications related to Iberian languages. The workshop will feature technical presentations, special sessions and invited conferences, all of which will be included in the registration. During the workshop, the results of the ALBAYZIN 08 evaluation campaigns will be presented and best paper awards will be given.

The main topics of the workshop are:


  • Speech recognition and understanding
  • Speech synthesis
  • Signal processing and feature extraction
  • Natural language processing
  • Dialogue systems
  • Automatic translation
  • Speech perception
  • Speech coding
  • Speaker and language identification
  • Speech and language resources
  • Information retrieval
  • Applications for handicapped persons
  • Applied systems for advanced interaction


 

Invited Speakers:

·         Nestor Becerra (Universidad de Santiago, Chile)

Applications of speech technologies in CALL (Computer Aided Language Training) and CAPT (Computer Aided Pronunciation Training) systems

·         Giuseppe Riccardi (University of Trento, Italy)

Next Generation Spoken Language Interfaces

·         Björn Granström (KTH - Royal Institute of Technology, Sweden)

Embodied conversational agents in verbal and non-verbal communication

·         Yannis Stylianou (University of Crete, Greece)

Voice Conversion: State of the art and Perspectives

Important dates:

·         Full paper submission: July 20, 2008

  • Notification of acceptance: October 1, 2008
  • Conference V JTH 2008: November 12-14, 2008

Contact information:

VJTH’2008

Dept. Electronics and Telecommunications

Faculty of Engineering

Alda. Urkijo s/n

48013 Bilbao

Tel.: +34 946 013 969

Fax.: +34 946 014 259

E-mail: 5jth@ehu.es           Web: http://jth2008.ehu.es

EVALUATION CAMPAIGNS

ALBAYZIN-08 System Evaluation Proposal

The Speech Technologies Thematic Network ("Red Temática en Tecnologías del Habla") is a common forum where researchers on speech technologies can work together and share experiences in order to:

  • Promote speech technology research, attracting new young researchers by means of training courses, student exchanges, grants and awards.
  • Attract investment from companies for speech technology research, looking for new applications that can bring business opportunities. These applications must be shown in demonstrators that can attract companies' interest.
  • Make progress in creating collaboration ties among the Network members, reinforcing Spain's leadership in speech technologies for Spanish as well as for the co-official languages, such as Catalan, Basque or Galician.

In order to promote speech technology research by new young researchers, the "Red Temática en Tecnologías del Habla" organizes a system evaluation campaign. Registration forms:

  • http://gtts.ehu.es:8080/RTTH-LRE08/Formulario.jsp
  • http://jth2008.ehu.es/form_ALBAYZIN08_CTV_en.pdf
  • http://jth2008.ehu.es/form_ALBAYZIN08_TA_en.pdf

These are the conditions for the participants:

  • Participants undertake to present the evaluation results in a special session during the V Jornadas en Tecnología del Habla.
  • Participants can take part individually or as a team.

Back to Top

9-25 . 10th International Conference on Multimodal Interfaces (ICMI 2008)

The Tenth International Conference on Multimodal Interfaces (ICMI 2008) will take place in Chania, Greece, on October 20-22, 2008. The main aim of ICMI 2008 is to further scientific research within the broad field of multimodal interaction and systems. The conference will focus on major trends and challenges in this area, including helping to identify a roadmap for future research and commercial success. ICMI 2008 will feature a main conference with keynote speakers, panel discussions, technical paper presentations and discussion (single track), poster sessions, and demonstrations of state-of-the-art multimodal concepts and systems. Organized on the island of Crete, ICMI 2008 provides excellent conditions for brainstorming and sharing the latest advances in multimodal interaction and systems in an inspired setting full of history, mythology and art.

Paper Submission

There are two different submission categories: regular paper and short
paper. The page limit is 8 pages for regular papers and 4 pages for
short papers. The presentation style (oral or poster) will be decided
based on suitable delivery of the content.

Demo Submission

Proposals for demonstrations shall be submitted to the demo chairs
electronically. A 1-2 page description of the demonstration is required.

Doctoral Spotlight

Doctoral Student Travel Support and Spotlight Session: funds are
expected from NSF to support participation of doctoral candidates at
ICMI 2008, and a spotlight session is planned to showcase ongoing
thesis work. Students interested in travel support can submit a short
or long paper as specified above.

Topics of interest include:

* Multimodal and Multimedia processing
* Multimodal input and output interfaces
* Multimodal applications
* User Modeling and Adaptation
* Multimodal Architectures, Tools and Standards
* Evaluation of Multimodal Interfaces

Important Dates:

Paper submission: May 23, 2008
Author notification: July 14, 2008
Camera-ready deadline: August 15, 2008
Conference: October 20-22, 2008

Organizing Committee

General Co-Chairs
Vassilis Digalakis, TU Crete, Greece
Alex Potamianos, TU Crete, Greece
Matthew Turk, UC Santa Barbara, USA

Program Co-Chairs
Roberto Pieraccini, SpeechCycle, USA
Jian Wang, Microsoft Research, China
Yuri Ivanov, MERL, USA
Back to Top

9-26 . 9th International Conference on Signal Processing

Oct. 26-29, 2008 Beijing, CHINA
 
The 9th International Conference on Signal Processing will be held in Beijing,
China on Oct. 26-29, 2008. It will include sessions on all aspects of the theory,
design and applications of signal processing. Prospective authors are invited
to submit papers in any of the following areas, but not limited to:
 
A. Digital Signal Processing (DSP)
B. Spectrum Estimation & Modeling
C. TF Spectrum Analysis & Wavelet
D. Higher Order Spectral Analysis
E. Adaptive Filtering & SP
F. Array Signal Processing
G. Hardware Implementation for SP
H. Speech and Audio Coding
I. Speech Synthesis & Recognition
J. Image Processing & Understanding
K. PDE for Image Processing
L. Video compression & Streaming
M. Computer Vision & VR
N. Multimedia & Human-computer Interaction
O. Statistical Learning, ML & Pattern Recognition
P. AI & Neural Networks
Q. Communication Signal Processing
R. SP for Internet, Wireless and Communications
S. Biometrics & Authentication
T. SP for Bio-medical & Cognitive Science
U. SP for Bio-informatics
V. Signal Processing for Security
W. Radar Signal Processing 
X. Sonar Signal Processing and Localization
Y. SP for Sensor Networks
Z. Application & Others
 
PAPER SUBMISSION GUIDELINE
Prospective authors are invited to submit full papers, including the title of
the paper, authors' names, addresses, telephone and fax numbers, e-mail
addresses and topic area, by uploading the electronic submission in .pdf
format to

http://icsp08.bjtu.edu.cn

before June 15, 2008.
 
PROCEEDINGS
The proceedings, with IEEE and Library of Congress catalog numbers, will be
published prior to the conference in both hardcopy and CD-ROM and distributed
to all registered participants at the conference. The proceedings will be
indexed by EI.
 
LANGUAGE
The working language is English.
 
TOURS
Activities and tours for accompanying persons will be arranged by the
Organizing Committee.
 
DEADLINES
Submission of papers               June 15, 2008
Notification of acceptance         July 15, 2008
Submission of Camera-ready papers  Aug. 15, 2008
Pre-registration                   Sept. 20, 2008
       
 
Please visit http://icsp08.bjtu.edu.cn for more details.
 
Sponsor 
IEEE Beijing Section
Technical Co-sponsor
IEEE Signal Processing Society   
Co-sponsors
The Chinese Institute of Electronics
IET
URSI
Nat. Natural Sci. Foundation of China
IEEE SP Society Beijing Chapter
IEEE Computer Society Beijing Chapter
Japan China Science and Technology Exchange Association
 
Organizers
Beijing Jiaotong University
CIE Signal Processing Society
 
Technical Program Committee
Prof. RUAN Qiuqi
Beijing Jiaotong University
Beijing 100044, CHINA
Tel.: (8610)5168-8616, 5168-8073
Email: bzyuan@bjtu.edu.cn
 
Organizing Committee
Mr. ZHOU Mengqi
P.O. Box 165, Beijing 100036,CHINA
Email: zhoumq@public3.bta.net.cn
 
Secretary
Ms. TANG Xiaofang
Email: bfxxstxf@bjtu.edu.cn
Back to Top

9-27 . 8th International Seminar on Speech Production - ISSP 2008

We are pleased to announce that the eighth International Seminar on Speech Production - ISSP 2008 will be held in Strasbourg, Alsace, France from the 8th to the 12th of December, 2008.

We are looking forward to continuing the tradition established at previous ISSP meetings in Grenoble, Leeds, Old Saybrook, Autrans, Kloster Seeon, Sydney, and Ubatuba of providing a congenial forum for presentation and discussion of current research in all aspects of speech production.

The following invited speakers have agreed to present their ongoing research:

Vincent Gracco
McGill University, Montreal, Canada
General topic: Neural control of speech production and perception

Sadao Hiroya
Boston University, United States
General topic: Speech production and perception, brain imaging and stochastic speech production modeling

Alexis Michaud
Phonetics and Phonology Laboratory of Université Paris III, Paris, France
General topic: Prosody in tone languages

Marianne Pouplier
Institute for Phonetics and Speech Communication, Munich, Germany
General topic: Articulatory speech errors

Gregor Schöner
Institute for Neuroinformatics, Bochum, Germany
General topic: Motor control of multi-degree-of-freedom movements

Topics covered

Topics of interest for ISSP'2008 include, but are not restricted to, the following:

  • Articulatory-acoustic relations
  • Perception-action control
  • Intra- and inter-speaker variability
  • Articulatory synthesis
  • Acoustic to articulatory inversion
  • Connected speech processes
  • Coarticulation
  • Prosody
  • Biomechanical modeling
  • Models of motor control
  • Audiovisual synthesis
  • Aerodynamic models and data
  • Cerebral organization and neural correlates of speech
  • Disorders of speech motor control
  • Instrumental techniques
  • Speech and language acquisition
  • Audio-visual speech perception
  • Plasticity of speech production and perception

In addition, the following special sessions are currently being planned:

1. Speech inversion (Yves Laprie)

2. Experimental techniques investigating speech (Susanne Fuchs)

For abstract submission, please include:

1) the name(s) of the author(s);

2) affiliations and a contact e-mail address;

3) whether you prefer an oral or a poster presentation, in the first lines of the body of the message.

All abstracts should be no longer than 2 pages (12-point Times font) and written in English.

Deadline for abstract submission is the 28th of March 2008. All details can be viewed at

http://issp2008.loria.fr/

Notification of acceptance will be given on the 21st of April, 2008.

The organizers:

Rudolph Sock

Yves Laprie

Susanne Fuchs

 

Back to Top

9-28 . 1st CfP 15th International MultiMedia Modeling Conference (MMM2009)

FIRST CALL FOR PAPERS
The 15th International MultiMedia Modeling Conference (MMM2009)
7-9 January 2009,
Institut EURECOM, Sophia Antipolis, France.
http://mmm2009.eurecom.fr
 
===============================================================
 
The International MultiMedia Modeling (MMM) Conference is a 
leading international conference for researchers and industry 
practitioners to share their new ideas, original research results 
and practical development experiences from all MMM-related areas. 
The conference calls for original high-quality papers in, but not 
limited to, the following areas related to multimedia modeling 
technologies and applications:
 
1. Multimedia Content Analysis
1.1 Multimodal Content Analysis
1.2 Media Assimilation and Fusion
1.3 Content-Based Multimedia Retrieval and Browsing
1.4 Multimedia Indexing
1.5 Multimedia Abstraction and Summarization
1.6 Semantic Analysis of Multimedia Data
1.7 Statistical Modeling of Multimedia Data
2. Multimedia Signal Processing and Communications
2.1 Media Representation and Algorithms
2.2 Audio, Image, Video Processing, Coding and Compression
2.3 Multimedia Database, Content Delivery and Transport
2.4 Multimedia Security and Content Protection
2.5 Wireless and Mobile Multimedia Networking
2.6 Multimedia Standards and Related Issues
3. Multimedia Applications and Services
3.1 Real-Time, Interactive Multimedia Applications
3.2 Ambient Multimedia Applications
3.3 Multi-Modal Interaction
3.4 Virtual Environments
3.5 Personalization
3.6 Collaboration, Contextual Metadata, Collaborative Tagging
3.7 Web Applications
3.8 Multimedia Authoring
3.9 Multimedia-Enabled New Applications
(E-Learning, Entertainment, Health Care, Web2.0, SNS, etc.)
 
Paper Submission Guidelines
Papers should be no more than 12 pages in length, conforming to 
the formatting instructions of the Springer Verlag LNCS series 
(www.springer.com/lncs). Papers will be judged by an international 
program committee based on their originality, significance, 
correctness and clarity. All papers should be submitted 
electronically in PDF format at the MMM2009 paper submission website: 
http://mmm2009.eurecom.fr
For a paper to be published in the proceedings, one of the authors 
must register and present it at the conference.
Authors of selected papers will be invited to submit extended 
versions to the EURASIP Journal on Image and Video Processing.
 
Important Dates
Submission of full papers: 6 Jul. 2008 (23:59 Central European Time (GMT+1))
Notification of acceptance: 15 Sep. 2008
Camera-ready Copy Due: 10 Oct. 2008
Author registration: 10 Oct. 2008
Conference: 7-9 Jan. 2009
 
General Chair
Benoit HUET, Institut EURECOM
 
Program Co-Chairs
Alan SMEATON, Dublin City University
Ketan MAYER-PATEL, UNC-Chapel Hill
Yannis AVRITHIS, National Technical University of Athens
 
Local Organizing Co-Chairs
Jean-Luc DUGELAY, Institut EURECOM
Bernard MERIALDO, Institut EURECOM
 
Demo Chair
Ana Cristina ANDRES DEL VALLE, Accenture Technology Labs
 
Finance Chair
Marc ANTONINI, University Nice Sophia-Antipolis
 
Publication Chair
Thierry DECLERCK, DFKI GmbH
 
Publicity & Sponsorship Chair
Nick EVANS, Institut EURECOM
 
US Liaison
Ketan MAYER-PATEL, UNC-Chapel Hill
 
Asian Liaison
Liang Tien CHIA, Nanyang Technological University, Singapore
 
European Liaison
Susanne BOLL, University of Oldenburg
 
Steering Committee
Yi-Ping Phoebe CHEN, Deakin University , Australia
Tat-Seng CHUA, National University of Singapore, Singapore
Tosiyasu L. KUNII, Kanazawa Institute of Technology, Japan
Wei-Ying MA, Microsoft Research Asia, Beijing, China
Nadia MAGNENAT-THALMANN, University of Geneva, Switzerland
Patrick SENAC, ENSICA, France
 
 
In cooperation with Institut EURECOM and ACM SIGMM
Back to Top