ISCA Archive

Sixth ISCA Workshop on Speech Synthesis

Bonn, Germany
August 22-24, 2007

Author Index and Quick Access to Abstracts

Names written in boldface refer to first authors, in CAPITAL letters to keynote and invited papers. Full papers and demonstration material (audio, presentations, posters as far as available) can be accessed from the abstracts (ISCA members only).

Ackermann   Adell   Adsett (58)   Adsett (316)   Andre   Aylett   Bachmann   Badino   Bailly   Bansal   Barbisch   Barbot (85)   Barbot (211)   Barreaud   Le Beux   Birkholz   Black (90)   Black (182)   Black (188)   Black (294)   Black (322)   Black (380)   BLACK (392)   Boeffard (85)   Boeffard (119)   Boeffard (211)   Boidin   Bonafonte (194)   Bonafonte (223)   Braunschweiler   Breton   Breuer (5)   Breuer (77)   Breuer (166)   Breuer (282)   Buchholz   Bunnell   Cabral   Cahill   Campbell, Nick   Campbell, Pauline   Carson-Berndsen   Chevelu   Chomphan   Chu (46)   Chu (206)   Clark   Commandeur   Cozijn   d’Alessandro   Damnati   Damper   Das Mandal   Datta   Delhay   Demenko (71)   Demenko (77)   DePlacido   Dogil   Erro   Escudero   Fernandez   Figueira   Gabbouj   Gangadharaiah   Gonzalvo   Govokhina   Gruber   Gu   Hansakunbuntheung   Helander   Hertrich   Hirai   Hirose (148)   Hirose (154)   van Hooijdonk   Huckvale   Hunecke   Iriondo   Jilka   Kain (11)   Kain (172)   Kala   Kato   Kawai   King (125)   King (258)   Kirkpatrick   Klabbers   Klessa   Kobayashi (125)   Kobayashi (160)   Kominek   Kong   Koppinen   Krahmer   KRÖGER   Krul   Kumar   Kuroiwa   Lacroix   Lambert   Langner   Lasarcyk   Latacz   Lee   Liang   Liddell   Lilley   Lolive   Lyudovyk   Macek   Maia (28)   Maia (131)   Marchand (58)   Marchand (316)   Marsi   Martínez   Masuko   Matousek   Mendes   Mesbahi   Miao   Minematsu (148)   Minematsu (154)   Mishra (246)   Mishra (339)   Möbius (71)   Möbius (304)   Moers   Monzo   Moreno   Moudenc   Nakamura, Kenichi   Nakamura, Satoshi   Nankaku (131)   Nankaku (333)   Ni   Nose   O’Brien   Ohta   Ohtani (101)   Ohtani (107)   Oliveira   Owens   Paiva   Pammi   Paulo   Peng   Prahallad (90)   Prahallad (188)   Qian   Raj   Ramabhadran   Rao   Reichel   Renals (113)   Renals (125)   Richmond   Rilliard   Robeiko   Romportl   Rosé   Roux   Säuberlich   Sagisaka   Sakai   Sako   van Santen (11)   van Santen (172)   van Santen (246)   van Santen (339)   Sarkar   Saruwatari (101)   Saruwatari (107)   Scaife   Schnell   Schröder   Schultz   Schweitzer   Sequeira   Shechtman   Shikano (101)   Shikano (107)   Silen   Socoró   Soong (46)   Soong (137)   Soong (206)   Steiner   Sun   Szymanski   Tani   Tenpaku   Tihelka   Toda (28)   Toda (101)   Toda (107)   Toda (125)   Toda (131)   Toda (333)   Tokuda (28)   Tokuda (125)   Tokuda (131)   Tokuda (294)   Tokuda (333)   Toth   Tsuzaki   Tucker Prud’hommeaux   Vala   Verhelst   Visagie   Vogt   Wagner, Agnieszka   Wagner, Petra   Wang   Watanabe   Weiss   Wollermann   Wolters   Xiao   Yamagishi (81)   Yamagishi (113)   Yamagishi (125)   Yamagishi (294)   Yanagisawa   Yuvaraj   Yvon   Zen (125)   Zen (131)   Zen (294)   Zhang   Zhao (46)   Zhao (206)

Bibliographic Reference

[SSW6-2007] Sixth ISCA Tutorial and Research Workshop on Speech Synthesis (SSW6), Bonn, Germany, August 22-24, 2007, ed. by Petra Wagner, Julia Abresch, Stefan Breuer, and Wolfgang Hess (Bonn, Germany, 2007)

Table of Contents and Access to Abstracts

Keynote 1

Kröger, Bernd J.: "Perspectives for articulatory speech synthesis", 391 (abstract).

Various Topics

Govokhina, Oxana / Bailly, Gérard / Breton, Gaspard: "Learning optimal audiovisual phasing for an HMM-based control model for facial animation", 1-4.

Birkholz, Peter / Steiner, Ingmar / Breuer, Stefan: "Control concepts for articulatory speech synthesis", 5-10.

Kain, Alexander B. / Miao, Qi / Santen, Jan P. H. van: "Spectral control in concatenative speech synthesis", 11-16.

Kirkpatrick, Barry / O’Brien, Darragh / Scaife, Ronán: "Feature transformation applied to the detection of discontinuities in concatenated speech", 17-21.

Expressive Speech Synthesis

Campbell, Nick: "Towards conversational speech synthesis; lessons learned from the expressive speech processing project", 22-27.

Sakai, Shinsuke / Ni, Jinfu / Maia, Ranniery / Tokuda, Keiichi / Tsuzaki, Minoru / Toda, Tomoki / Kawai, Hisashi / Nakamura, Satoshi: "Communicative speech synthesis with XIMERA: a first step", 28-33.

Fernandez, Raul / Ramabhadran, Bhuvana: "Automatic exploration of corpus-specific properties for expressive text-to-speech: a case study in emphasis", 34-39.

Wollermann, Charlotte / Lasarcyk, Eva: "Modeling and perceiving of (un-)certainty in articulatory speech synthesis", 40-45.

Wang, Lijuan / Chu, Min / Peng, Yaya / Zhao, Yong / Soong, Frank K.: "Perceptual annotation of expressive speech", 46-51.

Poster Session 1

Schnell, Karl / Lacroix, Arild: "Joint analysis of speech frames for synthesis based on lossy tube models", 52-57.

Adsett, Connie R. / Marchand, Yannick: "Are rule-based syllabification methods adequate for languages with low syllabic complexity? the case of Italian", 58-63.

Huckvale, Mark / Yanagisawa, Kayoko: "Spoken language conversion with accent morphing", 64-70.

Demenko, Grazyna / Wagner, Agnieszka / Jilka, Matthias / Möbius, Bernd: "Comparative investigation of peak alignment in Polish and German unit selection corpora", 71-76.

Klessa, Katarzyna / Szymanski, Marcin / Breuer, Stefan / Demenko, Grazyna: "Optimization of Polish segmental duration prediction with CART", 77-80.

Hirai, Toshio / Yamagishi, Junichi / Tenpaku, Seiichi: "Utilization of an HMM-based feature generation module in 5 ms segment concatenative speech synthesis", 81-84.

Lolive, Damien / Barbot, Nelly / Boeffard, Olivier: "Clustering algorithm for F0 curves based on hidden Markov models", 85-89.

Kumar, Rohit / Gangadharaiah, Rashmi / Rao, Sharath / Prahallad, Kishore / Rosé, Carolyn P. / Black, Alan W.: "Building a better Indian English voice using 'more data'", 90-94.

Schröder, Marc / Hunecke, Anna: "Creating German unit selection voices for the MARY TTS platform from the BITS corpora", 95-100.

Voice Conversion

Ohta, Kumi / Ohtani, Yamato / Toda, Tomoki / Saruwatari, Hiroshi / Shikano, Kiyohiro: "Regression approaches to voice quality control based on one-to-many eigenvoice conversion", 101-106.

Tani, Daisuke / Ohtani, Yamato / Toda, Tomoki / Saruwatari, Hiroshi / Shikano, Kiyohiro: "An evaluation of many-to-one voice conversion algorithms with pre-stored speaker data sets", 107-112.

Cabral, Joao P. / Renals, Steve / Richmond, Korin / Yamagishi, Junichi: "Towards an improved modeling of the glottal source in statistical parametric speech synthesis", 113-118.

Mesbahi, Larbi / Barreaud, Vincent / Boeffard, Olivier: "GMM-based speech transformation systems under data reduction", 119-124.

Speech Synthesis by HMM

Yamagishi, Junichi / Kobayashi, Takao / Renals, Steve / King, Simon / Zen, Heiga / Toda, Tomoki / Tokuda, Keiichi: "Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV", 125-130.

Maia, Ranniery / Toda, Tomoki / Zen, Heiga / Nankaku, Yoshihiko / Tokuda, Keiichi: "An excitation model for HMM-based speech synthesis based on residual modeling", 131-136.

Liang, Hui / Qian, Yao / Soong, Frank K.: "An HMM-based bilingual (Mandarin-English) TTS", 137-142.

Roux, Justus C. / Visagie, Albert S.: "Data-driven approach to rapid prototyping Xhosa speech synthesis", 143-147.

Tone and Tone Accent Languages

Minematsu, Nobuaki / Kuroiwa, Ryo / Hirose, Keikichi / Watanabe, Michiko: "CRF-based statistical learning of Japanese accent sandhi for developing Japanese text-to-speech synthesis systems", 148-153.

Sun, Qinghua / Hirose, Keikichi / Minematsu, Nobuaki: "Two-step generation of Mandarin F0 contours based on tone nucleus and superpositional models", 154-159.

Chomphan, Suphattharachai / Kobayashi, Takao: "Design of tree-based context clustering for an HMM-based Thai speech synthesis system", 160-165.

Bachmann, Arne / Breuer, Stefan: "Development of a BOSS unit selection module for tone languages", 166-171.

Poster Session 2

Kain, Alexander B. / Santen, Jan P. H. van: "Unit-selection text-to-speech synthesis using an asynchronous interpolation model", 172-177.

Hertrich, Ingo / Ackermann, Hermann: "Modelling voiceless speech segments by means of an additive procedure based on the computation of formant sinusoids", 178-181.

Toth, Arthur R. / Black, Alan W.: "Using articulatory position data in voice transformation", 182-187.

Raj, Anand Arokia / Sarkar, Tanuja / Pammi, Satish Chandra / Yuvaraj, Santhosh / Bansal, Mohit / Prahallad, Kishore / Black, Alan W.: "Text processing for text-to-speech systems in Indian languages", 188-193.

Erro, Daniel / Moreno, Asunción / Bonafonte, Antonio: "Flexible harmonic/stochastic speech synthesis", 194-199.

Romportl, Jan / Kala, Jiří: "Prosody modelling in Czech text-to-speech synthesis", 200-205.

Zhao, Yong / Zhang, Chengsuo / Soong, Frank K. / Chu, Min / Xiao, Xi: "Measuring attribute dissimilarity with HMM KL-divergence for speech synthesis", 206-210.

Chevelu, Jonathan / Barbot, Nelly / Boeffard, Olivier / Delhay, Arnaud: "Lagrangian relaxation for optimal corpus design", 211-216.

Krul, Aleksandra / Damnati, Géraldine / Yvon, François / Boidin, Cédric / Moudenc, Thierry: "Adaptive database reduction for domain specific speech synthesis", 217-222.

Adell, Jordi / Bonafonte, Antonio / Escudero, David: "Statistical analysis of filled pauses' rhythm for disfluent speech synthesis", 223-227.

Gu, Wentao / Lee, Tan: "Quantitative analysis of F0 contours of emotional speech of Mandarin", 228-233.

Prosody Modelling

Shechtman, Slava: "Maximum-likelihood dynamic intonation model for concatenative text-to-speech system", 234-239.

Reichel, Uwe D.: "Data-driven extraction of intonation contour classes", 240-245.

Mishra, Taniya / Tucker Prud’hommeaux, Emily / Santen, Jan P. H. van: "Word accentuation prediction using a neural net classifier", 246-251.

Badino, Leonardo / Clark, Robert A. J.: "Issues of optionality in pitch accent placement", 252-257.

Inventory Construction

Aylett, Matthew P. / King, Simon: "Single speaker segmentation and inventory selection using dynamic time warping self organization and joint multigram mapping", 258-263.

Lambert, Tanya / Braunschweiler, Norbert / Buchholz, Sabine: "How (not) to select your voice corpus: random selection vs. phonologically balanced", 264-269.

Latacz, Lukas / Kong, Yuk On / Verhelst, Werner: "Unit selection synthesis using long non-uniform units and phonemic identity matching", 270-275.

Gruber, Martin / Tihelka, Daniel / Matousek, Jindrich: "Evaluation of various unit types in the unit selection approach for the Czech language using the Festival system", 276-281.

Keynote 2

Black, Alan W.: "The Blizzard Challenge: evaluating corpus-based speech synthesis techniques", 392 (abstract).


Moers, Donata / Wagner, Petra / Breuer, Stefan: "Assessing the adequate treatment of fast speech in unit selection speech synthesis systems for the visually impaired", 282-287.

Wolters, Maria / Campbell, Pauline / DePlacido, Christine / Liddell, Amy / Owens, David: "Making speech synthesis more accessible to older people", 288-293.


Zen, Heiga / Nose, Takashi / Yamagishi, Junichi / Sako, Shinji / Masuko, Takashi / Black, Alan W. / Tokuda, Keiichi: "The HMM-based speech synthesis system (HTS) version 2.0", 294-299.

Weiss, Christian / Oliveira, Luis C. / Paulo, Sergio / Mendes, Carlos / Figueira, Luis / Vala, Marco / Sequeira, Pedro / Paiva, Ana / Vogt, Thurid / Andre, Elisabeth: "eCIRCUS: building voices for autonomous speaking agents", 300-303.

Barbisch, Martin / Dogil, Grzegorz / Möbius, Bernd / Säuberlich, Bettina / Schweitzer, Antje: "Unit selection synthesis in the Smartweb project", 304-309.

Silen, Hanna / Helander, Elina / Koppinen, Konsta / Gabbouj, Moncef: "Building a Finnish unit selection TTS system", 310-315.

Poster Session 3

Marchand, Yannick / Adsett, Connie R. / Damper, Robert I.: "Evaluating automatic syllabification algorithms for English", 316-321.

Kominek, John / Schultz, Tanja / Black, Alan W.: "Voice building from insufficient data - classroom experiences with web-based language development tools", 322-327.

Cahill, Peter / Macek, Jan / Carson-Berndsen, Julie: "SVM based feature extraction in speech synthesis", 328-332.

Nankaku, Yoshihiko / Nakamura, Kenichi / Toda, Tomoki / Tokuda, Keiichi: "Spectral conversion based on statistical models including time-sequence matching", 333-338.

Klabbers, Esther / Mishra, Taniya / Santen, Jan P. H. van: "Analysis of affective speech recordings using the superpositional intonation model", 339-344.

Beux, Sylvain Le / Rilliard, Albert / d’Alessandro, Christophe: "Calliphony: a real-time intonation controller for expressive speech synthesis", 345-350.

Das Mandal, Shyamal Kumar / Datta, Asoke Kumar: "Epoch synchronous non-overlap-add (ESNOLA) method-based concatenative speech synthesis system for Bangla", 351-355.

Hansakunbuntheung, Chatchawarn / Kato, Hiroaki / Sagisaka, Yoshinori: "Syllable-based Thai duration model using multi-level linear regression and syllable accommodation", 356-361.

Gonzalvo, Xavier / Socoró, Joan Claudi / Iriondo, Ignasi / Monzo, Carlos / Martínez, Elisa: "Linguistic and mixed excitation improvements on a HMM-based speech synthesis for Castilian Spanish", 362-367.

Lyudovyk, Tetyana / Robeiko, Valentyna: "Inventory of intonation contours for text-to-speech synthesis", 368-373.


Bunnell, H. Timothy / Lilley, Jason: "Analysis methods for assessing TTS intelligibility", 374-379.

Langner, Brian / Black, Alan W.: "Understandable production of massive synthesis", 380-384.

Hooijdonk, Charlotte van / Commandeur, Edwin / Cozijn, Reinier / Krahmer, Emiel / Marsi, Erwin: "The online evaluation of speech synthesis using eye movements", 385-390.