ISCA Archive Eurospeech 1999 Sessions Booklet

6th European Conference on Speech Communication and Technology

Budapest, Hungary
5-9 September 1999

General Chair: Géza Gordos
doi: 10.21437/Eurospeech.1999




Speech Recognition - Acoustic Processing


Robust energy normalization using speech/nonspeech discriminator for German connected digit recognition
Rathinavelu Chengalvarayan

Acoustic pre-processing for optimal effectivity of missing feature theory
Johan de Veth, Bert Cranen, Febe de Wet, Louis Boves

Simultaneous recognition of multiple sound sources based on 3-D N-best search using microphone array
Panikos Heracleous, Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano

Down-sampling speech representation in ASR
Hynek Hermansky, Pratibha Jain

Comparison of time & frequency filtering and cepstral-time matrix approaches in ASR
Dusan Macho, Climent Nadeu, Peter Jancovic, Gregor Rozinaj, Javier Hernando

Syllable onset detection applied to the Portuguese language
Hugo Meinedo, Joao P. Neto, Luis B. Almeida

Decorrelated and liftered filter-bank energies for robust speech recognition
Kuldip K. Paliwal

Optimization algorithms for estimating modulation spectrum domain filters
Pau Paches-Leal, Richard C. Rose, Climent Nadeu

Efficient vector quantization using an n-path binary tree search algorithm
R. San-Segundo, R. Córdoba, J. Ferreiros, A. Gallardo, J. Colás, J. Pastor, Y. López

Neural network based optimal feature extraction for ASR
Narada D. Warakagoda, Magne H. Johnsen

A study of speech recognition for the elderly
Fumihiro Yato, Naomi Inoue, Kazuo Hashimoto

The analysis and application of a new endpoint detection method based on distance of autocorrelated similarity
Jie Zhu, Fei-li Chen


Articulatory Measurements and Modelling


Hyper-articulated speech: auditory and visual intelligibility
Denis Beautemps, Pascal Borel, Sébastien Manolios

Modeling of the vocal tract in three dimensions
Olov Engwall

Articulatory reduction in emotional speech
Miriam Kienast, Astrid Paeschke, Walter Sendlmeier

A trajectory formation model of articulatory movements using a multidimensional phonemic task
Tokihiko Kaburagi, Masaaki Honda, Takeshi Okadome

LPC-based inversion of the DRM articulatory model
Sacha Krstulovic

A vocal tract model using multi-line equivalent circuits
Nobuhiro Miki, Thoru Yokoyama, Takeshi Ohtani, Shinobu Masaki, Ikuhiro Shimada, Ichiro Fujimoto, Yuji Nakamura

Acoustic nature of the whisper
Masahiro Matsuda, Hideki Kasuya

Relations between utterance speed and articulatory movements
Takeshi Okadome, Tokihiko Kaburagi, Masaaki Honda

Design of hypercube codebooks for the acoustic-to-articulatory inversion respecting the non-linearities of the articulatory-to-acoustic mapping
Slim Ouni, Yves Laprie

A missing-word test comparison of human and statistical language model performance
Marie Owens, Anja Krüger, Paul Donnelly, F J Smith, Ji Ming

Estimating velum height from acoustics during continuous speech
Korin Richmond

On improving the decision algorithm for articulatory codebook search
C. Silva, S. Chennoukh, Isabel Trancoso

Extraction of articulators in x-ray image sequences
G. Thimm, J. Luettin

Effects of source-tract interaction in perception of nasality
António Teixeira, Francisco Vaz, José Carlos Príncipe

Perceiving anticipatory phonetic gestures in French
Béatrice Vaxelaire, Rudolph Sock, Véronique Hecker

Motor equivalence evidenced by articulatory modelling
Anne Vilain, Christian Abry, Pierre Badin






Speech Recognition - Confidence Measures 2


Accurate recognition of city names with spelling as a fall back strategy
Josef G. Bauer, Jochen Junkawitsch

Selective prosodic post-processing for improving recognition of French telephone numbers
Katarina Bartkova, Denis Jouvet

Improving rejection with semantic slot-based confidence scores
Eric I. Chang

The IBM conversational telephony system for financial applications
K. Davies, R. Donovan, M. Epstein, Martin Franz, Abraham Ittycheriah, E. E. Jan, J. M. LeRoux, David Lubensky, Chalapathy Neti, Mukund Padmanabhan, K. Papineni, Salim Roukos, A. Sakrajda, Jeffrey S. Sorensen, B. Tydlitat, T. Ward

Error spotting using syllabic fillers in spontaneous conversational speech recognition
Rachida El Méliani, Douglas O’Shaughnessy

Recognition of spelled names over the telephone and rejection of data out of the spelling lexicon
Denis Jouvet, Jean Monné

An utterance verification system based on subword modeling for a vocabulary independent speech recognition system
Myoung-Wan Koo, Sun-Jeong Lee

Use of a confidence measure based on frame level likelihood ratios for the rejection of incorrect data
Nicolas Moreau, Denis Jouvet

Variable preselection list length estimation using neural networks in a telephone speech hypothesis-verification system
J. Macías-Guarasa, J. Ferreiros, A. Gallardo, R. San-Segundo, Juan Manuel Pardo, L. Villarrubia

Speaker normalization and pronunciation variant modeling: helpful methods for improving recognition of fast speech
Thilo Pfau, Robert Faltlhauser, Günther Ruske

Automatic speech recognition using acoustic confidence conditioned language models
Richard C. Rose, Giuseppe Riccardi

Utilizing prosody for unconstrained morpheme recognition
Volker Strom, Henrik Heine

Modeling the prosody of hidden events for improved word recognition
Andreas Stolcke, Elizabeth Shriberg, Dilek Hakkani-Tür, Gökhan Tür

A comparison of word graph and n-best list based confidence measures
Frank Wessel, Klaus Macherey, Hermann Ney


Speech Analysis and Tools


C++ software environment for speech signal processing
Marcus M. Prätzas, Ulrich Balss, Herbert Reininger, Harald Wüst

Improvement of electrolaryngeal speech by introducing normal excitation information
Kun Ma, Pelin Demirel, Carol Espy-Wilson, Joel MacAuslan

Detecting user speech in barge-in over prompts using speaker identification methods
Abraham Ittycheriah, Richard J. Mammone

Speaker and channel-normalized set of formant parameters for telephone speech recognition
Boris Lobanov, T. Levkovskaya, Igor E. Kheidorov

Fuzzy segmentation of lip image using cluster analysis
Alan W.C. Liew, K. L. Sum, S. H. Leung, Wai H. Lau

Software to support research and development of spoken dialogue systems
Michael F. McTear

Analysis of sources of variability in speech
Sachin Kajarekar, Narendranath Malayath, Hynek Hermansky

Adaptive nonlinear prediction based on order statistics for speech signals
Tetsuya Shimamura, Haruko Hayakawa

Developing a voiced information retrieval system for the Portuguese language capable to handle both Brazilian and Portuguese spoken versions
M. N. Souza, E. J. Caprini, C. G. Machado, M. V. Ludolf, L. P. Calôba, J. M. Seixas, F. G. Resende, S. L. Netto, Diamantino R. Freitas, Joao Paulo Teixeira, C. Espain, V. Pera, F. Moreira

Real-time speech modeling using computationally efficient locally recurrent neural networks (CERNs)
John J. Soraghan, Amir Hussain, Ivy Shim

Effectiveness of KL-transformation in spectral delta expansion
M. Tokuhira, Y. Ariki





Speech Recognition - Search and Pronunciation Modelling


A two-stage speech recognition method with an error correction model
Yoshiharu Abe, Hiroyasu Itsui, Yuzo Maruta, Kunio Nakajima

Speech recognition with automatic punctuation
C. Julian Chen

Automatic modeling of pronunciation variations
Ellen Eide

Reducing search complexity in low perplexity tasks
Martin Franz, Miroslav Novak

A two-stage speech recognition method for information retrieval applications
Paolo Coletti, Marcello Federico

Multi-level decision trees for static and dynamic pronunciation models
Eric Fosler-Lussier

Modeling and efficient decoding of large vocabulary conversational speech
Michael Finke, Jürgen Fritsch, Detlef Koll, Alex Waibel

Evaluation of a segmentation system based on multi-level lattices
Jean-Luc Husson

The application of an improved DP match for automatic lexicon generation
Philip Hanna, Darryl Stewart, Ji Ming

Modeling trajectories in the HMM framework
Rukmini Iyer, Owen Kimball, Herbert Gish

Korean large vocabulary continuous speech recognition using pseudomorpheme units
Oh-Wook Kwon, Kyuwoong Hwang, Jun Park

Navigating German cities by spontaneous French queries
Harouna Kabré, Alexander Waibel

Generating alternative pronunciations from a dictionary
Filipp Korkmazskiy, Chin-Hui Lee

Finding consensus among words: lattice-based word error minimization
Lidia Mangu, Eric Brill, Andreas Stolcke

An efficient decoding method for real time speech recognition
Stefan Ortmanns, Wolfgang Reichl, Wu Chou

Recent improvements in voicemail transcription
Mukund Padmanabhan, G. Saon, S. Basu, Jing Huang, Geoffrey Zweig

Acoustics-based baseform generation with pronunciation and/or phonotactic models
Bhuvana Ramabhadran, Sabine Deligne, Abraham Ittycheriah

Improving recognition correct rate of important words in large vocabulary speech recognition
Yasuo Shirosaki, Hideaki Kikuchi, Katsuhiko Shirai

Pronunciation modeling by sharing Gaussian densities across phonetic models
Murat Saraclar, Harriet Nock, Sanjeev Khudanpur

One pass cross word decoding for large vocabularies based on a lexical tree search organization
Xavier L. Aubert









Speaker Recognition - Acoustic Features and Robustness


Experimental evaluation of text-independent speaker verification on laboratory and field test databases in the M2VTS project
Laurent Besacier, J. Luettin, G. Maitre, E. Meurville

Channel estimation and normalization by coherent spectral averaging for robust speaker verification
Rajesh Balchandran, Vidhya Ramanujam, Richard J. Mammone

Time-frequency principal components of speech: application to speaker identification
Ivan Magrin-Chagnolleau, Geoffrey Durou

Speaker recognition by means of a combination of linear and nonlinear predictive models
Marcos Faúndez-Zanuy

Feature vector transformation using independent component analysis and its application to speaker identification
Gil-Jin Jang, Seong-Jin Yun, Yung-Hwan Oh

The prototype model in speaker identification
Yizhar Lavner, Judith Rosenhouse, Isak Gath

A new cepstrum-based channel compensation method for speaker verification
T. F. Lo, M. W. Mak, K. K. Yiu

Speaker recognition based on discriminative feature extraction - optimization of mel-cepstral features using second-order all-pass warping function
Chiyomi Miyajima, Hideyuki Watanabe, Tadashi Kitamura, Shigeru Katagiri

Facing severe channel variability in forensic speaker verification conditions
Javier Ortega-Garcia, Santiago Cruz-Llanas, Joaquin Gonzalez-Rodriguez

Speaker and language recognition using speech codec parameters
Thomas F. Quatieri, E. Singer, R. B. Dunn, Douglas A. Reynolds, J. P. Campbell

Robust speaker verification in noisy conditions by modification of spectral time trajectories
Vidhya Ramanujam, Rajesh Balchandran, Richard J. Mammone

Toward parametric representation of speech for speaker recognition systems
Rivarol Vergin, Douglas O'Shaughnessy, Pierre Dumouchel

Text independent speaker identification using LSP codebook speaker models and linear discriminant functions
R. D. Zilca, Y. Bistritz





Speech Recognition - Multilinguality


Recognition of continuous Persian speech using a medium-sized vocabulary speech corpus
S. M. Ahadi

Multi-lingual speech recognition based on demi-syllable subword units
Tibor Fegyó, Péter Tatai

MAP-based cross-language adaptation augmented by linguistic knowledge: from English to Chinese
Pascale Fung, Chi Yuen Ma, Wai Kat Liu

Analysis of HMM models in alphabet letters recognition
Stefan Grocholewski

Tone recognition of Chinese continuous speech using tone critical segments
Keikichi Hirose, Jin-song Zhang

Phonetic state tied-mixture tone modeling for large vocabulary continuous Mandarin speech recognition
Tai-Hsuan Ho, Chin-Jung Liu, Herman Sun, Ming-Yi Tsai, Lin-Shan Lee

The clustering algorithm for the definition of multilingual set of context dependent speech models
Bojan Imperl, Bogomir Horvat

Study on tone classification of Chinese continuous speech in speech recognition system
Jian Liu, Xiaodong He, Fuyuan Mo, Tiecheng Yu

Decision tree-based triphones are robust and practical for Mandarin speech recognition
Yi Liu, Pascale Fung

Decision trees for inter-word context dependencies in Spanish continuous speech recognition tasks
K. López de Ipiña, A. Varona, I. Torres, L. J. Rodríguez

End points detection for noisy speech using a wavelet based algorithm
Amin M. Nassar, Nemat S. Abdel Kader, Amr M. Refat

Adaptation of acoustic models for multilingual recognition
C. Nieuwoudt, E. C. Botha

Recognition of non-native German speech with multilingual recognizers
Ulla Uebler, Manuela Boros


Systems, Architectures, Interfaces


Relational vs. object-oriented models for representing speech: a comparison using ANDOSL data
Toomas Altosaar, Bruce Millar, Martti Vainio

First experiences of the German SpeechDat-Car database collection in mobile environments
Christoph Draxler, Robert Grudszus, Stephan Euler, Klaus Bengler

OASIS - a framework for spoken language call steering
Mike Edgington, David Attwater, Peter Durston

VOCAPI - small standard API for command & control
Eike Gegenmantel

Standardised speech interfaces - key for objective evaluation of recognition accuracy
Christel Müller, Karsten Schröder

A medical rehabilitation diagnoses transcription method that integrates continuous and isolated word recognition
Shoichi Matsunaga, Yoshiaki Noda, Katsutoshi Ohtsuki, Eiji Doi, Tomio Itoh

Problems of creating a flexible e-mail reader for Hungarian
Géza Németh, Csaba Zainkó, Gábor Olaszy, Gábor Prószéky

Interactive, TTS supported speech message composer for large, limited vocabulary, but open information systems
Gábor Olaszy, Géza Németh, Péter Olaszi, Géza Gordos

ALE for speech: a translation prototype
Gerald Penn, Bob Carpenter

An integrated system for Spanish CSR tasks
L. J. Rodríguez, M. I. Torres, J. M. Alcaide, A. Varona, K. López de Ipiña, M. Penagarikano, G. Bordel

Use of speech synthesis in an application
Angelien Sanderman, Ellen Bosgoed, Hans de Graaff, Peter van Splunder

Text-to-audio-visual speech synthesis based on parameter generation from HMM
Masatsune Tamura, Shigekazu Kondo, Takashi Masuko, Takao Kobayashi

Authoring tools for speech synthesis using the Sable markup standard
Johan Wouters, Brian Rundle, Michael W. Macon


Speaker Recognition - Scoring and Decision


Dynamic weighting of the distortion sequence in text-dependent speaker verification
A. M. Ariyaeeinia, P. Sivakumaran, M. Pawlewski, M. J. Loomes

On the use of supra model information from multiple classifiers for robust speaker identification
Hakan Altincay, Mübeccel Demirekler

Missing features detection and handling for robust speaker verification
Mounir El-Maliki, Andrzej Drygajlo

High performance text-independent speaker recognition system based on voiced/unvoiced segmentation and multiple neural nets
Nikos Fakotakis, John Sirigos, George Kokkinakis

Similarity normalization method based on world model and a posteriori probability for speaker verification
Corinne Fredouille, Jean-François Bonastre, Teva Merlin

Text-independent speaker verification using virtual speaker based cohort normalization
Toshihiro Isobe, Jun-ichi Takahashi

Robust person verification based on speech and facial images
J. Luettin, S. Ben-Yacoub

A neural network-based text-dependent speaker verification system using suprasegmental features
M. Mathew, B. Yegnanarayana, R. Sundar

Modelling output probability distributions for enhancing speaker recognition
Jason Pelecanos, Sridha Sridharan

On the use of neural networks to combine utterance and speaker verification systems in a text-dependent speaker verification task
L. Rodríguez-Linares, C. García-Mateo, J. L. Alba-Castro

Genesys: a neural network model for speaker identification
B. Ruiz-Mezcua, R. Rodríguez-Galán, Luis A. Hernández-Gómez, Paloma Domingo-García, Enrique Bailly-Baillière Gutiérrez

Speaker verification with growing cell structures
Bogdan Sabac, Inge Gavat

Environment adaptation and long term parameters in speaker identification
Chakib Tadj, Pierre Dumouchel, Mohamed Mihoubi, Pierre Ouellet

Speaker identification using subband HMMs
K. Yoshida, K. Takagi, K. Ozeki

A priori threshold determination for phrase-prompted speaker verification
W. D. Zhang, K. K. Yiu, M. W. Mak, C. K. Li, M. X. He











Dialogue 2


Knowledge collection for natural language spoken dialog systems
Egbert Ammicht, Allen Gorin, Tirso Alonso

Improving discourse management in TRIPS-98
Donna K. Byron

Speech act modeling in a spoken dialogue system using fuzzy hidden Markov model and Bayes' decision criterion
Chung-Hsien Wu, Gwo-Lang Yan, Chien-Liang Lin

Task hierarchies representing sub-dialogs in speech dialog systems
Ute Ehrlich

Effects of system barge-in responses on user impressions
Jun-Ichi Hirasawa, Mikio Nakano, Takeshi Kawabata, Kiyoaki Aikawa

A new word-confidence threshold technique to enhance the performance of spoken dialogue systems
R. López-Cózar, Antonio J. Rubio, P. García, J. C. Segura

Confirmation strategies to improve correction rates in a telephonic inquiry dialogue system
C. Alexia Lavelle, Martine de Calmés, Guy Pérennou

Mathematical analysis of dialogue control strategies
Yasuhisa Niimi, Takuya Nishimoto

Processing of anaphoric and elliptic sentences in a spoken dialog system
Jana Ocelikova, Vaclav Matousek

Free-flow dialog management using forms
K. A. Papineni, Salim Roukos, T. Ward

Towards the detection and description of textual meaning indicators in spontaneous conversations
Klaus Ries

Dialogue management in the Dutch ARISE train timetable information system
Janienke Sturm, Els den Os, Lou Boves

Problem spotting in human-machine interaction
Emiel Krahmer, Marc Swerts, Mariet Theune, Mieke Weegels

Consistent dialogue across concurrent topics based on an expert system model
Bor-shen Lin, Hsin-min Wang, Lin-shan Lee


Speech Coding


Secondary codebook storage quantisation
Thomas M. Chapman, C. S. Xydeas

Pseudo-articulatory representations: promise, progress and problems
W. H. Edmondson, D. J. Iskra, P. Kienzle

A 1.7 kbps waveform interpolation speech coder using decomposition of pitch cycle waveform
Ge Gao, P. C. Ching

Enhanced analysis-by-synthesis waveform interpolative coding at 4 kbps
Oded Gottesman, Allen Gersho

Joint source-channel decoding by channel-coded optimal estimation (CCOE) for a CELP speech codec
Norbert Görtz

Analysis-by-synthesis low-rate multimode harmonic speech coding
Chunyan Li, Allen Gersho, Vladimir Cuperman

Variable length coding of transformed LSF coefficients
László Lois

Low bit-rate speech coding using quantization of variable length segments
R. Mayrench, D. Malah

Low delay analysis/synthesis schemes for joint speech enhancement and low bit rate speech coding
Rainer Martin, Hong-Goo Kang, Richard V. Cox

A comparative study of several ADPCM schemes with linear and nonlinear prediction
Oscar Oliva, Marcos Faúndez-Zanuy

Segmental feature extraction and coding for speech synthesis
H. Ohmura, K. Tanaka

Backward adaptive RBF-based hybrid predictors for CELP-type coders at medium bit-rates
C. Peláez-Moreno, F. Díaz-de-María

An improved speech model with allowance for time-varying pitch harmonic amplitudes and frequencies in low bit-rate MBE coders
Valentin V. Sercov, Alexander A. Petrovsky

Sparse vector linear prediction matrices with multidiagonal structure
Davor Petrinovic, Davorka Petrinovic

Source-dependent variable rate speech coding below 3 kbps
M. Stefanovic, A. Kondoz

A novel speech coding approach based on half-wave vector quantization
Xiaoping Chen, Yantao Song, Tiecheng Yu

Speech coding using mixture of Gaussians polynomial model
Parham Zolfaghari, Tony Robinson




Speech Recognition - Language Modelling


Language modeling for broadcast news transcription
Gilles Adda, Michèle Jardino, Jean-Luc Gauvain

Large span statistical language models: application to homophone disambiguation for large vocabulary speech recognition in French
Frédéric Béchet, Alexis Nasr, Thierry Spriet, Renato de Mori

Language modelling and spoken dialogue systems - the ARISE experience
P. Baggia, A. Kellner, Guy Pérennou, C. Popovici, Janienke Sturm, Frank Wessel

Language model level vs. lexical level for modeling pronunciation variation in a French CSR
Laure Brieussel-Pousse, Guy Pérennou

Characteristics of Chinese language models for large vocabulary telephone speech
Roger H.Y. Leung, Chi-Yan Choy, Hong C. Leung

A new based distance language model for a dictation machine: application to MAUD
D. Langlois, K. Smaïli

Using various language model smoothing techniques for the transcription of a weather forecast broadcasted by the Czech radio
Ludek Müller, Josef Psutka

Studies in acoustic training and language modeling using simulated speech data
Don McAllaster, Larry Gillick

Language model adaptation using minimum discrimination information
Wolfgang Reichl

Automatic and manual clustering for large vocabulary speech recognition: a comparative study
K. Smaïli, A. Brun, I. Zitouni, Jean-Paul Haton

Learning of stochastic context-free grammars by means of estimation algorithms
Joan-Andreu Sánchez, José-Miguel Benedí

Part-of-speech n-gram and word n-gram fused language model
Hirofumi Yamamoto, Yoshinori Sagisaka

Linguistic features for whole sentence maximum entropy language models
Xiaojin Zhu, Stanley F. Chen, Ronald Rosenfeld

Variable-length sequence language model for large vocabulary continuous dictation machine
I. Zitouni, J. F. Mari, K. Smaïli, Jean-Paul Haton

Using detailed linguistic structure in language modelling
Ruiqiang Zhang, Ezra Black, Andrew Finch


Prosody - Study of Prosody for Speech Synthesis


Decision tree micro-prosody structures for text to speech synthesis
Aimin Chen, Shu Lian Wong, Saeed Vaseghi, Charles Ho

Automatic modeling of duration in a Spanish text-to-speech system using neural networks
R. Córdoba, J. A. Vallejo, J. M. Montero, J. Gutierrez-Arriola, M. A. López, Juan Manuel Pardo

Objective methods for evaluating synthetic intonation
Robert A.J. Clark, Kurt E. Dusterhoff

Using decision trees within the tilt intonation model to predict F0 contours
Kurt E. Dusterhoff, Alan W. Black, Paul Taylor

Levels of prosodic representation in spoken discourse: an empirical approach
Richard Esposito, Li-chiung Yang

Segmental duration modelling in a text-to-speech system for the Galician language
Xavier Fernández-Salgado, Eduardo R. Banga

The symbolic coding of segmental duration and tonal alignment: an extension to the INTSINT system
Daniel Hirst

Training an application-dependent prosodic model: corpus, model and evaluation
Yann Morlec, Gérard Bailly, Véronique Aubergé

Farsi language prosodic structure, research and implementation using a speech synthesizer
H. Sheikhzadeh, A. Eshkevari, M. Khayatian, R. Sadigh, S. M. Ahadi

Acoustical characterisation of the accented syllable in Portuguese, a contribution to the naturalness of speech synthesis
Joao Paulo Teixeira, Elisabete Rosa Paulo, Diamantino Freitas, Maria da Graca Pinto

Analysis and synthesis of the four tones in connected speech of the standard Chinese based on a command-response model
Changfu Wang, Hiroya Fujisaki, Sumio Ohno, Tomohiro Kodama

A profile of the discourse and intonational structures of route descriptions
Sandra Williams, Catherine I. Watson





Speech Generation and Synthesis - Prosody


Predicting gradient F0 variation: pitch range and accent prominence
Ivan Bulyko, Mari Ostendorf

CART-based duration modeling using a novel method of extracting prosodic features
Paul Deans, Andrew Breen, Peter Jackson

A primary study on the randomness control of the prosodic boundary index for natural synthetic speech
Ki-Wan Eom, Jin-Young Kim, Sun-Mi Kim

On a hybrid time domain-LPC technique for prosody superimposing used for speech synthesis
Attila Ferencz, István Nagy, Tünde-Csilla Kovács, Teodora Ratiu, Maria Ferencz

Multilingual prosody modelling using cascades of regression trees and neural networks
J. W. A. Fackrell, H. Vereecken, J.-P. Martens, Bert Van Coile

An efficient speaker adaptation method for TTS duration model
Wentao Gu, Chilin Shih, Jan P.H. van Santen

Child-directed speech synthesis: evaluation of prosodic variation for an educational computer program
David House, Linda Bell, Kjell Gustafson, Linn Johansson

Representation and processing of linguistic structures for an all-prosodic synthesis system using XML
Mark Huckvale

A study on a pitch alteration by using the formant and phase compensation technique
Won Park, Hyung-Bin Park, Myung-Jin Bae

Micro-prosodic control in Cantonese text-to-speech synthesis
Tan Lee, Helen M. Meng, Wai H. Lau, W. K. Lo, P. C. Ching

Exploring the naturalness of several German high-quality-text-to-speech systems
Hansjörg Mixdorff, Dieter Mehnert

Detecting accent sandhi in Japanese using a superpositional F0 model
A. Sakurai, Hiromichi Kawanami, Keikichi Hirose

Focus detection by comparison of speech waveforms
Satoshi Kitagawa, Nick Campbell

An advanced intonation model for synthesis
Mark Tatham, Eric Lewis, Katherine Morton

A new F0 modification algorithm by manipulating harmonics of magnitude spectrum
Satoshi Takano, Masanobu Abe

A mixed strategy approach to Spanish prosody
Juan Manuel Villar Navarro, Eduardo López Gonzalo, José Relaño Gil






Speech Understanding - Miscellaneous Topics


Linguistic phrase spotting in a simple application spoken dialogue system
Manuela Boros, Paul Heisterkamp

Learning of domain dependent knowledge in semantic networks
F. Deinzer, J. Fischer, U. Ahlrichs, Elmar Nöth

Combining words and prosody for information extraction from speech
Dilek Hakkani-Tür, Gökhan Tür, Andreas Stolcke, Elizabeth Shriberg

Error correction translation using text corpora
Kai Ishikawa, Eiichiro Sumita

Efficient sentence disambiguation by preferred constituent order
S. Kronenberg, K. Skuplik

Identifying linguistic segmentations in Chinese spoken dialogue
Yue-Shi Lee, Hsin-Hsi Chen

Error recovery for robust language understanding in spoken dialogue systems
Tung-Hui Chiang, Yi-Chung Lin

A monolingual semantic decoder based on word sense disambiguation for mixed language understanding
Xiaohu Liu, Pascale Fung, Chi Shun Cheung

To believe is to understand
Helen M. Meng, Wai Lam, Carmen Wai

A hybrid approach to spoken dialogue understanding: prosody, statistics and partial parsing
Elmar Nöth, Jürgen Haas, Volker Warnke, Florian Gallwitz, Manuela Boros

Portable speech interpreter which has voice input and sophisticated correction functions
Yasunari Obuchi, Atsuko Koizumi, Yoshinori Kitahara, Jun'ichi Matsuda, Toshihisa Tsukada

Categorical understanding using statistical ngram models
Alexandros Potamianos, Giuseppe Riccardi, Shrikanth Narayanan

Detection and correction of speech repairs in word lattices
Jörg Spilker, Hans Weber, Günther Görz

Connectionist language models for speech understanding: the problem of word order variation
Igor Schadle, Jean-Yves Antoine, Daniel Memmi

Semi-automatic acquisition of domain-specific semantic structures
Kai-Chung Siu, Helen M. Meng

Transformation into language processing units by dividing and connecting utterance units
Toshiyuki Takezawa

Learning a lightweight robust deterministic parser
Aboy Wong, Dekai Wu

An information-based method for selecting feature types for word prediction
Dekai Wu, Zhifang Sui, Jun Zhao

A robust parser for spoken language understanding
Ye-Yi Wang


Speech Generation and Synthesis - Systems, Linguistic Processing


Aiuruete: a high-quality concatenative text-to-speech system for Brazilian Portuguese with demisyllabic analysis-based units and a hierarchical model of rhythm production
Plínio A. Barbosa, Fábio Violaro, Eleonora C. Albano, Flávio Simoes, Patrícia Aquino, Sandra Madureira, Edson Francozo

A parser-based text preprocessor for Romanian language TTS synthesis
Dragos Burileanu, Claudius Dan, Mihai Sima, Corneliu Burileanu

Nparse - a shallow n-gram-based grammatical-phrase parser
Alice Carlberger

A language-independent probabilistic model for automatic conversion between graphemic and phonemic transcription of words
Evangelos Dermatas, George Kokkinakis

Acquisition of an extensive rule set for Slovene grapheme-to-allophone transcription
Jerneja Gros, F. Mihelic

Voice conversion between UK and US accented English
Ching-Hsiang Ho, Saeed Vaseghi, Aimin Chen

Development of speech design tool "SESIGN99" to enhance synthesized speech
Hideyuki Mizuno, Masanobu Abe, Shin'ya Nakajima

Automation of the training procedures for neural networks performing multi-lingual grapheme to phoneme conversion
Horst-Udo Hain

Parsing Hungarian sentences in order to determine their prosodic structures in a multilingual TTS system
Ilona Koutny

Text-to-speech synthesis of Estonian
Meelis Mihkla, Arvo Eek, Einar Meister

Development of an emotional speech synthesiser in Spanish
J. M. Montero, J. Gutiérrez-Arriola, J. Colás, J. Macías-Guarasa, E. Enríquez, Juan Manuel Pardo

S5: the SQEL Slovene speech synthesis system
N. Pavesic, Jerneja Gros

A multilingual text processing engine for the PAPAGENO text-to-speech synthesis system
Matej Rojc, Janez Stergar, Ralph Wilhelm, Horst-Udo Hain, Martin Holzapfel, Bogomir Horvat

Toshiba English text-to-speech synthesizer (TESS)
Chang K. Suh, Takehiko Kagoshima, Masahiro Morita, Shigenobu Seto, Masami Akamine

Towards the generation of French phonetic inflected forms
Frédérique Sannier, Véronique Aubergé

Canadian French text-to-speech synthesis: modeling an optimal set of realizations for dialect markers
Evelyne Tzoukermann, Lucie Ménard, Marise Ouellet

Machine learning of word pronunciation: the case against abstraction
Bertjan Busser, Walter Daelemans, Antal van den Bosch







Speech Generation and Synthesis - Acoustic Synthesis and Units


Sinusoidal representation and auditory model-based parametric matching and smoothing and its application in speech analysis/synthesis
Oscar C. Au, Wanggen Wan, Cyan L. Keung, Chi H. Yim

Choose the best to modify the least: a new generation concatenative synthesis system
Marcello Balestri, Alberto Pacchiotti, Silvia Quazza, Pier Luigi Salza, Stefano Sandri

Selection of waveform units for corpus-based Mandarin speech synthesis based on decision trees and prosodic modification costs
Fu-chiang Chou, Chiu-yu Tseng, Lin-shan Lee

Improving quality in a speech synthesizer based on the MBROLA algorithm
B. Etxebarria, I. Hernáez, I. Madariaga, E. Navas, J. C. Rodríguez, R. Gándara

A novel model TD-PSPTP for speech synthesis
Yan Huang, Bo Xu

Detection of non-stationarity in speech signals and its application to time-scaling
David Kapilow, Yannis Stylianou, Juergen Schroeter

A v-CV waveform based speech synthesis using global minimization of pitch conversion and concatenation distortion in v-CV unit sequence
Takao Koyama, Jun-ichi Takahashi

Stable speech synthesis using recurrent radial basis functions
Iain Mann, Steve McLaughlin

Efficient weight training for selection based synthesis
Yoram Meron, Keikichi Hirose

Speech synthesis using HMM-based acoustic unit inventory
Jindrich Matousek

An enhanced ABS/OLA sinusoidal model for waveform synthesis in TTS
Michael W. Macon, Mark A. Clements

High vowel /i y u/ in Canadian and continental French: an analysis for a TTS system
Marise Ouellet, Evelyne Tzoukermann, Lucie Ménard

Speech production based on the mel-frequency cepstral coefficients
Zbyněk Tychtl, Josef Psutka

Exploiting improved parameter smoothing within a hybrid concatenative/LPC speech synthesizer
Erhard Rank

Synchronization of speech frames based on phase data with application to concatenative speech synthesis
Yannis Stylianou

Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis
Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura


Speech and Noise 1


A CASA-labelling model using the localisation cue for robust cocktail-party speech recognition
Hervé Glotin, Frédéric Berthommier, Emmanuel Tessier

Noise-invariant representation for speech signals
Aruna Bayya, B. Yegnanarayana

Natural-quality background noise coding using residual substitution
Khaled El-Maleh, Peter Kabal

Microphone array design for robust speech acquisition and recognition
Julian Fernández, Eduardo Lleida, Enrique Masgrau

Study of the influence of noise pre-processing on the performance of a low bit rate parametric speech coder
Gwénaël Guilmin, Régine Le Bouquin-Jeannès, Philippe Gournay

MLP network for enhancement of noisy MFCC vectors
Hemmo Haverinen, Petri Salmela, Juha Häkkinen, Mikko Lehtokangas, Jukka Saarinen

Hands-free voice activation in noisy car environment
J. Iso-Sipilä, K. Laurila, Ramalingam Hariharan, Olli Viikki

A wavelet denoising technique to improve endpoint detection in adverse conditions
Lamia Karray, Emmanuel Polard

Speech enhancement for linear-predictive-analysis-by-synthesis coders
Marcin Kuropatwinski, Dieter Leckschat, Kristian Kroschel, Andrzej Czyzewski, Chaz Hales

Robust HMM to variation of noisy environments based on variance extension of noise models
Hiroshi Matsumoto, Hiroaki Ubukata

The fourth-order cumulant of speech signals with application to voice activity detection
Elias Nemer, Rafik Goubran, Samy Mahmoud

The dependence of feature vectors under adverse noise
Woei-Chyang Shieh, Sen-Chia Chang

Speech detection and SNR prediction basing on amplitude modulation pattern recognition
Jürgen Tchorz, Birger Kollmeier

Fast active noise control for robust speech acquisition
Luis Vicente, Stephen J. Elliott, Enrique Masgrau

Missing data theory, spectral subtraction and signal-to-noise estimation for robust ASR: an integrated study
Ascension Vizinho, Phil Green, M. Cooke, Ljubomir Josifovski

Single channel speech enhancement using principal component analysis and MDL subspace selection
Rolf Vetter, Nathalie Virag, Philippe Renevey, Jean-Marc Vesin




Speech Recognition - Adaptation


Prosodic effects on segmental durations in Greek
Antonis Botinis, Marios Fourakis, Irini Prinou

Within-utterance correlation for speech recognition
Mats Blomberg

Techniques for robust speech recognition in the car environment
Philippe Gelin, Jean-Claude Junqua

An on-line acoustic compensation technique for robust speech recognition
Diego Giuliani

Using adaptive signal limiter together with noise-robust techniques for noisy speech recognition
Wei-Wen Hung, Hsiao-Chuan Wang

A robust environment-effects suppression training algorithm for adverse Mandarin speech recognition
Wei-Tyng Hong, Sin-Horng Chen

Robust speaker adaptation of continuous density HMMs using multilayer perceptron network
Mikko Harju, Petri Salmela, Olli Viikki, Mikko Lehtokangas, Jukka Saarinen

Regression class selection and speaker adaptation with MLLR in Mandarin continuous speech recognition
Chengrong Li, Jingdong Chen, Bo Xu

Regression transformation of prior means for speaker adaptation
Guoqiang Li, Limin Du, Ziqiang Hou

Linguistic tree based maximum likelihood model interpolation
Liu Feng, Chi-wei Che, Peng Yu, Zuoying Wang

Model-based speaker normalization methods for speech recognition
Masaki Naito, Li Deng, Yoshinori Sagisaka

Maximum likelihood eigenspace and MLLR for speech recognition in noisy environments
Patrick Nguyen, Christian Wellekens, Jean-Claude Junqua

A study of speaker adaptation for speaker independent speech recognition method using phoneme similarity vector
Yoshio Ono, Maki Yamada, Masakatsu Hoshimi

An investigation into vocal tract length normalisation
L. F. Uebel, P. C. Woodland

Adaptation to environment and speaker using maximum likelihood neural networks
Zong Suk Yuk, James Flanagan, Mahesh Krishnamoorthy, Krishna Dayanidhi

Corrective training for speaker adaptation
Xiuyang Yu, Wayne Ward

A robust speaker-independent CPU-based ASR system
R. Obradovic, D. Pekar, S. Krco, V. Delic, V. Senk


Enhancements, Echo Cancellation, and Quality Measures


Delay estimation for transform domain acoustical echo cancellation
Rabih Abouchakra, Peter Kabal

Noise reduction using perceptual spectral change
C. Beaugeant, Pascal Scalart

Intelligibility improvements using diverse sub-band processing applied to noisy speech
Amir Hussain, Douglas R. Campbell

Recognizing simultaneous speech: a genetic algorithm approach
Athanasios Koutras, Evangelos Dermatas, George Kokkinakis

Speech enhancement system for hands-free telephone based on the psychoacoustically motivated filter bank with allpass frequency transformation
Krzysztof Bielawski, Alexander A. Petrovsky

Speech enhancement using a multi-microphone sub-band adaptive Griffiths-Jim noise canceller
P. W. Shields, Douglas R. Campbell

Qualiphone-a: a perceptual speech quality evaluation system for analog mobile networks
M. Szarvas, T. Fegyó, P. Tatai, Géza Gordos

Speech enhancement using nonlinear microphone array under nonstationary noise conditions
Hiroshi Saruwatari, Shoji Kajita, Kazuya Takeda, Fumitada Itakura

Auditory masking threshold estimation for broadband noise sources with application to speech enhancement
Ruhi Sarikaya, John H. L. Hansen

Segregation of vowel in background noise using the model of segregating two acoustic sources based on auditory scene analysis
Masashi Unoki, Masato Akagi

Analysis and on-line detection of audible distortions in GSM telephony
Christophe Veaux, Pascal Scalart, André Gilloire

A parameter-based 2-talker detection apparatus for echo cancellation
Wen Rong Ru, Shih-Chen Lin, Po-Cheng Chen, Chun-Hung Kuo

Co-channel speech separation in the presence of correlated and uncorrelated noises
Kuan-Chieh Yen, Jun Huang, Yunxin Zhao


Speech and Noise 2


Speech enhancement using a mixture-maximum model
David Burshtein, Sharon Gannot

Concurrent speakers separation through binaural processing of stereo recordings
Joaquin Gonzalez-Rodriguez, Santiago Cruz-Llanas, Javier Ortega-Garcia

Spectral subtraction with adaptive averaging of the gain function
Harald Gustafsson, Sven Nordholm, Ingvar Claesson

A reliability criterion for time-frequency labeling based on periodicity in an auditory scene
François Gaillard, Frédéric Berthommier, Gang Feng, Jean-Luc Schwartz

Broadband noise cancellation systems: new approach to working performance optimization
Serguei Koval, Mikhail Stolbov, Mikhail Khitrov

Noise subtraction with parametric recursive gain curves
Klaus Linhard, Tim Haulick

Performance comparison of several adaptive schemes for microphone array beamforming
Enrique Masgrau, Luis Aguilar, Eduardo Lleida

An objective distortion estimator for hearing aids and its application to noise reduction
Mitsunori Mizumachi, Masato Akagi

Speech enhancement using fourth-order cumulants and time-domain optimal filters
Elias Nemer, Rafik Goubran, Samy Mahmoud

Missing feature theory and probabilistic estimation of clean speech components for robust speech recognition
Philippe Renevey, Andrzej Drygajlo

Distortion effects of several cumulant-based Wiener filtering algorithms
Josep M. Salavedra, Xavier Bou

Combined noise suppression system for monaural cochlear implants
Milan Svoboda, Pavel Sovka, Petr Pollák

Objective prediction of speech intelligibility at high ambient noise levels using the speech transmission index
Sander J. van Wijngaarden, Herman J. M. Steeneken

Noise-regularized adaptive filtering for speech enhancement
Eric A. Wan, Rudolph van der Merwe

Speech enhancement using Karhunen-Loeve transformation and Wiener filtering in critical bands
F. Zarubin, A. Kovtonyuk, K. Zadiraka






Speech and Noise 3


A robust isolated word recognizer for highly non-stationary environments. recognition results
A. Álvarez, R. Martínez, P. Gómez, V. Nieto, M. M. Pérez

Sequential bias compensation for robust speech recognition
Mohamed Afify

Use of simulated data for robust telephone speech recognition
Tarcisio Coianiz, Daniele Falavigna, Roberto Gretter, Marco Orlandi

On the use of time alignments for noisy speech recognition
Y. Hauptman, Y. Bistritz

Improved feature vector normalization for noise robust connected speech recognition
Juha Häkkinen, J. Suontausta, Ramalingam Hariharan, M. Vasilache, K. Laurila

State based imputation of missing data for robust speech recognition and speech enhancement
Ljubomir Josifovski, Martin Cooke, Phil Green, Ascension Vizinho

A comparison of two strategies for ASR in additive noise: missing data and spectral subtraction
Christopher Kermorvant, Andrew Morris

A comparison of techniques for tone compensation in payphone-based speech recognition
Ben Milner, Mark Farrell

Front-end improvements to reduce stationary & variable channel and noise distortions in continuous speech recognition tasks
Xavier Menéndez-Pidal, Ruxin Chen, Duanpei Wu, Mick Tanaka

Speech recognition in noisy reverberant rooms using a frequency domain blind deconvolution method
G. Nokas, E. Dermatas

Optimization of a speech recognizer for aircraft environments
Volker Schless, Fritz Class, Peter Sandl

Temporal constraints in Viterbi alignment for speech recognition in noise
Nestor Becerra Yoma, Lee Luan Ling, Sandra Dotto Stump

HMM composition of segmental unit input HMM for noisy speech recognition
Kazumasa Yamamoto, Seiichi Nakagawa

Robust connected word speech recognition using weighted Viterbi algorithm and context-dependent temporal constraints
Nestor Becerra Yoma, Lee Luan Ling, Sandra Dotto Stump

Liftered forward masking procedure for robust digits recognition
Kaisheng Yao, Bertram Shi, Pascale Fung, Zhigang Cao

Channel identification and spectrum estimation for robust automatic speech recognition
Yunxin Zhao


