doi: 10.21437/ICSLP.2002
Evaluation of a noise-robust DSR front-end on Aurora databases
Duncan Macho, Laurent Mauuary, Bernhard Noé, Yan Ming Cheng, Doug Ealey, Denis Jouvet, Holly Kelleher, David Pearce, Fabien Saadoun
Qualcomm-ICSI-OGI features for ASR
Andre Adami, Lukás Burget, Stephane Dupont, Hari Garudadri, Frantisek Grezl, Hynek Hermansky, Pratibha Jain, Sachin Kajarekar, Nelson Morgan, Sunil Sivadas
Improving word accuracy with Gabor feature extraction
Michael Kleinschmidt, David Gelbart
Evaluation of SPLICE on the Aurora 2 and 3 tasks
Jasha Droppo, Li Deng, Alex Acero
Performance of discriminatively trained auditory features on Aurora2 and Aurora3
Brian Mak, Yik-Cheung Tam
Feature extraction combining spectral noise reduction and cepstral histogram equalization for robust ASR
José C. Segura, M.C. Benítez, Ángel de la Torre, Antonio J. Rubio
Bell Labs approach to Aurora evaluation on connected digit recognition
Jingdong Chen, Dimitris Dimitriadis, Hui Jiang, Qi Li, Tor André Myrvoll, Olivier Siohan, Frank K. Soong
Algorithms for distributed speech recognition in a noisy automobile environment
Hong Kook Kim, Richard C. Rose
Quantile based histogram equalization for online applications
Florian Hilger, Sirko Molau, Hermann Ney
Frontend post-processing and backend model enhancement on the Aurora 2.0/3.0 databases
Chia-Ping Chen, Karim Filali, Jeff A. Bilmes
HMM composition-based rapid model adaptation using a priori noise GMM adaptation evaluation on Aurora2 corpus
Masaki Ida, Satoshi Nakamura
Data-driven temporal filters obtained via different optimization criteria evaluated on Aurora2 database
Jeih-weih Hung, Lin-shan Lee
Efficient additive and convolutional noise reduction procedures
Bojan Kotnik, Damjan Vlaj, Zdravko Kacic, Bogomir Horvat
Progress with the Philips continuous ASR system on the Aurora 2 noisy digits database
Markus Lieb, Alexander Fischer
An environment compensated minimum classification error training approach and its evaluation on Aurora2 database
Jian Wu, Qiang Huo
Evaluation of a noise adaptive speech recognition system on the Aurora 3 database
Kaisheng Yao, Dong-Lai Zhu, Satoshi Nakamura
Distributed speech recognition over IP networks on the Aurora 3 database
Laura Docío-Fernández, Carmen García-Mateo
Evaluation of noisy speech recognition based on noise reduction and acoustic model adaptation on the Aurora2 tasks
M. Fujimoto, Yasuo Ariki
Improvements to the IBM Aurora 2 multi-condition system
George Saon, Juan M. Huerta
Distributed speech recognition using noise-robust MFCC and TRAPS-estimated manner features
Pratibha Jain, Hynek Hermansky, Brian Kingsbury
Evaluation of spectral subtraction with smoothing of time direction on the Aurora 2 task
Norihide Kitaoka, Seiichi Nakagawa
Evaluation of noise robust features on the Aurora databases
Xiaodong Cui, Markus Iseli, Qifeng Zhu, Abeer Alwan
Computationally efficient noise compensation for robust automatic speech recognition assessed under the Aurora 2/3 framework
Nicholas W. D. Evans, John S. Mason
Mel-scaled wavelet filter based features for noisy unvoiced phoneme recognition
O. Farooq, S. Datta
Filter bank subtraction for robust speech recognition
Kazuo Onoe, Hiroyuki Segi, Takeshi Kobayakawa, Shoei Sato, Toru Imai, Akio Ando
Low cost duration modelling for noise robust speech recognition
Andrew C. Morris, Simon Payne, Hervé Bourlard
A comparative study of approximations for parallel model combination of static and dynamic parameters
Yifan Gong
Noise estimation for efficient speech enhancement and robust speech recognition
Petr Motícek, Lukás Burget
The 2001 GMTK-based SPINE ASR system
Özgür Çetin, Harriet J. Nock, Katrin Kirchhoff, Jeff A. Bilmes, Mari Ostendorf
Using adaptive signal limiter together with weighting techniques for noisy speech recognition
Wei-Wen Hung
Spectral subtraction in noisy environments applied to speaker adaptation based on HMM sufficient statistics
Shingo Yamade, Kanako Matsunami, Akira Baba, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano
Robust speech recognition against short-time noise
Manhung Siu, Yu-Chung Chan
Word endpoints detection in the presence of non-stationary noise
M. Toma, A. Lodi, R. Guerrieri
Comparison and combination of RASTA-PLP and FF features in a hybrid HMM/MLP speech recognition system
Pere Pujol Marsal, Susagna Pol Font, Astrid Hagen, Hervé Bourlard, Climent Nadeu
Robust MMSE-FW-LAASR scheme at low SNRs
Tao Xu, Zhigang Cao
Robust speech recognition using a voiced-unvoiced feature
András Zolnay, Ralf Schlüter, Hermann Ney
Accumulated Kullback divergence for analysis of ASR performance in the presence of noise
Febe de Wet, Johan de Veth, Bert Cranen, Lou Boves
A hybrid HMM/TRAPS model for robust voice activity detection
Brian Kingsbury, Pratibha Jain, Andre Adami
Run time information fusion in speech recognition
Chengyi Zheng, Yonghong Yan
Using observation uncertainty in HMM decoding
Jon A. Arrowood, Mark A. Clements
Combining a Gaussian mixture model front end with MFCC parameters
M. N. Stuttle, M. J. F. Gales
Noise from corrupted speech log mel-spectral energies
Jasha Droppo, Alex Acero, Li Deng
Improving the role of unvoiced speech segments by spectral normalisation in robust speech recognition
Carlos Lima, Luís B. Almeida, João L. Monteiro
Building an ASR system for noisy environments: SRI's 2001 SPINE evaluation system
Venkata Ramana Rao Gadde, Andreas Stolcke, Dimitra Vergyri, Jing Zheng, Kemal Sönmez, Anand Venkataraman
Evidence for efficiency in vowel production
Rob J. J. H. van Son, Louis C. W. Pols
Stochastic suprasegmentals: relationship between the spectral characteristics of vowels, redundancy and prosodic structure
Matthew P. Aylett
Motor specifications of a baby robot via the analysis of infants' vocalizations
J. Serkhane, Jean-Luc Schwartz, Louis Jean Boë, B. Davis, C. Matyear
Oral-laryngeal control patterns for fricatives in 5-year-olds and adults
Laura L. Koenig, Jorge C. Lucero
French nasal vowels: acoustic and articulatory properties
Véronique Delvaux, Thierry Metens, Alain Soquet
Maximum likelihood estimation of eigenvoices and residual variances for large vocabulary speech recognition tasks
P. Kenny, G. Boulianne, Pierre Dumouchel
Rapid speaker adaptation using speaker clustering
Ernest J. Pusateri, Timothy J. Hazen
Adaptive model combination for dynamic speaker selection training
Chao Huang, Tao Chen, Eric Chang
Unsupervised n-best based model adaptation using model-level confidence measures
Ka-Yan Kwan, Tan Lee, Chen Yang
LU factorization for feature transformation
Patrick Nguyen, Luca Rigazio, Christian Wellekens, Jean-Claude Junqua
Implementing vocal tract length normalization in the MLLR framework
Guo-Hong Ding, Yi-Fei Zhu, Chengrong Li, Bo Xu
Markov models based on speaker space model evolution
Dong Kook Kim, Nam Soo Kim
Robust speech recognition using inter-speaker and intra-speaker adaptation
Baojie Li, Keikichi Hirose, Nobuaki Minematsu
Continuous environmental adaptation of a speech recogniser in telephone line conditions
Carlos Lima, Luís B. Almeida, João L. Monteiro
Tree-structured maximum a posteriori adaptation for a segment-based speech recognition system
Irina Illina
Robust time-synchronous environmental adaptation for continuous speech recognition systems
Thomas Plötz, Gernot A. Fink
Unsupervised language model adaptation for lecture speech transcription
Thomas Niesler, Daniel Willett
Incremental on-line feature space MLLR adaptation for telephony speech recognition
Yongxin Li, Hakan Erdogan, Yuqing Gao, Etienne Marcheret
Enhanced histogram normalization in the acoustic feature space
Sirko Molau, Florian Hilger, Daniel Keysers, Hermann Ney
Blind normalization of speech from different channels and speakers
David N. Levin
Unsupervised acoustic model adaptation based on phoneme error minimization
Jun Ogata, Yasuo Ariki
Improved structural maximum likelihood eigenspace mapping for rapid speaker adaptation
Bowen Zhou, John H. L. Hansen
Statistical adaptation of acoustic models to noise conditions for robust speech recognition
Ángel de la Torre, Dominique Fohr, Jean-Paul Haton
Issues in automatic transcription of historical audio data
F. Brugnara, M. Cettolo, M. Federico, D. Giuliani
Same talker, different language: a replication
Verna Stockmal, Zinny S. Bond
Automatic language identification using acoustic sub-word units
A. K. V. Sai Jayram, V. Ramasubramanian, T. V. Sreenivas
Factors in human language identification
Ian Maddieson, Ioana Vasilescu
Approaches to language identification using Gaussian mixture models and shifted delta cepstral features
Pedro A. Torres-Carrasquillo, Elliot Singer, Mary A. Kohler, Richard J. Greene, Douglas A. Reynolds, J. R. Deller Jr.
Methods to improve Gaussian mixture model based language identification system
Eddie Wong, Sridha Sridharan
Part-of-speech tagging in French text-to-speech synthesis: experiments in tagset selection
Hongyan Jing, Evelyne Tzoukermann
Grapheme-to-phoneme conversion using pseudo-morphological units
Ulla Uebler
Investigations on joint-multigram models for grapheme-to-phoneme conversion
M. Bisani, Hermann Ney
Pronunciation of proper names with a joint n-gram model for bi-directional grapheme-to-phoneme conversion
Lucian Galescu, James F. Allen
The AT&T German text-to-speech system: realistic linguistic description
Matthias Jilka, Ann K. Syrdal
Generating script using statistical information of the context variation unit vector
Haiping Li, Fangxin Chen, Liqin Shen
Efficient and scalable methods for text script generation in corpus-based TTS design
Chih-Chung Kuo, Jing-Yi Huang
A statistically motivated database pruning technique for unit selection synthesis
Peter Rutten, Matthew P. Aylett, Justin Fackrell, Paul Taylor
A new method of building decision tree based on target information
Yi-Jian Wu, Yu Hu, Xiaoru Wu, Ren-Hua Wang
A context clustering technique for average voice model in HMM-based speech synthesis
Junichi Yamagishi, Masatsune Tamura, Takashi Masuko, Keiichi Tokuda, Takao Kobayashi
Feature extraction for unit selection in concatenative speech synthesis: comparison between AIM, LPC, and MFCC
Minoru Tsuzaki, Hisashi Kawai
Combined prosody and candidate unit selections for corpus-based text-to-speech systems
Francisco Campillo-Díaz, Eduardo R. Banga
Automatic segmentation combining an HMM-based approach and spectral boundary correction
Yeon-Jun Kim, Alistair Conkie
Refined speech segmentation for concatenative speech synthesis
Abhinav Sethy, Shrikanth S. Narayanan
Refocussing on the text normalisation process in text-to-speech systems
Andrew Breen, Barry Eggleton, Peter Dion, Steve Minnis
A text-to-speech synthesis system for Telugu
Jithendra Vepa, Jahnavi Ayachitam, K. V. K. Kalpana Reddy
Towards an intonation module for a Portuguese TTS system
Diamantino Freitas, Daniela Braga
Applying a hybrid intonation model to a seamless speech synthesizer
Takashi Saito, Masaharu Sakamoto
Using start/end timings of spectral transitions between phonemes in concatenative speech synthesis
Toshio Hirai, Seiichi Tenpaku, Kiyohiro Shikano
Design of a Mandarin sentence set for corpus-based speech synthesis by use of a multi-tier algorithm taking account of the varied prosodic and spectral characteristics
Jinfu Ni, Hisashi Kawai
A data-driven approach to source-formant type text-to-speech system
Hiroki Mori, Takahiro Ohtsuka, Hideki Kasuya
Power spectral density based channel equalization of large speech database for concatenative TTS system
Yu Shi, Eric Chang, Hu Peng, Min Chu
CU VOCAL: corpus-based syllable concatenation for Chinese speech synthesis across domains and dialects
Helen M. Meng, Chi Kin Keung, Kai Chung Siu, Tien Ying Fung, P. C. Ching
Perceptual evaluation of naturalness due to substitution of Chinese syllable for concatenative speech synthesis
Jinlin Lu, Hisashi Kawai
Reducing the footprint of the IBM trainable speech synthesis system
Dan Chazan, Ron Hoory, Zvi Kons, Dorel Silberstein, Alexander Sorin
Computationally efficient time-scale modification of speech using 3 level clipping
Sung-Joo Lee, Hyung Soon Kim
A miniature Chinese TTS system based on tailored corpus
Zhi-Wei Shuang, Yu Hu, Zhen-Hua Ling, Ren-Hua Wang
Phonetic normalization using z-score in segmental prosody estimation for corpus-based TTS system
Hoeun Song, Jaein Kim, Kyongrok Lee, Jinyoung Kim
On F0 trajectory optimization for very high-quality speech manipulation
Hideki Kawahara, Parham Zolfaghari, Alain de Cheveigné
Modeling tones in continuous Cantonese speech
Tan Lee, Greg Kochanski, Chilin Shih, Yujia Li
Pitch contour model for Chinese text-to-speech using CART and statistical model
Minghui Dong, Kim-Teng Lua
Basque intonation modelling for text to speech conversion
Eva Navas, Inmaculada Hernáez, Juan María Sánchez
Application of microprosody models in text to speech synthesis
Phuay Hui Low, Saeed Vaseghi
Prosodic phrasing with inductive learning
Sheng Zhao, Jianhua Tao, Lianhong Cai
Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model
Ben Milner, Xu Shao
Designing Japanese speech database covering wide range in prosody for hybrid speech synthesizer
Hiromichi Kawanami, Tsuyoshi Masuda, Tomoki Toda, Kiyohiro Shikano
Flexible multimodal human-machine interaction in mobile environments
Dirk Bühler, Wolfgang Minker, Jochen Häußler, Sven Krüger
Implementation testing of a hybrid symbolic/statistical multimodal architecture
Edward C. Kaiser, Philip R. Cohen
Belief network based disambiguation of object reference in spoken dialogue system for robot
Yoko Yamakata, Tatsuya Kawahara, Hiroshi G. Okuno
Specification and realisation of multimodal output in dialogue systems
Jonas Beskow, Jens Edlund, Magnus Nordstrand
Gestural trajectory symmetries and discourse segmentation
Francis Quek, Yingen Xiong, David McNeill
Gestural spatialization in natural discourse segmentation
Francis Quek, David McNeill, Robert Bryll, Mary Harper
Real-time sound source localization and separation for robot audition
Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano
CU Animate tools for enabling conversations with animated characters
Jiyong Ma, Jie Yan, Ronald Cole
Multiparty multimodal interaction: a preliminary analysis
Philip R. Cohen, Rachel Coulston, Kelly Krout
Distributed audio-visual speech synchronization
Peter Poller, Jochen Müller
Lip-reading based on a fully automatic statistical model
Philippe Daubias, Paul Deléglise
Audio-visual continuous speech recognition using a coupled hidden Markov model
Xiaoxing Liu, Yibao Zhao, Xiaobo Pi, Luhong Liang, Ara V. Nefian
Data, annotation schemes and coding tools for natural interactivity
Laila Dybkjær, Niels Ole Bernsen
VisSTA: a tool for analyzing multimodal discourse data
Francis Quek, Yang Shi, Cemil Kirbas, Shunguang Wu
The influence of identification training on identification and production of the American English mid and low vowels by native speakers of Japanese
Stephen Lambacher, William Martens, Kazuhiko Kakehi
Perceptual learning of second-language syllable rhythm by elderly listeners
Keiichi Tajima, Reiko Akahane-Yamada, Tsuneo Yamada
Perceptual adjustment to foreign-accented English with short term exposure
Constance M. Clarke
Absolute pitch and lexical tones: tone perception by non-musician, musician, and absolute pitch non-tonal language speakers
Denis K. Burnham, Ron Brooker
Comprehension of non-native speech: inaccurate phoneme processing and activation of lexical competitors
Mirjam Broersma
Overview on recent activities in speech understanding and dialogue systems evaluation
Wolfgang Minker
DARPA Communicator: cross-system results for the 2001 evaluation
Marilyn A. Walker, Alexander I. Rudnicky, Rashmi Prasad, John Aberdeen, Elizabeth Owen Bratt, John S. Garofolo, Helen Hastie, Audrey N. Le, Bryan Pellom, Alex Potamianos, Rebecca Passonneau, Salim Roukos, Gregory A. Sanders, Stephanie Seneff, David Stallard
DARPA Communicator evaluation: progress from 2000 to 2001
Marilyn A. Walker, Alexander I. Rudnicky, John Aberdeen, Elizabeth Owen Bratt, John S. Garofolo, Helen Hastie, Audrey N. Le, Bryan Pellom, Alex Potamianos, Rebecca Passonneau, Rashmi Prasad, Salim Roukos, Gregory A. Sanders, Stephanie Seneff, David Stallard
Effects of word error rate in the DARPA Communicator data during 2000 and 2001
Gregory A. Sanders, Audrey N. Le, John S. Garofolo
Subset languages for conversing with collaborative interface agents
Candace L. Sidner, Clifton Forlines
Transformation of spectral envelope for voice conversion based on radial basis function networks
Tomomi Watanabe, Takahiro Murakami, Munehiro Namba, Tetsuya Hoya, Yoshihisa Ishida
Subband based voice conversion
Oytun Turk, Levent M. Arslan
Evaluation of cross-language voice conversion using bilingual and non-bilingual databases
Mikiko Mashimo, Tomoki Toda, Hiromichi Kawanami, Hideki Kashioka, Kiyohiro Shikano, Nick Campbell
Voice transformations for improving children's speech recognition in a publicly available dialogue system
Joakim Gustafson, Kåre Sjölander
The ISL meeting corpus: the impact of meeting type on speech style
Susanne Burger, Victoria MacLaren, Hua Yu
A new method for testing dialogue systems based on simulations of real-world conditions
R. López-Cózar, Ángel de la Torre, José C. Segura, Antonio J. Rubio, J. M. López-Soler
Comfort noise detection and GSM-FR-codec detection for speech-quality evaluations in telephone networks
Thorsten Ludwig
Validation and improvement of automatic phonetic transcriptions
Catia Cucchiarini, Diana Binnenpoorte
Development of Japanese infant speech database and speaking rate analysis
Shigeaki Aman, Kazumi Kato, Tadahisa Kondo
Automatic prosodic break labeling for Mandarin Chinese speech data
Minghui Dong, Kim-Teng Lua
Orientel: speech-based interactive communication applications for the Mediterranean and the Middle East
Imed Zitouni, Joseph Olive, Dorota Iskra, Khalid Choukri, Ossama Emam, Oren Gedge, Emmanuel Maragoudakis, Herbert Tropf, Asunción Moreno, Albino Nogueiras Rodriguez, Barbara Heuft, Rainer Siemund
The reliability of the ITU-T P.85 standard for the evaluation of text-to-speech systems
Yolanda Vazquez Alvarez, Mark Huckvale
Automatic generation of phonetic transcriptions for large speech corpora
Kris Demuynck, Tom Laureys, Steven Gillis
The Carnegie Mellon Communicator corpus
Christina Bennett, Alexander I. Rudnicky
GlobalPhone: a multilingual speech and text database developed at Karlsruhe University
Tanja Schultz
On developing new text and audio corpora and speech recognition tools for the Turkish language
Özgül Salor, Bryan Pellom, Tolga Çiloglu, Kadri Hacioglu, Mübeccel Demirekler
FORM: an extensible, kinematically-based gesture annotation scheme
Craig Martell
Automatic phoneme alignment based on acoustic-phonetic modeling
John-Paul Hosom
Extracting clauses for spoken language understanding in conversational systems
Narendra K. Gupta, Srinivas Bangalore, Mazin Rahim
Issues in the development of a stochastic speech understanding system
F. Lefèvre, H. Bonneau-Maynard
10 years of PhonDat-II: a reassessment
Hartmut R. Pfitzinger
Risk based lattice cutting for segmental minimum Bayes-risk decoding
Shankar Kumar, William Byrne
Dynamic search-space pruning for time-constrained speech recognition
Sascha Wendt, Gernot A. Fink, Franz Kummert
A Gaussian selection method for multi-mixture HMM based continuous speech recognition
Raymond H. Lee, Eric H. C. Choi
On use of duration modeling for continuous digits speech recognition
Rong Dong, Jie Zhu
Arc minimization in finite state decoding graphs with cross-word acoustic context
Geoffrey Zweig, George Saon, F. Yvon
Fast hierarchical grammar optimization algorithm toward time and space efficiency
Jing Zheng, Horacio Franco
Dynamic tuning of language model score in speech recognition using a confidence measure
Sherif Abdou, Michael Scordilis
Minimum perfect hashing for fast n-gram language model lookup
Xiao Zhang, Yunxin Zhao
Combining search spaces of heterogeneous recognizers for improved speech recognition
Xiang Li, Rita Singh, Richard M. Stern
Transmission characteristics of outer ear canal
Karel Pellant, Jan Mejzlík, Karel Prikryl, Zdenek Skvor
Hearing-aid benefits and limitations: predictions from a cochlear model
James M. Kates
A psychoacoustic basis for spectral sharpening
Peggy B. Nelson, Jeffrey J. DiGiovanni, Robert S. Schlauch
Model-based predictions of intensity discrimination for normal- and impaired-hearing listeners
Lisa G. Huettel, Leslie M. Collins
Modeling the perception of frequency-shifted vowels
Peter F. Assmann, Terrance M. Nearey, Jack M. Scott
The relationship between pure-tone sequential stream segregation and perceptual separation of male and female talkers by listeners with hearing loss
Carol L. Mackersie
A phoneme recognizer for the hearing impaired
Mathias Johansson, Mats Blomberg, Kjell Elenius, Lars-Erik Hoffsten, Anders Torberger
Likelihood combination and recognition output voting for the decoding of non-native speech with multilingual HMMs
V. Fischer, E. Janke, S. Kunzmann
Stochastic trajectory model analysis for accent classification
Pongtep Angkititrakul, John H. L. Hansen
Multilingual pronunciation modeling for improving multilingual speech recognition
Jilei Tian, Juha Häkkinen, Olli Viikki
On text-based language identification for multilingual speech recognition systems
Jilei Tian, Juha Häkkinen, Søren Riis, Kåre Jean Jensen
Multilingual speech recognition with language identification
Bin Ma, Cuntai Guan, Haizhou Li, Chin-Hui Lee
Robust HMM training for unified Dutch and German speech recognition
Rathi Chengalvarayan
Using cross-language cues for story-specific language modeling
Sanjeev Khudanpur, Woosung Kim
Full-text story alignment models for Chinese-English bilingual news corpora
Bing Zhao, Stephan Vogel
Comparison of acoustic distance measures for automatic cross-language phoneme mapping
Jayren J. Sooful, Elizabeth C. Botha
Maximum expected likelihood based model selection and adaptation for nonnative English speakers
Xiaodong He, Yunxin Zhao
Integration of MLLR adaptation with pronunciation proficiency adaptation for non-native speech recognition
Nobuaki Minematsu, Gakuto Kurata, Keikichi Hirose
Native and Vietnamese production of compound and phrasal stress patterns
Thu Nguyen, John Ingram
On the function of the late rise and the early fall in Dutch dialogue: a perception experiment
Johanneke Caspers
Holds as gestural correlates to empty and filled speech pauses
Anna Esposito, Susan Duncan, Francis Quek
Linguistic and acoustic changes of user's utterances caused by different dialogue situations
Toshihiko Itoh, Atsuhiko Kai, Tatsuhiro Konishi, Yukihiro Itoh
Automatic user-adaptive speaking rate selection for information delivery
Nigel Ward, Satoshi Nakagawa
Coordination of referring expressions in multimodal human-computer dialogue
Gabriel Skantze
A comparison between feedback strategies in human-to-human and human-machine communication
Loredana Cerrato
Adaptation of users' spoken dialogue patterns in a conversational interface
Courtney Darves, Sharon Oviatt
Unsupervised speaker segmentation of telephone conversations
Aaron E. Rosenberg, Allen Gorin, Zhu Liu, S. Parthasarathy
An effective unsupervised scheme for multiple-speaker-change detection
P. Sivakumaran, A.M. Ariyaeeinia, J. Fortuna
Unknown-multiple speaker clustering using HMM
J. Ajmera, Hervé Bourlard, I. Lapidot, Iain A. McCowan
Speaker utterances tying among speaker segmented audio documents using hierarchical classification: towards speaker indexing of audio databases
Sylvain Meignier, Jean-François Bonastre, Ivan Magrin-Chagnolleau
A comparative study of adaptation methods for speaker verification
Johnny Mariéthoz, Samy Bengio
Speaker verification with data fusion and model adaptation
Kevin R. Farrell
An adaptive speaker verification system with speaker dependent a priori decision thresholds
Nikki Mirghafori, Larry P. Heck
A trainable spoken language understanding system for visual object selection
Deb Roy, Peter Gorniak, Niloy Mukherjee, Josh Juster
Named entity extraction from spontaneous speech in How May I Help You?
F. Béchet, Allen Gorin, Jerry Wright, D. Hakkani Tur
Recognition error processing for speech understanding
Caroline Bousquet-Vernhettes, Nadine Vigouroux
Using part-of-speech tags, context thresholding, and trigram contexts to improve the auto-induction of semantic classes
Andrew Pargellis, Eric Fosler-Lussier, Augustine Tsai
Combination of statistical and rule-based approaches for spoken language understanding
Ye-Yi Wang, Alex Acero, Ciprian Chelba, Brendan Frey, Leon Wong
Chinese spoken language analyzing based on combination of statistical and rule methods
Guodong Xie, Chengqing Zong, Bo Xu
A maximum entropy semantic parser using word classes
Norbert Pfannerer
Speech watermarking through parametric modeling
A. Gurijala, J. R. Deller Jr., M. S. Seadle, John H. L. Hansen
An education software in teaching automatic speech recognition (ASR)
Kai Sze Hong, Sh-Hussain Salleh
Multimodal integration patterns in children
Benfang Xiao, Cynthia Girand, Sharon Oviatt
ASR in a human word recognition model: generating phonemic input for Shortlist
Odette Scharenborg, Lou Boves, Johan de Veth
Sign language translation using an error tolerant retrieval algorithm
Chung-Hsien Wu, Yu-Hsien Chiu, Kung-Wei Cheng
A sound source classification system based on subband processing
Oytun Turk, Omer Sayli, Helin Dutagaci, Levent M. Arslan
Automatic sign translation
Ying Zhang, Bing Zhao, Jie Yang, Alex Waibel
A study on the classification of whispered and normally phonated speech
Stanley J. Wenndt, Edward J. Cupples, Richard M. Floyd
Experiments on recognition of lavalier microphone speech and whispered speech in real world environments
Kiyoshi Tatara, Taisuke Ito, Parham Zolfaghari, Kazuya Takeda, Fumitada Itakura
An effect of amplitude modulation on perceptual segregation of tone sequences
Mamoru Iwaki, Hiromi Seki
Automatic recognition of Dutch dysarthric speech: a pilot study
Eric Sanders, Marina Ruiter, Lilian Beijer, Helmer Strik
Evaluation of a system for concatenative articulatory visual speech synthesis
Olov Engwall
Intrasyllabic articulatory control constraints in verbal working memory
Marc Sato, Jean-Luc Schwartz, Marie-Agnès Cathiard, Christian Abry, Hélène Loevenbruck
Towards a grammar of spoken language: incorporating paralinguistic information
Nick Campbell
An analysis of the causes of increased error rates in children's speech recognition
Qun Li, Martin J. Russell
A new computer-based analytical speech perception test for prelingually deaf children and children with speech disorders
Anne-Marie Öster
Vocalization age as a clinical tool
Harriet J. Fell, Joel MacAuslan, Linda J. Ferrier, Susan G. Worst, Karen Chenausky
Baldini: Baldi speaks Italian!
Piero Cosi, Michael M. Cohen, Dominic W. Massaro
Eyebrow movements and voice variations in dialogue situations: an experimental investigation
Christian Cavé, Isabelle Guaïtella, Serge Santi
State clustering improvements for continuous HMMs in a Spanish large vocabulary recognition system
R. Córdoba, J. Macías-Guarasa, J. Ferreiros, J. M. Montero, José M. Pardo
A comparison of HTK, ISIP and Julius in Slovenian large vocabulary continuous speech recognition
Tomaz Rotovnik, Mirjam Sepesy Maucec, Bogomir Horvat, Zdravko Kacic
Parametric trajectory segment model for LVCSR
Lei Jia, Bo Xu
Efficient precalculation of LM contexts for large vocabulary continuous speech recognition
F. Javier Diéguez-Tirado, Antonio Cardenal-López
Integrating multiple pronunciations during MCE-based acoustic model training for large vocabulary speech recognition
Rathi Chengalvarayan
A hybrid approach to compounds in LVCSR
Tom Laureys, Vincent Vandeghinste, Jacques Duchateau
A confidence measure based on agreement among multiple LVCSR models - correlation between pair of acoustic models and confidence
Takehito Utsuro, Tetsuji Harada, Hiromitsu Nishizaki, Seiichi Nakagawa
Combining lexical and morphological knowledge in language model for inflectional (Czech) language
Jan Nouza, Jindra Drabkova
Modeling frequent allophones in Japanese speech recognition
Long Nguyen, Xuefeng Guo, John Makhoul
The structure and its implementation of hidden dynamic HMM for Mandarin speech recognition
Feili Chen, Jie Zhu, Wentao Song
A new lexicon optimization method for LVCSR based on linguistic and acoustic characteristics of words
Takahiro Shinozaki, Sadaoki Furui
Retrieving phrases by selecting the history: application to automatic speech recognition
David Langlois, Kamel Smaïli, Jean-Paul Haton
Compact subnetwork-based large vocabulary continuous speech recognition
Dong-Hoon Ahn, Minhwa Chung
A comparison of four language models for large vocabulary Turkish speech recognition
Helin Dutagaci, Levent M. Arslan
Speech recognition for language teaching and evaluating: a study of existing commercial products
Rebecca Hincks
Automatic intelligibility assessment and diagnosis of critical pronunciation errors for computer-assisted pronunciation learning
Antoine Raux, Tatsuya Kawahara
Effects of production training with visual feedback on the acquisition of Japanese pitch and durational contrasts
Yukari Hirata
Acoustic modeling of sentence stress using differential features between syllables for English rhythm learning system development
Nobuaki Minematsu, Satoshi Kobashikawa, Keikichi Hirose, Donna Erickson
Modeling and automatic detection of English sentence stress for computer-assisted English prosody learning system
Kazunori Imoto, Yasushi Tsubota, Antoine Raux, Tatsuya Kawahara, Masatake Dantsuji
Recognition and verification of English by Japanese students for computer-assisted language learning system
Yasushi Tsubota, Tatsuya Kawahara, Masatake Dantsuji
Feedback in computer assisted pronunciation training: technology push or demand pull?
Ambra Neri, Catia Cucchiarini, Helmer Strik
Corpus-based analysis of English spoken by Japanese students in view of the entire phonemic system of English
Nobuaki Minematsu, Gakuto Kurata, Keikichi Hirose
Computer-assisted second-language speech learning: generalization of prosody-focused training
Debra M. Hardison
Predicting oral reading miscues
Jack Mostow, Joseph Beck, S. Vanessa Winter, Shaojun Wang, Brian Tobin
Implementation of an intonational quality assessment system
Chanwoo Kim, Wonyong Sung
English CALL system with functions of speech segmentation and pronunciation evaluation using speech recognition technology
Yasuo Ariki, Jun Ogata
Perception of tone and vowel quantity in Thai
Hansjörg Mixdorff, Sudaporn Luksaneeyanawin, Hiroya Fujisaki, Patavee Charnvivit
Duration and F0 as perceptual cues to Japanese vowel quantity
Keisuke Kinoshita, Dawn M. Behne, Takayuki Arai
Effects of intra-phrase position on acceptability of changes in segmental duration in sentence speech
Makiko Muto, Hiroaki Kato, Minoru Tsuzaki, Yoshinori Sagisaka
Perception of prosodic phrasing by hearing-impaired listeners
Dragana Barac-Cikoja, Sally Revoile
Processing of temporal cues marking phrasal boundaries in individuals with brain damage
Wendi A. Aasland, Shari R. Baum
A real-time acoustic human-machine front-end for multimedia applications integrating robust adaptive beamforming and stereophonic acoustic echo cancellation
W. Herbordt, J. Ying, H. Buchner, W. Kellermann
Enhancement of single channel speech using perception-based wavelet transform
Ching-Ta Lu, Hsiao-Chuan Wang
Speech enhancement based on a perceptual modification of Wiener filtering
L. Lin, W. H. Holmes, E. Ambikairajah
A new approach to speech enhancement by a microphone array using EM and mixture models
Hagai Attias, Li Deng
Acoustic echo cancellation based on M-channel IIR cosine-modulated filter bank
Sang G. Kim, Chang D. Yoo
Speech enhancement in car environment using blind source separation
Hiroshi Saruwatari, Katsuyuki Sawai, Akinobu Lee, Kiyohiro Shikano, Atsunobu Kaminuma, Masao Sakata
Speech enhancement based on combining perceptual enhancement and short-time spectral attenuation
I. Potamitis, Nikos Fakotakis, George Kokkinakis
Suitable design of adaptive beamformer based on average speech spectrum for noisy speech recognition
Takanobu Nishiura, Satoshi Nakamura, Yuka Okada, Takeshi Yamada, Kiyohiro Shikano
Highly oversampled subband adaptive filters for noise cancellation on a low-resource DSP system
King Tam, Hamid Sheikhzadeh, Todd Schneider
A perceptually motivated subspace approach for speech enhancement
Yi Hu, Philipos C. Loizou
Speech enhancement based on generalized singular value decomposition approach
Gwo-hwa Ju, Lin-shan Lee
Subspace speech enhancement using subband whitening filter
Jong Uk Kim, Chang D. Yoo
Speech enhancement using wavelet packet transform
Sungwook Chang, Sungil Jung, Y. Kwon, Sung-il Yang
Sequential MAP noise estimation and a phase-sensitive model of the acoustic environment
Li Deng, Jasha Droppo, Alex Acero
Auditory fovea based speech enhancement and its application to human-robot dialog system
Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano
A spatio-temporal speech enhancement scheme for robust speech recognition
Erik Visser, Manabu Otsuka, Te-Won Lee
Comparative evaluation of CASA and BSS models for subband cocktail-party speech separation
Frédéric Berthommier, Seungjin Choi
Speech enhancement in non-stationary noise environments
Hyoung-Gook Kim, Dietmar Ruwisch
The 2ch hybrid subtractive beamformer applied to line sound sources
Mitsunori Mizumachi, Satoshi Nakamura
High performance digit recognition in real car environments
Umit Yapanel, Xianxian Zhang, John H. L. Hansen
Multiple regression of log-spectra for in-car speech recognition
Tetsuya Shinde, Kazuya Takeda, Fumitada Itakura
Experiments on speaker-independent voice command recognition using in-vehicle hands free speech
Yifan Gong, Lorin Netsch
Application of over-complete blind source separation for robust automatic speech recognition
Shubha Kadambe
Porting channel robustness across languages
Françoise Beaufays, Daniel Boies, Mitch Weintraub
An efficient dialogue control method using decision tree-based estimation of out-of-vocabulary word attributes
Yasuhiro Takahashi, Kohji Dohsaka, Kiyoaki Aikawa
Semantic inference: a data-driven solution for NL interaction
Jerome R. Bellegarda
Unified task knowledge for spoken language understanding and dialog management
Jerry Wright, Alicia Abella, Allen Gorin
Distributed Chinese keyword spotting and verification for spoken dialogues under wireless environment
Yun-Tien Lee, Cheng-Huang Wu, Yumin Lee, Lin-shan Lee
A method for evaluating incremental utterance understanding in spoken dialogue systems
Ryuichiro Higashinaka, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa
Detection and recognition of repaired speech on misrecognized utterances for speech input of car navigation system
Naoko Kakutani, Norihide Kitaoka, Seiichi Nakagawa
Ingressive speech as an indication that humans are talking to humans (and not to machines)
Robert Eklund
Compensating for hyperarticulation by modeling articulatory properties
Hagen Soltau, Florian Metze, Alex Waibel
Forms of introduction in map task dialogues: case of L2 Russian speakers
Olga V. Goubanova
Bridges: regions between discourse segments
Nanette M. Veilleux
Robust semantic confidence scoring
Didier Guillevic, Simona Gandrabur, Yves Normandin
Statistically based approach to rejection of incorrectly recognized words
Ludek Müller, Tomás Bartos
Learning decision trees to determine turn-taking by spoken dialogue systems
Ryo Sato, Ryuichiro Higashinaka, Masafumi Tamoto, Mikio Nakano, Kiyoaki Aikawa
Integration of phonetic length properties in the acoustic models of false starts and out-of-vocabulary words
H. Hamimed, G. Damnati
N-word-sequence frequency noise mitigation for SLM based on binomial distribution
Yibao Zhao, Guojun Zhou
Combining acoustic and language information for emotion recognition
Chul Min Lee, Shrikanth S. Narayanan, Roberto Pieraccini
A figure of merit for the analysis of spoken dialog systems
Kadri Hacioglu, Wayne Ward
Selective back-off smoothing for incorporating grammatical constraints into the n-gram language model
Tomoyosi Akiba, Katunobu Itou, Atsushi Fujii, Tetsuya Ishikawa
Backoff hierarchical class n-gram language modelling for automatic speech recognition systems
Imed Zitouni, Olivier Siohan, Hong-Kwang Jeff Kuo, Chin-Hui Lee
Constructing small language models from grammars
Francis Picard, Dominique Boucher, Guy Lapalme
Improve latent semantic analysis based language model by integrating multiple level knowledge
Rong Zhang, Alexander I. Rudnicky
Individual word language models and the frequency approach
Elvira I. Sicilia-Garcia, Ji Ming, F. Jack Smith
SRILM - an extensible language modeling toolkit
Andreas Stolcke
Efficient construction of long-range language models using log-linear interpolation
E. W. D. Whittaker, D. Klakow
Integration of two stochastic context-free grammars
Anna Corazza
Grammar specialisation meets language modelling
Manny Rayner, Beth Ann Hockey, John Dowding
Maximum entropy model for punctuation annotation from speech
Jing Huang, Geoffrey Zweig
An automatic sentence boundary detector based on a structured language model
Shinsuke Mori
Improved katz smoothing for language modeling in speech recognition
Genqing Wu, Fang Zheng, Wenhu Wu, Mingxing Xu, Ling Jin
On the use of structures in language models for dialogue
Renato De Mori, Yannick Estève, Christian Raymond
Semantic structured language models
Hakan Erdogan, Ruhi Sarikaya, Yuqing Gao, Michael Picheny
Statistical language modeling with prosodic boundaries and its use for continuous speech recognition
Keikichi Hirose, Nobuaki Minematsu, Makoto Terao
Noise robust speech recognition using F0 contour extracted by hough transform
Koji Iwano, Takahiro Seki, Sadaoki Furui
Sharing relative stress of cross-word syllables and lexical stress to spontaneous speech recognition
Farshad Almasganj, Farhad D. Dehnavi, Mahmood Bijankhan
Automatic punctuation and disfluency detection in multi-party meetings using prosodic and lexical cues
Don Baron, Elizabeth Shriberg, Andreas Stolcke
Pitch accent prediction using ensemble machine learning
Xuejing Sun
Quantitative evaluation of relevant prosodic factors for text-to-speech synthesis in Spanish
D. Escudero-Mancebo, C. González-Ferreras, V. Cardeñoso-Payo
Tone recognition in Thai continuous speech based on coarticulation, intonation and stress effects
Nuttakorn Thubthong, Boonserm Kijsirikul, Sudaporn Luksaneeyanawin
Combination of pause and F0 information in dependency analysis of Japanese sentences
Kazuyuki Takagi, Hajime Kubota, Kazuhiko Ozeki
Estimating syntactic structure from F0 contour and pause duration in Japanese speech
Yasuo Horiuchi, Tomoko Ohsuga, Akira Ichikawa
Extraction of important sentences using F0 information for speech summarization
Yoichi Yamashita, Akira Inoue
Influence of prosody, context, and word order in the identification of focus in Japanese dialogue
Tatsuya Kitamura, Kayo Itoh, Toshihiko Itoh, Shigeyoshi Kitazawa
Influence of different dialogue situations on user's behavior in spoken corrections
Atsuhiko Kai, Yukari Nonomura, Toshihiko Itoh, Tatsuhiro Konishi, Yukihiro Itoh
Interpreting meaning from context: modeling the prosody of discourse markers in speech
Li-chiung Yang
Prosodic parameter for speaker identification
Katarina Bartkova, David Le Gac, Delphine Charlet, Denis Jouvet
Juncture segmentation of Japanese prosodic unit based on the spectrographic features
Shigeyoshi Kitazawa, Toshihiko Itoh, Tatsuya Kitamura
Kymographic imaging of the vocal fold oscillations
Jan G. Svec, Frantisek Sram
Assessment of consonant articulation in glossectomee speech by dynamic MRI
K. Mády, R. Sader, A. Zimmermann, P. Hoole, A. Beer, H.-F. Zeilhofer, Ch. Hannig
An EPG therapy protocol for remediation and assessment of articulation disorders
Alan Wrench, Fiona Gibbon, Alison M. McNeill, Sara Wood
How speakers with and without speech impairment mark the question statement contrast
Rupal Patel
Vowel classification for computer-based visual feedback for speech training for the hearing impaired
Stephen A. Zahorian, A. Matthew Zimmer, Fansheng Meng
All-pole modeling of wide-band speech using weighted sum of the LSP polynomials
Paavo Alku, Tom Bäckström
Analysis and synthesis of the phonatory excitation signal by means of a pair of polynomial shaping functions
Jean Schoentgen
Optimal speech signal partition into one-quasiperiodical segments
Taras K. Vintsiuk
Sparse and independent representations of speech signals based on parametric models
Hugo L. Rufiner, Luis F. Rocha, John Goddard Close
Improvement of the ELS-based time-varying complex speech analysis
Keiichi Funaki
Maximum mutual information training of hidden Markov models with vector linear predictors
K. K. Chin, P. C. Woodland
A sparse modeling approach to speech recognition based on relevance vector machines
J. E. Hamaker, J. Picone, A. Ganapathiraju
Mutual information phone clustering for decision tree induction
Ciprian Chelba, Rachel Morton
Rethinking derived acoustic features in speech recognition
Kevin S. Van Horn
Modeling HMM state distributions with Bayesian networks
Konstantin Markov, Satoshi Nakamura
Discriminative linear transforms for feature normalization and speaker adaptation in HMM estimation
Stavros Tsakalidis, Vlasios Doumpiotis, William Byrne
Speaking rate compensation based on likelihood criterion in acoustic model training and decoding
Kozo Okuda, Tatsuya Kawahara, Satoshi Nakamura
Combining maximum likelihood and maximum a posteriori estimation for detailed acoustic modeling of context dependency
Michiel Bacchiani
Large vocabulary conversational speech recognition with the extended maximum likelihood linear transformation (EMLLT) model
Jing Huang, Vaibhava Goel, Ramesh Gopinath, Brian Kingsbury, Peder Olsen, Karthik Visweswariah
Modeling varying pauses to develop robust acoustic models for recognizing noisy conversational speech
Jin-Song Zhang, Satoshi Nakamura
Improving phone-level discrimination in LDA with subphone-level classes
Hwa Jeon Song, Hyung Soon Kim
A combined model of statics-dynamics of speech optimized using maximum mutual information
Zhijian Ou, Zuoying Wang
Syllable recognition using syllable-segment statistics and syllable-based HMM
Nobutoshi Takahashi, Seiichi Nakagawa
Recurrent neural network-enhanced HMM speech recognition systems
J. W. F. Thirion, Elizabeth C. Botha
Sharing trend information of trajectory in segmental-feature HMM
Young-Sun Yun
Framewise phone classification using support vector machines
Jesper Salomon, Simon King
A state-tying approach to building syllable HMMs
Darryl Stewart, Ming Ji, Philip Hanna, F. Jack Smith
Recognition of continuous speech segments of monophone units using support vector machines
Weifeng Lee, C. Chandra Sekhar, Kazuya Takeda, Fumitada Itakura
Construction of decision tree from data driven clustering
Junho Park, Hanseok Ko
Selective multi-path acoustic model based on database likelihoods
Akinobu Lee, Yuuichiro Mera, Hiroshi Saruwatari, Kiyohiro Shikano
Auxiliary variables in conditional Gaussian mixtures for automatic speech recognition
Todd A. Stephenson, Mathew Magimai-Doss, Hervé Bourlard
Constructing shared-state hidden Markov models based on a Bayesian approach
Shinji Watanabe, Yasuhiro Minami, Atsushi Nakamura, Naonori Ueda
Generalization of state-observation-dependency in partly hidden Markov models
Tetsuji Ogawa, Tetsunori Kobayashi
Laryngoscopic analysis of tibetan chanting modes and their relationship to register in sino-tibetan
John H. Esling
A corpus-based study of danish laryngealization
Kathleen Murray, Betina Simonsen
Variability in direction of dorsal movement during production of /l/
Natasha Warner, Allard Jongman, Doris Mücke
Segmentation of glides with tonal alignment as reference
Yi Xu, Fang Liu
Variability in the production of glottalized sonorants: data from yapese
Ian Maddieson, Julie Larson
A phonetic study of vietnamese tones: acoustic and electroglottographic measurements
Vu Ngoc Tuan, Christophe d'Alessandro, Sophie Rosset
Segment duration in spoken korean
Hyunsong Chung
Pause duration and variability in read texts
Elena Zvonik, Fred Cummins
Intrinsic phone durations are speaker-specific
Hartmut R. Pfitzinger
Preaspirated stops in southern Swedish
Mechtild Tronnier
Stop epenthesis at syllable boundaries
Natasha Warner, Andrea Weber
An analysis of transcription consistency in spontaneous speech from the buckeye corpus
William D. Raymond, Mark Pitt, Keith Johnson, Elizabeth Hume, Matthew Makashay, Robin Dautricourt, Craig Hilts
Contextual effects on voicing judgment of stop consonants in Japanese
Makiko Aoyagi
Discrimination of English vowels in consonantal contexts by native speakers of Japanese and its relations to dynamic information of formants
Akiyo Joto, Motohisa Imaishi, Yoshiki Nagase, Seiya Funatsu
Improving spoken language understanding using word confusion networks
Gokhan Tur, Jerry Wright, Allen Gorin, Giuseppe Riccardi, Dilek Hakkani-Tür
Improving latent semantic indexing based classifier with information gain
Li Li, Wu Chou
Discriminative training for call classification and routing
Hong-Kwang Jeff Kuo, Chin-Hui Lee, Imed Zitouni, Eric Fosler-Lussier, Egbert Ammicht
Speech and language processing for a constrained speech translation system
Stephen Cox
Automatic concept identification in goal-oriented conversations
Ananlada Chotimongkol, Alexander I. Rudnicky
Using EM-trained string-edit distances for approximate matching of acoustic morphemes
Michael Levit, Elmar Nöth, Allen Gorin
Speech-enabled natural language call routing: BBN call director
Premkumar Natarajan, Rohit Prasad, Bernhard Suhm, Daniel McCarthy
Weighted graph based decision tree optimization for high accuracy acoustic modeling
Sheng Gao, Jin-Song Zhang, Satoshi Nakamura, Chin-Hui Lee, Tat-seng Chua
Speech recognition using syllable patterns
Li Zhang, William H. Edmondson
A comparison of L1 and african-mother-tongue acoustic models for south african English speech recognition
Janus D. Brink, Elizabeth C. Botha
Speech modeling using variational Bayesian mixture of Gaussians
Panu Somervuo
On the use of Gaussian mixture model for speaker variability analysis
Tao Chen, Chao Huang, Eric Chang, Jingchun Wang
Models of speech dynamics in a segmental-HMM recognizer using intermediate linear representations
Philip J.B. Jackson, Martin J. Russell
Decision tree distribution tying based on a dimensional split technique
Heiga Zen, Keiichi Tokuda, Tadashi Kitamura
Speech synthesis, speech simulation and speech science
Mark Huckvale
Expressive speech synthesis using a concatenative synthesizer
Murtaza Bulut, Shrikanth S. Narayanan, Ann K. Syrdal
Eigenvoices for HMM-based speech synthesis
Kengo Shichiri, Atsushi Sawabe, Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura
Combining information sources for memory-based pitch accent placement
Erwin Marsi, Bertjan Busser, Walter Daelemans, Veronique Hoste, Martin Reynaert, Antal van den Bosch
Eye-fixation as a measure of real-time processing of synthesized words
Mary D. Swift, Ellen Campana, James F. Allen, Michael K. Tanenhaus
User-tailored generation for spoken dialogue: an experiment
Amanda Stent, Marilyn A. Walker, Steve Whittaker, Preetam Maloor
A system that learns to describe objects in visual scenes
Deb Roy
Integration of supra-lexical linguistic models with speech recognition using shallow parsing and finite state transducers
Xiaolong Mou, Stephanie Seneff, Victor Zue
EM training of finite-state transducers and its application to pronunciation modeling
Han Shu, I. Lee Hetherington
Finite-state transducer based hungarian LVCSR with explicit modeling of phonological changes
Máté Szarvas, Sadaoki Furui
Using dynamic WFST composition for recognizing broadcast news
Diamantino Caseiro, Isabel Trancoso
Transducer search space modelings for large-vocabulary speech recognition
Hans J. G. A. Dolfing
A comparison of two LVR search optimization techniques
Stephan Kanthak, Hermann Ney, Michael Riley, Mehryar Mohri
An efficient algorithm for the n-best-strings problem
Mehryar Mohri, Michael Riley
Structural Gaussian mixture models for efficient text-independent speaker verification
Bing Xiang, Toby Berger
Text-dependent speaker verification using lyapunov exponents
A. Petry, Dante A. C. Barone
User-customized password speaker verification based on HMM/ANN and GMM models
Mohamed F. BenZeghiba, Hervé Bourlard
Exploiting support vector machines in hidden Markov models for speaker verification
Dong Xin, Zhaohui Wu, Yingchun Yang
Speaker identification by location in an optimal space of anchor models
Yassine Mami, Delphine Charlet
ASR dependent techniques for speaker identification
Alex Park, Timothy J. Hazen
Factor analyzed Gaussian mixture models for speaker identification
Peng Ding, Yang Liu, Bo Xu
Phonetic speaker identification
Qin Jin, Tanja Schultz, Alex Waibel
DETAC: a discriminative criterion for speaker verification
Jirí Navrátil, Ganesh N. Ramaswamy
Hierarchical Gaussian mixture model for speaker verification
Ming Liu, Eric Chang, Bei-qian Dai
A reverse turing test using speech
Greg Kochanski, Daniel Lopresti, Chilin Shih
On effective speaker verification based on subword model
Sungjoo Ahn, Sunmee Kang, Hanseok Ko
Speaker verification using Gaussian component strings in dynamic trajectory space
Bing Xiang
Combining speaker and speech recognition systems
Larry P. Heck, Dominique Genoud
Automatic enrollment for speaker authentication
Qi Li, Hui Jiang, Qiru Zhou, Jinsong Zheng
Experiments in confidence scoring for word and sentence verification
M. Andorno, P. Laface, Roberto Gemello
Confidence metrics for speaker identification
Mark C. Huggins, John J. Grieco
Characteristics of a low reject mode speaker verification system
Daniel Elenius, Mats Blomberg
Special session: issues in audiovisual spoken language processing (when, where, and how?)
Lynne E. Bernstein, Denis K. Burnham, Jean-Luc Schwartz
Audio-visual speech enhancement with AVCDCN (audio-visual codebook dependent cepstral normalization)
Sabine Deligne, Gerasimos Potamianos, Chalapathy Neti
Audiovisual speech synthesis. from ground truth to models
Gérard Bailly
The stimulus as basis for audiovisual integration
Eric Vatikiotis-Bateson, Harold Hill, Miyuki Kamachi, Karen Lander, Kevin G. Munhall
The perceptual basis for audiovisual speech integration
Lawrence D. Rosenblum
Sources of variability in the perceptual training of /r/ and /l/: interaction of adjacent vowel, word position, talkers' visual and acoustic cues
Debra M. Hardison
Audiovisual perception in L2 learners
Valerie Hazan, Anke Sennema, Andrew Faulkner
Audiovisual integration of speech by children and adults with cochlear implants
Karen Iler Kirk, David B. Pisoni, Lorin Lachs
Auditory-visual speech perception examined by brain imaging and reaction time
Kaoru Sekiyama, Yoichi Sugita
Neurocognitive basis for audiovisual speech perception: evidence from event-related potentials
Curtis W. Ponton, Edward T. Auer, Lynne E. Bernstein
Perception and integration of audiovisual speech in human infants
David J. Lewkowicz
Seeing tongue movements from outside
Gérard Bailly, Pierre Badin
An audio-visual corpus for multimodal speech recognition in dutch language
Jacek C. Wojdel, Pascal Wiggers, Leon J.M. Rothkrantz
Medium vocabulary continuous audio-visual speech recognition
Pascal Wiggers, Jacek C. Wojdel, Leon J.M. Rothkrantz
DCT-based video features for audio-visual speech recognition
Martin Heckmann, Kristian Kroschel, Christophe Savariaux, Frédéric Berthommier
The effect of auditory-visual information and orthographic background in L2 acquisition
V. Dogu Erdener, Denis K. Burnham
Perceptual evaluation of audiovisual cues for prominence
Emiel Krahmer, Zsófia Ruttkay, Marc Swerts, Wieger Wesselink
Audio-visual scene analysis: evidence for a "very-early" integration process in audio-visual speech perception
Jean-Luc Schwartz, Frédéric Berthommier, Christophe Savariaux
Design of an audio-visual speech corpus for the czech audio-visual speech synthesis
Milos Zelezný, Petr Císar, Zdenek Krnoul, Jan Novák
Coordination of hand and orofacial movements for CV sequences in French cued speech
Virginie Attina, Denis Beautemps, Marie-Agnès Cathiard
Controlling anticipatory behavior for rounding in French cued speech
Virginie Attina, Marie-Agnès Cathiard, Denis Beautemps
Audio-visual speech sources separation: a new approach exploiting the audio-visual coherence of speech stimuli
David Sodoyer, Laurent Girin, Christian Jutten, Jean-Luc Schwartz
Intonational and visual cues in the perception of interrogative mode in Swedish
David House
A link between cepstral shrinking and the weighted product rule in audio-visual speech recognition
Simon Lucey, Sridha Sridharan, Vinod Chandran
Can confidence scores help users post-editing speech recognizer output?
Taku Endo, Nigel Ward, Minoru Terada
Information retrieval based on speech recognition results
Masatoshi Watanabe, Masahide Sugiyama
Efficient combination of type-in and wizard-of-oz tests in speech interface development process
Saija-Maaria Lemmelä, Péter Pál Boda
Probabilistic retrieval based on document representations
Wolfgang Macherey, Jörg Viechtbauer, Hermann Ney
Radiodoc: a voice-accessible document system
Takuya Nishimoto, Masahiro Araki, Yasuhisa Niimi
Speech completion: on-demand completion assistance using filled pauses for speech input interfaces
Masataka Goto, Katunobu Itou, Satoru Hayamizu
Design of system-initiated digressive proposals for automated banking dialogues
Jenny Wilkie, Mervyn A. Jack, Peter Littlewood
Towards every-citizen's speech interface: an application generator for speech interfaces to databases
Arthur R. Toth, Thomas K. Harris, James Sanders, Stefanie Shriver, Roni Rosenfeld
Training topic classifiers for conversational speech with limited data
Rukmini Iyer, Jeffrey Ma, Herbert Gish, Owen Kimball
Comparing isolately spoken keywords with spontaneously spoken queries for Japanese spoken document retrieval
Hiromitsu Nishizaki, Seiichi Nakagawa
Choosing speech or touchtone modality for navigation within a telephony natural language system
Jennifer C. Lai, Kwan Min Lee
Multi-scale and multi-model integration for improved performance in Chinese spoken document retrieval
Wai-Kit Lo, Helen M. Meng, P. C. Ching
Development of a GUI-based articulatory speech synthesis system
Kohichi Ogata, Yorinobu Sonoda
Investigation of coarticulation based on electromagnetic articulographic data
Jianwu Dang, Masaaki Honda, Kiyoshi Honda
Frequency dependence of vocal-tract length
Takuya Niikawa, Takanori Ando, Masafumi Matsumura
Functional modeling of face movements during speech
Shinji Maeda, Martine Toda, Andreas J. Carlen, Lyes Meftahi
Control system for talking robot to replicate articulatory movement of natural speech
Takemi Mochida, Masaaki Honda, Kouki Hayashi, Toshiharu Kuwae, Kunihiro Tanahashi, Kazufumi Nishikawa, Atsuo Takanishi
Feed the tiger: a method for evoking reliable jaw stretch reflexes in children
Donald S. Finan, Anne Smith, Michael Ho
Three-dimensional electromagnetic articulograph based on a nonparametric representation of the magnetic field
Tokihiko Kaburagi, Kohei Wakamiya, Masaaki Honda
Introduction of constraints in an acoustic-to-articulatory inversion method based on a hypercubic articulatory table
Yves Laprie, Slim Ouni
Acoustic-to-articulatory inverse mapping using an HMM-based speech production model
Sadao Hiroya, Masaaki Honda
Modeling articulatory dynamics in autoregressive linear system
Kiyoshi Hashimoto
A study of the two-mass model in terms of acoustic parameters
Denisse Sciamarella, Christophe d'Alessandro
Using time-stretched pulses for accurate splitting of speech utterances played back in noisy reverberant environments
Dorothea Kolossa, Qiang Huo
X-JToBI: an extended J-ToBI for spontaneous speech
Kikuo Maekawa, Hideaki Kikuchi, Yosuke Igarashi, Jennifer Venditti
Dutch HLT resources: from BLARK to priority lists
Helmer Strik, Walter Daelemans, Diana Binnenpoorte, Janienke Sturm, F. De Vriend, Catia Cucchiarini
ACT: a graphical dialogue annotation comparison tool
Fan Yang, Susan E. Strayer, Peter A. Heeman
A training prompts generation algorithm for connected spoken word recognition
Ha-Jin Yu, Jin Suk Kim
A low-resource, miniature implementation of the ETSI distributed speech recognition front-end
Etienne Cornu, Hamid Sheikhzadeh, Robert Brennan
Memory space reduction for hidden Markov models in low-resource speech recognition systems
Sergey Astrov
Low complexity Mandarin speaker-independent isolated word recognition
Xia Wang, Juha Iso-Sipilä
Low complexity techniques for embedded ASR systems
Imre Kiss, Marcel Vasilache
Optimization of hidden Markov models for embedded systems
Klaus Reinhard, Jochen Junkawitsch, Andreas Kießling, Stefan Dobler
Data-driven vector clustering for low-memory footprint ASR
Karim Filali, Xiao Li, Jeff A. Bilmes
Utterance verification based on neighborhood information and Bayes factors
Hui Jiang, Chin-Hui Lee
Vocabulary independent OOV detection using support vector machines
Tommi Lahti, Janne Suontausta
A multi-class approach for modelling out-of-vocabulary words
Issam Bazzi, James Glass
Unconstrained versus constrained acoustic normalisation in confidence scoring
Jacques Duchateau, Patrick Wambacq
Acoustic and word lattice based algorithms for confidence scores
Daniele Falavigna, Roberto Gretter, Giuseppe Riccardi
Error-tolerant spoken language understanding with confidence measuring
Huei-Ming Wang, Yi-Chung Lin
Comparing intelligibility of several non-native accent classes in noise
Shawn A. Weil
Effect of F0 fluctuation and amplitude modulation of natural vowels on vowel identification in noisy environments
Kentaro Ishizuka, Kiyoaki Aikawa
Similarities of words in noise in Japanese
Kiyoko Yoneyama
The effects of F0 manipulation on the perceived distance of speech
Douglas S. Brungart, Alexander J. Kordik, Koel Das, Arnab K. Shaw
Time-compressing natural and synthetic speech
Esther Janse
Accounting for perceptual identification of consonants and vowels through acoustic dissimilarity
Jianxia Xue, Sumiko Takayanagi, Lynne E. Bernstein
Modeling recognition of speech sounds with minerva2
Travis Wade, Deborah K. Eakin, Russell Webb, Arvin Agah, Frank Brown, Allard Jongman, John Gauch, Thomas A. Schreiber, Joan Sereno
Syllable processing in English
Ruth Kearns, Dennis Norris, Anne Cutler
Perceptual effects of assimilation-induced violation of final devoicing in dutch
Cecile Kuijpers, Wilma van Donselaar, Anne Cutler
Access to homophonic meanings during spoken language comprehension: effects of context and neighborhood density
Michael C.W. Yip
Intelligibility of reverse speech in French: a perceptual study
Ivan Magrin-Chagnolleau, Melissa Barkat, Fanny Meunier
Contextual effects in the perception of fricative place of articulation: a rotational hypothesis
Willy Serniclaes, René Carré
What relationship between protrusion anticipation and auditory perception?
Rudolph Sock, Béatrice Vaxelaire, Véronique Hecker, Fabrice Hirsch
On the role of the "schwa" in the perception of plosive consonants
René Carré, Jean Sylvain Liénard, Egidio Marsico, Willy Serniclaes
The perception of stop consonant sequences in dyslexic and normal children
Noël Nguyen, Ludovic Jankowski, Michel Habib
Submoraic awareness by Japanese school children: evidence from a novel game
Takashi Otake, Akemi Iijima
Speaker intelligibility of adults and children
D. Markham, Valerie Hazan
Acoustical correlates to SD ratings of speaker characteristics in two speaking styles
Yasuki Yamashita, Hiroshi Matsumoto
Subjective assessment of frequency bands for perception of speaker identity
Eda Ormanci, U. Hakan Nikbay, Oytun Turk, Levent M. Arslan
Design for a speech-to-speech translator for field use
David Stallard, Premkumar Natarajan, Mohammed Noamany, Richard Schwartz, John Makhoul
Rapid development of speech-to-speech translation systems
Alan W. Black, Ralf D. Brown, Robert Frederking, Kevin Lenzo, John Moody, Alexander I. Rudnicky, Rita Singh, Eric Steinbrecher
Bilingual corpus cleaning focusing on translation literality
Kenji Imamura, Eiichiro Sumita
Speech to speech translation system for monologues-data driven approach
Hideki Tanaka, Stephen Nightingale, Hideki Kashioka, Kenji Matsumoto, Masamchi Nishiwaki, Tadashi Kumano, Takehiko Maruyama
Using x-grams for speech-to-speech translation
Adrià de Gispert, José B. Mariño
Statistical machine translation decoder based on phrase
Taro Watanabe, Eiichiro Sumita
Reliability measures for translation quality
Eiichiro Sumita, Yasuhiro Akiba, Kenji Imamura
Statistical natural language generation for speech-to-speech machine translation systems
Bowen Zhou, Yuqing Gao, Jeffrey Sorensen, Zijian Diao, Michael Picheny
Improving statistical machine translation for a speech-to-speech translation task
Stephan Vogel, Alicia Tribble
Speech-to-speech translation system evaluation: results for French for the NESPOLE! project first showcase
Solange Rossato, Hervé Blanchon, Laurent Besacier
Interlingua based statistical machine translation
Manuel Kauers, Stephan Vogel, Christian Fügen, Alex Waibel
Separation of voiced source characteristics and vocal tract transfer function characteristics for speech sounds by iterative analysis based on AR-HMM model
Nobuyuki Nishizawa, Keikichi Hirose, Nobuaki Minematsu
Automatic extraction of model parameters from fundamental frequency contours of English utterances
Shuichi Narusawa, Nobuaki Minematsu, Keikichi Hirose, Hiroya Fujisaki
Pitch extraction of speech signals using an eigen-based subspace method
Takahiro Murakami, Munehiro Namba, Tetsuya Hoya, Yoshihisa Ishida
Robust fundamental frequency estimation against background noise and spectral distortion
Tomohiro Nakatani, Toshio Irino
2-d processing of speech with application to pitch estimation
Thomas F. Quatieri
Towards automatic closed captioning: low latency real time broadcast news transcription
Murat Saraclar, Michael Riley, Enrico Bocchieri, Vincent Goffin
Automatic transcription of courtroom speech
Rohit Prasad, Long Nguyen, Richard Schwartz, John Makhoul
Japanese broadcast news transcription
Long Nguyen, Xuefeng Guo, Richard Schwartz, John Makhoul
German broadcast news transcription
Robert Hecht, Jürgen Riedler, Gerhard Backfried
Speech recognition with a re-speak method for subtitling live broadcasts
Toru Imai, Atsushi Matsui, Shinichi Homma, Takeshi Kobayakawa, Kazuo Onoe, Shoei Sato, Akio Ando
Evaluation of the method to detect Japanese local speech rate deceleration applying the variable threshold with a constant term
Keiichi Takamaru, Makoto Hiroshige, Kenji Araki, Koji Tochinai
Tempo modulations in English: selected pilot study results
Sandra P. Kirkham
Modeling durational variability in reading aloud a connected text
Caroline L. Smith
Duration modeling for arabic text to speech synthesis
Yasser Hifny, Mohsen Rashwan
Learning syllable duration and intonation of Mandarin Chinese
Oliver Jokisch, Hongwei Ding, Hans Kruschke, Guntram Strecha
Controlling perceived degradation in spectrum envelope modeling via predistortion
Pushkar Patwardhan, Preeti Rao
Benefit and cost analysis of using the improved vector quantizer design algorithm for glottal source waveform compression
Peter Veprek, Alan B. Bradley
Speech coding and transmission for improved automatic recognition
Xin Zhong, Jon A. Arrowood, Mark A. Clements
Coding speech at very low rates using straight and temporal decomposition
Phu Chien Nguyen, Takao Ochi, Masato Akagi
Floating-point adaptive multi-rate wideband speech codec
Toni P. Nieminen
On improving the performance of analysis-by-synthesis coding using a multi-magnitude algebraic code-book excitation signal
Omar Halmi, Hesham Tolba, Driss Guerchi, Douglas O'Shaughnessy
Improved performance speech codec for mobile communications
K. Humphreys, R. Lawlor
Fixed-length segment coding of LSF parameters
Evgeni Yakhnich, Yuval Bistritz
Interaction of voice over internet protocol speech coders and disordered speech samples
Vijay Parsa, Donald G. Jamieson
Speech recognition performance comparison between DSR and AMR transcoded speech
Holly Kelleher, David Pearce, Doug Ealey, Laurent Mauuary
The influence of speech coding on recognition performance in telecommunication networks
Hans-Günter Hirsch
Spectral enhancement preprocessing for the HNM coding of noisy speech
Gautam Moharir, Pushkar Patwardhan, Preeti Rao
Contribution to topic identification by using word similarity
Armelle Brun, Kamel Smaïli, Jean-Paul Haton
SpeechFind: an experimental on-line spoken document retrieval system for historical audio archives
Bowen Zhou, John H. L. Hansen
Topic tracking using subject templates
Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi
Topic detection of an utterance for speech dialogue processing
Katsushi Asami, Toshiyuki Takezawa, Genichiro Kikui
Real-time rich-content transcription of Chinese broadcast news
Daben Liu, Jeffrey Ma, Dongxin Xu, Amit Srivastava, Francis Kubala
Improved Chinese spoken document retrieval with hybrid modeling and data-driven indexing features
Chun-Jen Wang, Berlin Chen, Lin-shan Lee
Exploring sub-word features and linear support vector machines for German spoken document classification
Martha Larson, Stefan Eickeler, Gerhard Paaß, Edda Leopold, Jörg Kindermann
Goal-directed ASR in a multimedia indexing and searching environment (MUMIS)
Mirjam Wester, Judith M. Kessens, Helmer Strik
Confusion-based query expansion for OOV words in spoken document retrieval
Beth Logan, J. M. Van Thong
Cluster identification for speaker-environment tracking
J. T. Wickramaratna, P. C. Woodland
Robust speech/music classification in audio documents
Julien Pinquier, Jean-Luc Rouas, Régine André-Obrecht
Expanded examinations of a low frequency modulation feature for speech/music discrimination
Stefan Karnebäck
Speech, music and songs discrimination in the context of handsets variability
Hassan Ezzaidi, Jean Rouat
Acoustic correlates of task load and stress
Klaus R. Scherer, D. Grandjean, Tom Johnstone, Gudrun Klasmeyer, Thomas Bänziger
Frequency band analysis for stress detection using a Teager energy operator based feature
Mandar A. Rahurkar, John H. L. Hansen, James Meyerhoff, George Saviolakis, Michael Koenig
The acoustic realization of anger, fear, joy and sadness in Chinese
Jiahong Yuan, Liqin Shen, Fangxin Chen
Emotional space improves emotion recognition
Raquel Tato, Rocío Santos, Ralf Kompe, J. M. Pardo
Emotion recognition from textual input using an emotional semantic network
Ze-Jing Chuang, Chung-Hsien Wu
Prosody-based automatic detection of annoyance and frustration in human-computer dialog
Jeremy Ang, Rajdip Dhillon, Ashley Krupski, Elizabeth Shriberg, Andreas Stolcke
RUSLANA: a database of Russian emotional utterances
Veronika Makarova, Valery A. Petrushin
A pragmatic confirmation mechanism for an object-based spoken dialogue manager
Ian M. O'Neill, Michael F. McTear
Serving complex user wishes with an enhanced spoken dialogue system
Sunna Torge, Stefan Rapp, Ralf Kompe
Integrating speech with keypad input for automatic entry of spelling and pronunciation of new words
Grace Chung, Stephanie Seneff
Reference resolution by human partners in a natural interactive problem-solving task
Ellen Campana, Sarah Brown-Schmidt, Michael K. Tanenhaus
Is the speaker done yet? faster and more accurate end-of-utterance detection using prosody
Luciana Ferrer, Elizabeth Shriberg, Andreas Stolcke
Adding intelligent help to mixed-initiative spoken dialogue systems
Genevieve Gorrell, Ian Lewin, Manny Rayner
Analysis of user behavior under error conditions in spoken dialogs
Jongho Shin, Shrikanth S. Narayanan, Laurie Gerber, Abe Kazemzadeh, Dani Byrd
Production based pitch modification of voiced speech
Yinglong Jiang, Peter Murphy
F0 generation for speech synthesis using a multi-tier approach
Xuejing Sun
From text to prosody without ToBI
Volker Strom
Improved corpus-based synthesis of fundamental frequency contours using generation process model
Keikichi Hirose, Masaya Eto, Nobuaki Minematsu
Intonation modelling for the synthesis of structured documents
Jeska Buhmann, Jean-Pierre Martens, Lieve Macken, Bert Van Coile
Applying fallback to prosodic unit selection from a small imitation database
Joram Meron
Clustering and feature learning based F0 prediction for Chinese speech synthesis
Jianhua Tao, Lianhong Cai
Evaluation of formant-like features for ASR
Katrin Weber, Febe de Wet, Bert Cranen, Lou Boves, Samy Bengio, Hervé Bourlard
Entropy of energy operator as feature for large vocabulary Mandarin speaker independent speech recognition
Fadhil H. T. Al-Dulaimy, Zuoying Wang
Improving parametric trajectory modeling by integration of pitch and tone information
Yiyan Zhang, Wenju Liu, Bo Xu, Huayun Zhang
Comparative experiments to evaluate the use of auditory-based acoustic distinctive features and formant cues for automatic speech recognition using a multi-stream paradigm
Hesham Tolba, Sid-Ahmed Selouani, Douglas O'Shaughnessy
Speech recognition using combined acoustic and articulatory information with retraining of acoustic model parameters
Ka-Yee Leung, Manhung Siu
Improved phone recognition on TIMIT using formant frequency data and confidence measures
N. J. Wilkinson, Martin J. Russell
Speaker independent speech recognition using features based on glottal sound source
Norihide Kitaoka, Daisuke Yamada, Seiichi Nakagawa
An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition
Mohamed Kamal Omar, Ken Chen, Mark Hasegawa-Johnson, Yigal Brandman
A flexible stream architecture for ASR using articulatory features
Florian Metze, Alex Waibel
Speech recognition using fundamental frequency and voicing in acoustic modeling
Andrej Ljolje
A comparison of front-end analyses for Thai speech recognition
Montri Karnjanadecha, Patimakorn Kimsawad
New model for speech residual signal shaping with static nonlinearity
Jari Turunen, Juha T. Tanttu, Pekka Loula
Formant model estimation and transformation for voice morphing
Ching-Hsiang Ho, Dimitrios Rentzos, Saeed Vaseghi
Production and perception of pauses and their linguistic context in read and spontaneous speech in Swedish
Beáta Megyesi, Sofia Gustafson-Capková
Non-linear techniques for dysphonic voice analysis and correction
Claudia Manfredi, Lorenzo Matassini
Adaptive estimation of time-varying features from high-pitched speech based on an excitation source HMM
Akira Sasou, Kazuyo Tanaka
Lip gestures in English sibilants: articulatory-acoustic relationship
Martine Toda, Shinji Maeda, Andreas J. Carlen, Lyes Meftahi
Bark resolution from speech data
Naren Malayath, Hynek Hermansky
Noise-robust speech recognition in car environments using genetic algorithms and a mel-cepstral subspace approach
Sid-Ahmed Selouani, Douglas O'Shaughnessy
Modeling with a subspace constraint on inverse covariance matrices
Scott Axelrod, Ramesh Gopinath, Peder Olsen
Improving speech recognition performance of small microphone arrays using missing data techniques
Iain A. McCowan, Andrew C. Morris, Hervé Bourlard
Double the trouble: handling noise and reverberation in far-field automatic speech recognition
David Gelbart, Nelson Morgan
Model-based independent component analysis for robust multi-microphone automatic speech recognition
Laurent Couvreur, Christophe Ris
Compensation of channel effect on line spectrum frequencies
An-Tze Yu, Hsiao-Chuan Wang
Codebook dependent dynamic channel estimation for Mandarin speech recognition over telephone
Huayun Zhang, Zhaobing Han, Bo Xu
Robust multiple resolution analysis for automatic speech recognition
Roberto Gemello, Franco Mana, Paolo Pegoraro, Renato De Mori
HMM-based methods for channel error mitigation in distributed speech recognition
Antonio M. Peinado, Victoria Sánchez, José L. Pérez-Córdoba, José C. Segura, Antonio J. Rubio
Network-based vs. distributed speech recognition in adaptive multi-rate wireless systems
Tim Fingscheidt, Stefanie Aalburg, Sorel Stan, Christophe Beaugeant
Channel noise robustness for low-bitrate remote speech recognition
Alexis Bernard, Abeer Alwan
Influence of transmission errors on ASR systems
C. Peláez-Moreno, A. Gallardo-Antolín, J. Vicente-Peña, F. Díaz-de-María
Robust feature extraction in a variety of input devices on the basis of ETSI standard DSR front-end
Satoru Tsuge, Shingo Kuroiwa, Masami Shishibori, Fuji Ren, Kenji Kita
Channel error protection scheme for distributed speech recognition
Zheng-Hua Tan, Paul Dalsgaard
The effects of speech compression on speech recognition and text-to-speech synthesis
Yeshwant Muthusamy, Yifan Gong, Roshan Gupta
Transform-based feature vector compression for distributed speech recognition
Ben Milner, Xu Shao
Multimodal language processing for mobile information access
Michael Johnston, Srinivas Bangalore, Amanda Stent, Gunaranjan Vasireddy, Patrick Ehlen
SALT: a spoken language interface for web-based multimodal dialog systems
Kuansan Wang
Building VoiceXML-based applications
Christina Bennett, Ariadna Font Llitjós, Stefanie Shriver, Alexander I. Rudnicky, Alan W. Black
Operations for context-based multimodal interpretation in conversational systems
Joyce Chai
A distributed multimodal dialogue system based on dialogue system and web convergence
Feng Liu, Antoine Saad, Li Li, Wu Chou
A modality-independent MMI system architecture
Kouichi Katsurada, Yoshihiko Ootani, Yusaku Nakamura, Satoshi Kobayashi, Hirobumi Yamada, Tsuneo Nitta
An architecture for a multi-modal web browser
Cristiana Armaroli, Ivano Azzini, Lorenza Ferrario, Toni Giorgino, Luca Nardelli, Marco Orlandi, Carla Rognoni
Collecting mobile multimodal data for MATCH
Patrick Ehlen, Michael Johnston, Gunaranjan Vasireddy
ISIS: a multi-modal, trilingual, distributed spoken dialog system developed with CORBA, Java, XML and KQML
Helen M. Meng, P. C. Ching, Yee Fong Wong, Cheong Chat Chan
An acoustic comparison between American English and Australian English vowels
Kimiko Tsukada
A case study of Portuguese and English bilinguality
Luis M.T. Jesus, Christine H. Shadle
An IPA vowel diagram approach to analysing L1 effects on vowel production and perception
Olga I. Dioubina, Hartmut R. Pfitzinger
Phonological norms in Faroese speech synthesis
Pétur Helgason, Sjúrður Gullbein
Studying pronunciation variants in French by using alignment techniques
Philippe Boula de Mareüil, Martine Adda-Decker
Perceived boundary strength
Petra Hansson
Syntax over focus
Sun-Ah Jun
Duration related phase realignment of Thai tones
John J. Ohala, Rungpat Roengpitya
Probabilistic ranking of constraints
Louis ten Bosch
Multi-dimensional analysis of sonority: perception, acoustics, and phonology
Masahiko Komatsu, Shinichi Tokuma, Won Tokuma, Takayuki Arai
On the relevance of bandwidth extension for speaker verification
Marcos Faúndez-Zanuy, Mattias Nilsson, W. Bastiaan Kleijn
Speaker recognition using discriminative features selection
Bogdan Sabac
Designing a speaker-discriminative adaptive filter bank for speaker recognition
Tomi Kinnunen
Divergence-based out-of-class rejection for telephone handset identification
Chi-Leung Tsang, Man-Wai Mak, Sun-Yuan Kung
A handset identifier using support vector machines
Purdy Ho
Towards the question: why has speaking rate such an impact on speech recognition performance?
Robert Faltlhauser, Günther Ruske, M. Thomae
Robust voiced-unvoiced decision associated to continuous pitch tracking in noisy telephone speech
Mijail Arcienega, Andrzej Drygajlo
Noise adaptive speech recognition with acoustic models trained from noisy speech evaluated on Aurora-2 database
Kaisheng Yao, Kuldip K. Paliwal, Satoshi Nakamura
Recognition of noisy speech using normalized moments
Jingdong Chen, Yiteng (Arden) Huang, Qi Li, Frank K. Soong
Low-resource noise-robust feature post-processing on Aurora 2.0
Chia-Ping Chen, Jeff A. Bilmes, Katrin Kirchhoff
Exploiting variances in robust feature extraction based on a parametric model of speech distortion
Li Deng, Jasha Droppo, Alex Acero
Improving performance of an HMM-based ASR system by using monophone-level normalized confidence measure
Muhammad Ghulam, Takashi Fukuda, Takaharu Sato, Tsuneo Nitta
Model partial pronunciation variations for spontaneous Mandarin speech recognition
Yi Liu, Pascale Fung
Reducing pronunciation lexicon confusion and using more data without phonetic transcription for pronunciation modeling
Fang Zheng, Zhanjiang Song, Pascale Fung, William Byrne
Classification error from the theoretical Bayes classification risk
Erik McDermott, Shigeru Katagiri
Combined binary classifiers with applications to speech recognition
Aldebaro Klautau, Nikola Jevtic, Alon Orlitsky
Optimal selection of speech data for automatic speech recognition systems
Arkadiusz Nagórski, Lou Boves, Herman Steeneken
Hypophonia in Parkinson disease: neural correlates of voice treatment with LSVT revealed by PET
Mario Liotti, Lorraine O. Ramig, Deanie Vogel, Pamela New, Chris Cook, Peter Fox
Preliminary data on effects of behavioral and levodopa therapies on speech-accompanying gesture in Parkinson's disease
Susan Duncan
Speech pauses and gestural holds in Parkinson's disease
Francis Quek, Mary Harper, Yonca Haciahmetoglu, Lei Chen, Lorraine O. Ramig
Oro-facial changes in Parkinson's disease following intensive voice therapy (LSVT)
Jennifer L. Spielman, Lorraine O. Ramig, Joan C. Borod
Swallowing and voice effects of Lee Silverman voice treatment (LSVT)
Jeri Logemann, Ralph Sundin, Jean Sundin
Application of the Lee Silverman voice treatment (LSVT) to individuals with multiple sclerosis, ataxic dysarthria, and stroke
Leslie Will, Lorraine O. Ramig, Jennifer L. Spielman
Think big, from voice to limb movement therapy
Becky G. Farley
On the estimation of signal-to-noise ratio in continuous speech for abnormal voices
Vijay Parsa, Donald G. Jamieson, Karen Stenning, Herbert A. Leeper
Computationally efficient method of speech enhancement based on block representation of signal in state space and vector quantization
V. Semenov, A. Kovtonyuk, A. Kalyuzhny
Active speech cancellation for cellular speech
Kazuhiro Kondo, Kiyoshi Nakagawa
Warped-LP residual resampling using DCT for pitch modification
R. Muralishankar, A. G. Ramakrishnan, P. Prathibha
Application of real-time AMDF pitch-detection in a voice gender normalisation system
E. Jung, A. Schwarzbacher, K. Humphreys, R. Lawlor
A copy synthesis method to pilot the Klatt synthesiser
Yves Laprie, Anne Bonneau
Speaker recognizability evaluation of a voicefont-based text-to-speech system
Masaharu Sakamoto, Takashi Saito
Time-frequency transforms and beamforming for speaker recognition
Antonio Satué-Villar, Juan Fernández-Rubio
Speaker change detection using a new weighted distance measure
Soonil Kwon, Shrikanth S. Narayanan
FPGA hardware for speech recognition using hidden Markov models
José L. Gómez-Cipriano, Roger P. Nunes, Dante A. C. Barone
Evaluation of a speech recognition/generation method based on HMM and STRAIGHT
Toshio Irino, Yasuhiro Minami, Tomohiro Nakatani, Minoru Tsuzaki, H. Tagawa
Objective distance measures for spectral discontinuities in concatenative speech synthesis
Jithendra Vepa, Simon King, Paul Taylor
Data-driven segment preselection in the IBM trainable speech synthesis system
Wael Hamza, Robert Donovan
Perpetually optimizing the cost function for unit selection in a TTS system with one single run of MOS evaluation
Hu Peng, Yong Zhao, Min Chu
Information-theoretic criteria for unit selection synthesis
Jon Yi, James Glass
Acoustic measures vs. phonetic features as predictors of audible discontinuity in concatenative speech synthesis
Hisashi Kawai, Minoru Tsuzaki
A study of multi-speaker dialogue system for mobile information retrieval
Hsien-Chang Wang, Chieh-Yi Huang, Chung-Hsien Yang, Jhing-Fa Wang
AT&T help desk
Giuseppe Di Fabbrizio, Dawn Dutton, Narendra K. Gupta, Barbara Hollister, Mazin Rahim, Giuseppe Riccardi, Robert Schapire, Juergen Schroeter
Basurde[lite], a machine-driven dialogue system for accessing railway timetable information
Roger Trias-Sanz, José B. Mariño
Amplitude convergence in children's conversational speech with animated personas
Rachel Coulston, Sharon Oviatt, Courtney Darves
Flexible dialogue management in the Talk'n'Travel system
David Stallard
E-mail goes mobile: the design and implementation of a spoken language interface to e-mail
Daniela Oria, Esa Koskinen
Wizard of Oz evaluation of a dialogue with Communicator system in Chile
Néstor Becerra Yoma, Angela Cortés, Mauricio Hormazábal, Enrique López
A portable, server-side dialog framework for VoiceXML
Bob Carpenter, Sasha Caskey, Krishna Dayanidhi, Caroline Drouin, Roberto Pieraccini
Spoken dialogue system for home health care
S. Takahashi, T. Morimoto, S. Maeda, N. Tsuruta
ACIMET: access to meteorological information by telephone
Jaume Padrell, Javier Hernando
SPIN: language understanding for spoken dialogue systems using a production system approach
Ralf Engel