Rüdiger Hoffmann (2019), "Nothing but a lung, a glottis, and a mouth" - The long way of speech synthesis, HSCR
Peter Donhauser (2019), From speech- and sound research to applications and products, HSCR
Christian Huber, Benjamin Fischer, Bernhard Graf (2019), Corpus of Austrian dialect recordings from the 20th century - A cooperation project, HSCR
Christoph Draxler, Jürgen Trouvain (2019), On principles of phonetic archiving: From paleo-phonetics to modern speech data management, HSCR
Carina Lozo, Jan Luttenberger, Michael Pucher (2019), The thought collective behind thirty years of progress in speech systems, HSCR
Silke Berdux, Alexander Steinbeißer (2019), Speaking apparatus now speaking: A project at the Deutsches Museum in Munich, HSCR
Fabian Brackhane (2019), A 'polyglottal' speech synthesis - Modifications for a replica of Kempelen's speaking machine, HSCR
Takayuki Arai (2019), Sound sources used in speech production research with physical models of the human vocal tract, HSCR
K. S. Nataraj, H. Dasgupta, P. C. Pandey (2019), Early indirect techniques for estimating the vocal tract area function, HSCR
Quintino Lopes, Elisabete Pereira (2019), Armando de Lacerda and experimental phonetics in the inter-war period: Scientific innovation and circulation between Portugal, Germany and Harvard, HSCR
Rainer Jäckel (2019), Methodological aspects of early experimental phonetics, HSCR
Michael Ashby (2019), The acoustic analysis of early speech recordings, HSCR
Angelika Braun (2019), From visible speech to voiceprints - The missing link, HSCR
Rachel Tessmer, Bharath Chandrasekaran (2016), Stability and plasticity in the neural representation of linguistic pitch patterns, TAL
Larry Hyman (2016), Lexical vs. Grammatical Tone: Sorting out the Differences, TAL
Jie Zhang (2016), Using Nonce-probe Tests and Auditory Priming to Investigate Speakers' Phonological Knowledge of Tone Sandhi, TAL
Yu-Fu Chien, Joan Sereno, Jie Zhang (2016), Priming the Representation of Taiwanese Tone Sandhi Words, TAL
Jiang Liu, Jie Zhang (2016), The effects of talker variability and variances on incidental learning of lexical tones, TAL
Jennifer Alexander, Yue Wang (2016), Cross-language Lexical-tone Identification, TAL
Xin Li, René Kager, Wentao Gu (2016), Surface vs. Underlying Listening Strategies for Cross-Language Listeners in the Perception of Sandhied Tones in the Nanjing Dialect, TAL
Frank Kügler (2016), Embedded clauses and recursive prosodic phrasing in Akan, TAL
Laura McPherson (2016), Cyclic spell-out and the interaction of Seenku tonal processes, TAL
James Kirby, D. Robert Ladd (2016), Tone-melody correspondence in Vietnamese popular song, TAL
Qian Luo, Karthik Durvasula, Yen-Hwei Lin (2016), Inconsistent Consonantal Effects on F0 in Cantonese and Mandarin, TAL
Jeremy Perkins, Seunghun Lee, Julián Villegas (2016), An Interplay Between F0 and Phonation in Du'an Zhuang Tone, TAL
Marc Brunelle (2016), Intonational phrase marking in Southern Vietnamese, TAL
Donna Erickson, Ray Iwata, Atsuo Suemitsu (2016), Jaw displacement and phrasal stress in Mandarin Chinese, TAL
Xiaomei Wang, Yen-Hwei Lin (2016), Metrical Structure and Tone Sandhi: Evidence from Ei Tonal Reduction, TAL
Albert Lee, Yi Xu (2016), Effect of speech rate on pre-low raising in Cantonese, TAL
Kathryn Franich (2016), Perception of Tonal Contours in Medʉmba, TAL
Mao-Hsu Chen (2016), Production and Perception of a Tonal Neutralization Case in Taiwan Southern Min, TAL
Wenyi Ling, Amy Schafer (2016), Tone Pair Similarity and the Perception of Mandarin Tones by Mandarin and English Listeners, TAL
Hong Zhang (2016), Boundary effects on allophonic creaky voice: A case study of Mandarin lexical tones, TAL
Katrina Connell, Annie Tremblay, Jie Zhang (2016), The Timing of Acoustic vs. Perceptual Availability of Segmental and Suprasegmental Information, TAL
Zenghui Liu, Hans Van de Velde, Aoju Chen (2016), Prosodic focus marking in Dali Mandarin, TAL
Aaron Carter-Ényi, Quintina Carter-Ényi (2016), Perception of Syntagmatic Tone Intervals in Ìgbò and Yorùbá, TAL
Kalyan Das, Shakuntala Mahanta (2016), Tonal Alignment and Prosodic Word domain in Boro, TAL
Yang Li (2016), Complete and incomplete neutralisation in Fuzhou tone sandhi, TAL
Xiaole Sun (2016), Interaction between breathy tones and aspirated consonants in S'gaw Karen, TAL
Joshua Benn (2016), Consonant-Tone-Phonation Interactions in Guienagati Zapotec, TAL
Xiaoluan Liu, Yi Xu (2016), Pitch perception and surprise in Mandarin Chinese: Evidence for parallel encoding via additive division of pitch range, TAL
Mengyue Wu, Brett Baker, Janet Fletcher, Rikke Bundgaard-Nielsen (2016), How Pitch Moves: Production of Cantonese Tones by Speakers with Different Tonal Experiences, TAL
Xunan Huang, Caicai Zhang, Feng Shi, Nan Yan, Lan Wang (2016), Impaired Vowel Discrimination in Mandarin-speaking Congenital Amusics, TAL
Amalesh Gope, Shakuntala Mahanta (2016), Perception of Lexical Tones in Sylheti, TAL
Jia Tian, Jianjing Kuang (2016), Revisiting the Register Contrast in Shanghai Chinese, TAL
Asim Twaha, Shakuntala Mahanta (2016), Phonetic cues to contrastive focus in Standard Colloquial Assamese, TAL
Wei Lai, Jianjing Kuang (2016), Prosodic grouping in Chinese trisyllabic structures by multiple cues—tone coarticulation, tone sandhi and consonant lenition, TAL
Teresa Proto (2016), Methods of analysis for tonal text-setting. The case study of Fe'Fe' Bamileke, TAL
Una Chow, Stephen Winters (2016), The Role of the Final Tone in Signaling Statements and Questions in Mandarin, TAL
Lee-Feng Chien (2002), Information retrieval techniques for spoken language processing, ISCSLP
Kuansan Wang (2002), Speech recognition, understanding and dialog modeling, ISCSLP
Hsiao-Chuan Wang (2002), Application of speech technology to the assistance of speech and auditory training, ISCSLP
Zuoying Wang, Xi Xiao (2002), The inhomogeneous hidden Markov models and its training and recognition algorithms of speech recognition, ISCSLP
Haizhou Li (2002), Concatenative Chinese speech synthesis and quality evaluation, ISCSLP
Helen Meng (2002), Intelligent speech for information systems (ISIS): a multi-modal, trilingual, distributed conversational system with combined interaction and delegation dialogs, ISCSLP
Jhing-Fa Wang (2002), Challenges and advances in semantic representation and interpretation, ISCSLP
Der-Jenq Liu, Chin-Teng Lin (2002), A generalized common vector approach for robust speaker independent automatic speech recognition, ISCSLP
Yiqing Zu, Yingzhi Chen, Yaxin Zhang, Lei Zhou, Ming Shen, Jingjing Huang (2002), A super phonetic system and multi-dialect Chinese speech corpus for speech recognition, ISCSLP
Xuan Zhu, Rui Wang, Yining Chen, Jia Liu, Run-Sheng Liu (2002), Acoustic model comparison for an embedded phoneme-based Mandarin name dialing system, ISCSLP
Huayun Zhang, Bo Xu, Taiyi Huang (2002), Improving performance of telephone-based Mandarin speech recognition, ISCSLP
Jian Shan, Yuanyuan Shi, Jia Liu, Runsheng Liu (2002), Comparative study of linear feature transformation techniques for Mandarin digit string recognition, ISCSLP
Ming-Shing Yu, Neng-Huang Pan, Ming-Jer Wu (2002), A statistical model with hierarchical structure for predicting prosody in a Mandarin text-to-speech system, ISCSLP
Zhenli Yu, Dongjian Yue, Jian-Cheng Huang (2002), Concatenative Mandarin TTS accommodating isolated English words, ISCSLP
Wei-Chih Kuo, Yih-Ru Wang, Hung-Mao Lu, Sin-Horng Chen (2002), An NN-based approach to prosody generation for English word spelling in English-Chinese bilingual TTS, ISCSLP
Jian-Hua Tao, Sheng Zhao, Lian-Hong Cai (2002), Automatic stress prediction of Chinese speech synthesis, ISCSLP
Jing Li, Mingxing Xu, Wenhu Wu (2002), Study on framework for Chinese pronunciation variation modeling, ISCSLP
Mei-Fang Huang, Kuan-Ting Chen, Hsin-Min Wang (2002), Towards retrieval of video archives based on the speech content, ISCSLP
Lee-Feng Chien, Chien-Chung Huang, Jei-Wen Teng, Shui-Lung Chuang (2002), Automatic taxonomy generation for speech archives, ISCSLP
Chun-Jen Wang, Berlin Chen, Lin-Shan Lee (2002), A data-driven indexing approach for Chinese spoken document retrieval, ISCSLP
Hsien-Chang Wang, Chieh-Yi Huang, Chung-Hsien Yang, Jhing-Fa Wang (2002), Multi-speaker dialogue for mobile information retrieval, ISCSLP
Chih-Hsing Hsu, Miaw-Ru Hsu, Cher-Yao Yang, Sen-Chia Chang (2002), On the construction of a voiceXML voice browser, ISCSLP
Hao-jiang Deng, Li-min Du, Hong-jie Wan (2002), Hybrid text-independent speaker recognition using character-based background HMMs and GMMs for Mandarin speech, ISCSLP
Yih-Ru Wang, Shin-Ming Fan (2002), An improvement of the GMM speaker identification method by using two-state HMM and discriminative training, ISCSLP
Ze-Jing Chuang, Chung-Hsien Wu (2002), Emotion recognition via acoustic features and semantic contents in speech, ISCSLP
Chun-Jen Lee, Jason S. Chang (2002), Rapid prototyping an operator assisted call routing system, ISCSLP
Xavier Menendez-Pidal, Lei Duan, Jingwen Lu, Beatriz Dukes, Mike Emonts, Gustavo Hernandez-Abrego, Lex Olorenshaw (2002), Efficient phone based recognition engines for Chinese and English isolated command applications, ISCSLP
Fadhil H. T. Al-Dulaimy, Zuoying Wang (2002), Time-frequency distributions of spectrum energy operator in large vocabulary Mandarin speaker independent speech recognition system, ISCSLP
Tommy Sheu, Bor-Shen Lin (2002), Dynamic and goal-oriented interaction for multi-modal service agents, ISCSLP
Li-Wei Wang, Zuo-Ying Wang (2002), Testing the hypothesis of multivariate normality in Bayesian approaches to speaker adaptation, ISCSLP
Tieyan Fu, Qixiu Hu, Guangyou Xu (2002), Incorporating probability into support vector machine for speaker recognition, ISCSLP
Guo-Hong Ding, Chengrong Li, Bo Xu (2002), Comparisons of MLLR and CDCN for speech recognition in additive noise by experiments, ISCSLP
Cailian Miao, Yangsheng Wang (2002), The efficient PMC for robust speech recognition in noisy environments, ISCSLP
Xue Wen, Runsheng Liu (2002), Enhancing the stability of speaker verification with compressed templates, ISCSLP
Ching-Tang Hsieh, Chih-Hsu Hsu (2002), Speech detection based on discrete wavelet transform, ISCSLP
Anhong Wang, Shinan Lu, Ming Chen (2002), Pitch declination in the statement sentence in Mandarin, ISCSLP
Yuling Zheng, Huaiqiao Bao (2002), Research on the semivowel by dynamic palatogram in standard Chinese, ISCSLP
Yujia Li, Tan Lee, Yao Qian (2002), Acoustical F0 analysis of continuous Cantonese speech, ISCSLP
Chuan Jia, Bo Xu (2002), An improved entropy-based endpoint detection algorithm, ISCSLP
Ye Tian, Zuoying Wang, Dajin Lu (2002), Robust speech detection with heteroscedastic discriminant analysis applied to the time-frequency energy, ISCSLP
Dong Wang, Xiaoyan Hzu, Ying Liu (2002), A new normalization for MFCC: multi layer strategy and recursive progress, ISCSLP
Li Wang, Xin Lv, Tie-Jun Zhao, Zhan-Yi Liu (2002), A pitch detection algorithm based on special points and area, ISCSLP
Dong Wang, Yi-Ning Chen, Jia Liu (2002), An algorithm for voiced / unvoiced decision and pitch estimation in speech feature extraction, ISCSLP
Shaohui Zhu, Wenju Liu, Bo Xu (2002), Comparison between the spectral estimation techniques by different spectral-distortion measures, ISCSLP
Yi-Yan Zhang, Wen-Ju Liu, Bo Xu (2002), Accuracy improving method for parametric trajectory modeling and its use in A* search, ISCSLP
Zhuo Wang, Peng Ding, Bo Xu (2002), Some issues on the study of vocal tract normalization, ISCSLP
Ching-Tang Hsieh, Eugene Lai, Wan-Chen Chen, You-Chuang Wan (2002), Compact speech features based on wavelet transform and PCA with application to speaker identification, ISCSLP
Cheng-Huang Wu, Yumin Lee, Lin-Shan Lee (2002), Distributed Mandarin speech recognition under wireless environment, ISCSLP
Jyh-Shing Roger Jang, Shiuan-Sung Lin (2002), Optimization of Viterbi beam search in speech recognition, ISCSLP
Jhing-Fa Wang, Shi-Huang Chen (2002), A voice activity detection algorithm based on perceptual wavelet packet transform and Teager energy operator, ISCSLP
Ching-Ta Lu, Hsiao-Chuan Wang (2002), Speech enhancement using wavelet transform with constrained thresholds, ISCSLP
Chuan Jia, Jian Zhang, Bo Xu (2002), Constrained maximum a posteriori approach for speech enhancement, ISCSLP
Kok-Wee Gan, Chi-Yung Wang, Brian Mak (2002), Knowledge-based sense pruning using the hownet: an alternative to word sense disambiguation, ISCSLP
Min Zhang, Cuntai Guan, Haizhou Li (2002), Equivalent node-based speech grammar optimization, ISCSLP
Wen-Jie Cao, Bo Xu, Juha Iso-Sipila (2002), Linguistic and acoustic analysis of Chinese person names, ISCSLP
Bonnie Mok, Helen M. Meng (2002), Improvements on a belief network framework for natural language understanding of domain-specific Chinese queries, ISCSLP
Bo-Xing Chen, Li-Min Du (2002), Automatic construction of English-Chinese translation lexicon from parallel spoken language corpus, ISCSLP
Yifei Zhu, Chengrong Li, Bo Xu (2002), Improvement of the post-processing method for isolated word OOV rejection, ISCSLP
Jin Zhang, Jia Liu, Run-Sheng Liu (2002), Real-time Viterbi searching for practical telephone speech recognition systems, ISCSLP
Zhi-yu Wang, Yuan Wen, Ming Li (2002), Two-pass continuous digit string decoder, ISCSLP
Yi Liu, Pascale Fung (2002), Partial change phone models for pronunciation variations in spontaneous Mandarin speech, ISCSLP
Bin Ma, Cuntai Guan, Haizhou Li (2002), Likelihood probability mismatch analysis and normalization in multilingual speech applications, ISCSLP
Zhenyu Xiong, Mingxing Xu, Wenhu Wu (2002), Comparison and combination of confidence measures in isolated word recognition, ISCSLP
Ping Lv, Zuo-Ying Wang, Da-Jin Lu (2002), Confidence measures for large vocabulary continuous speech recognition, ISCSLP
Xiu Ping Wang, Chuan-Qi Zhu, Zong-Ge Li (2002), A comparative study on wavelet packet based front-end in connected Mandarin digit recognition, ISCSLP
Dali Yang, Mingxing Xu, Wenhu Wu (2002), Study on the strategy for hierarchical speech recognition, ISCSLP
Rui Wang, Xuan Zhu, Yining Chen, Jia Liu, Runsheng Liu (2002), Fast likelihood computation method using block-diagonal covariance matrices in hidden Markov model, ISCSLP
Pui-Fung Wong, Man-Hung Siu (2002), Integration of tone related feature for Mandarin speech recognition by a one-pass search algorithm, ISCSLP
Lifu Yi, Jing Tian, Jingcheng Sun (2002), Applying source-filter model in Chinese speech synthesis, ISCSLP
Zi-Rong Zhang, Min Chu, Eric Chang (2002), An efficient way to learn rules for grapheme-to-phoneme conversion in Chinese, ISCSLP
Hongwei Ding, Oliver Jokisch, Hans Kruschke (2002), Modeling duration and intonation in Mandarin Chinese synthesis with a neural network, ISCSLP
Hung-Yan Gu, Shiue-Jen Li (2002), Hakka pitch-contour parameter generation using a Mandarin-trained pitch-contour model, ISCSLP
Ben-Feng Chen, Guo-Ping Hu, Ren-Hua Wang (2002), Large lexicon construction for TTS system, ISCSLP
Zhen-Hua Ling, Yu Hu, Zhi-Wei Shuang, Ren-Hua Wang (2002), Decision tree based unit pre-selection in Mandarin Chinese synthesis, ISCSLP
Hui Sun, Mingxing Xu, Wenhu Wu (2002), Study on detection of prosodic phrase boundaries in spontaneous speech, ISCSLP
Hao Tang, Bo Yin, Ren-Hua Wang (2002), Design of embedded application oriented distributed speech synthesis system with high naturalness, ISCSLP
Ming Li, Zhiyu Wang, Yuan Wen, Zhen Hou, Tiecheng Yu (2002), A novel approach for pitch modification on time domain, ISCSLP
Minghui Dong, Kim-Teng Lua (2002), Prosodic phrase detection for Chinese TTS using CART and statistical model, ISCSLP
Dan-Ning Jiang, Jian-Hua Tao, Lian-Hong Cai (2002), Voice quality analysis under the pitch effect, ISCSLP
Zheng-Yu Zhou, Jian-Feng Gao, Eric Chang (2002), Improving language modeling by combining heterogeneous corpora, ISCSLP
Bin She, Mingxing Xu, Wenhu Wu (2002), Phoneagent: a conversational interface for telephone exchange system, ISCSLP
Wei-Tek Hsu, Huei-Ming Wang, Yi-Chun Lin (2002), The design of a multi-domain Chinese dialogue system, ISCSLP
Ke-Song Han, Gui-Lin Chen (2002), A spoken dialogue model based on extended lambek calculus, ISCSLP
Xiaojun Wu, Mingxing Xu, Wenhu Wu (2002), Preparing for evaluation of a flight spoken dialogue system, ISCSLP
Guoliang Zhang, Pengju Yan, Mingxing Xu, Wenhu Wu (2002), An automatic speech recognition strategy directed by the semantic knowledge in dialogue system, ISCSLP
Guo-Ping Hu, Ben-Feng Chen, Ren-Hua Wang (2002), Developing Chinese TAK for computer directly, ISCSLP
Yan Zhang, Chengqing Zong, Bo Xu (2002), An approach to automatic identification of Chinese base noun phrases, ISCSLP
Wenjie Cao, Chengqing Zong, Juha Iso-Sipila, Bo Xu (2002), Chinese person name identification based on rules and statistics, ISCSLP
Rile Hu, Chengqing Zong, Juha Iso-Sipila, Bo Xu (2002), Investigation and analysis on designing Chinese balance corpus, ISCSLP
Genqing Wu, Fang Zheng, Wenhu Wu (2002), A compression method used in language modeling for handheld devices, ISCSLP
Xuelin Cheng, Kaizheng Wu, Han Wang, Zongge Li (2002), Spoken language identification using bigram, ISCSLP
Bin Ma, Qiang Huo (2002), A comparative study of several incremental adaptation algorithms for speaker adaptation, ISCSLP
Zhao-Bing Han, Hua-Yun Zhang, Bo Xu (2002), Structure-based compensation using an improved statistical linear approximation for Mandarin speech recognition over telephone, ISCSLP
Jian Wu, Qiang Huo (2002), A comparative study of quickprop and GPD optimization algorithms for MCELR adaptation of CDHMM parameters, ISCSLP
An-Tze Yu, Hsiao-Chuan Wang (2002), Integration of model adaptation and missing feature theory for robust speech recognition, ISCSLP
Wei-Tyng Hong, Ke-Shiu Chen (2002), An investigation on wireless speech recognition by data contamination and robust training techniques, ISCSLP
Tien-Ying Fung, Helen Meng (2002), The effect of tonal context on Cantonese concatenative speech synthesis, ISCSLP
Ling Sun, Wei Lai, Ren-Hua Wang (2002), Face synthesis driven by audio speech input based on HMMs, ISCSLP
Rui Cai, Zhi-Yong Wu, Lian-Hong Cai (2002), Annotation of Chinese prosodic level based on probabilistic model, ISCSLP
Janice Fon (2002), A cross-linguistic study on discourse and syntactic boundary cues in spontaneous speech: using duration as an example, ISCSLP
Huaiqiao Bao, Anhong Wang, Shinan Lu (2002), A study of evaluation method for synthetic Mandarin speech, ISCSLP
Shrikanth Narayanan (2010), Enriching speech engineering, SpeechProsody
Diane Brentari (2010), Sign language prosodic cues in first and second language acquisition, SpeechProsody
Mari Ostendorf (2010), Representations of prosody in computational models for language processing, SpeechProsody
Steven Mithen (2010), The co-evolution of music and language, SpeechProsody
Aniruddh D. Patel (2010), Hidden connections between linguistic and musical melody, SpeechProsody
Tuuli Morrill Adams (2010), Prosodic transfer and phonological learning in a second language fluent speech segmentation task, SpeechProsody
Rachel E. Baker (2010), Non-native perception of native English prominence, SpeechProsody
Elina Banzina, Laura C. Dilley (2010), Context speech rate and duration as cues to native and non-native perception of casually-spoken words in Russian, SpeechProsody
Angela Cooper, Yue Wang (2010), The role of musical experience in Cantonese lexical tone perception by native speakers of Thai, SpeechProsody
Carlos Gussenhoven, Inyang Udofot (2010), Word melodies vs. pitch accents: a perceptual evaluation of terracing contours in British and Nigerian English, SpeechProsody
Marc Swerts, Sabine Zerbian (2010), Prosodic transfer in Black South African English, SpeechProsody
Evangelia Adamou, Amalia Arvaniti (2010), Language-specific and universal patterns in narrow focus marking in Romani, SpeechProsody
Charlotte Alazard, Corine Astésano, Michel Billières (2010), The implicit prosody hypothesis applied to foreign language learning: from oral abilities to reading skills, SpeechProsody
Ayla Bozkurt Applebaum (2010), Perceptual cues to yes/no question intonation in Kabardian, SpeechProsody
Pablo Arantes, Plínio A. Barbosa (2010), Production–perception entrainment in speech rhythm, SpeechProsody
Amalia Arvaniti, Tristie Ross (2010), Rhythm classes and speech perception, SpeechProsody
Lluïsa Astruc, Elinor Payne, Brechtje Post, Pilar Prieto, Maria del Mar Vanrell (2010), Word prosody in early child Catalan, Spanish and English, SpeechProsody
Cyril Auran, Caroline Bouzon (2010), A multi-level approach to speech rate in British English: towards an analysis-by-synthesis method, SpeechProsody
Ladan Baghai-Ravary (2010), Automatic differentiation between accents of native and non-native English, and the significance of prosody, SpeechProsody
Plínio A. Barbosa (2010), Automatic duration-related salience detection in Brazilian Portuguese read and spontaneous speech, SpeechProsody
Matthew Benton (2010), A preliminary analysis of the relationship of speech rate to speech-timing metrics as applied to large corpora of non-laboratory speech in English and Chinese broadcast news, SpeechProsody
Steven Brown, Kyle Weishaar (2010), Speech is "heterometric": the changing rhythms of speech, SpeechProsody
Chun-Mei Chen (2010), Typology of Paiwan interrogative prosody, SpeechProsody
Sally Chen, Janice Fon (2010), A corpus-based study on prosodic grouping and boundary tones in Mandarin learners' English, SpeechProsody
Ivan Chow, Steven Brown, Matthew Poon, Kyle Weishaar (2010), A musical template for phrasal rhythm in spoken Cantonese, SpeechProsody
Ian R. Cushing, Volker Dellwo (2010), The role of speech rhythm in attending to one of two simultaneous speakers, SpeechProsody
Hongwei Ding, Oliver Jokisch, Rüdiger Hoffmann (2010), Perception and production of Mandarin tones by German speakers, SpeechProsody
David Escudero-Mancebo, C. González-Ferreras, Juan María Garrido Almiñana, E. Rodero, Lourdes Aguilar, Antonio Bonafonte (2010), Combining greedy algorithms with expert guided manipulation for the definition of a balanced prosodic Spanish-Catalan radio news corpus, SpeechProsody
Ingo Feldhausen, Christoph Gabriel, Andrea Pešková (2010), Prosodic phrasing in Argentinean Spanish: Buenos Aires and Neuquén, SpeechProsody
Caroline Féry, Gerrit Kentner (2010), The prosody of embedded coordinations in German and Hindi, SpeechProsody
Kieu-Phuong Ha, Martine Grice (2010), Modelling the interaction of intonation and lexical tone in Vietnamese, SpeechProsody
Yunjuan He, Ratree Wayland (2010), The production of Mandarin coarticulated tones by inexperienced and experienced English speakers of Mandarin, SpeechProsody
Fang Hu, Ziyu Xiong (2010), Lhasa tones, SpeechProsody
Toshiko Isei-Jaakkola (2010), Durational variability of vowel quantity boundary for Japanese, Finnish and Czech speakers in perception, SpeechProsody
Anna Kaglik, Philippe Boula de Mareüil (2010), Polish-accented French prosody in perception and production: transfer or universal acquisition process?, SpeechProsody
Okim Kang (2010), Salient prosodic features on judgments of second language accent, SpeechProsody
Catherine Lai, Yanyan Sui, Jiahong Yuan (2010), A corpus study of the prosody of polysyllabic words in Mandarin Chinese, SpeechProsody
Javier Latorre, Sabine Buchholz, Masami Akamine (2010), Usages of an external duration model for HMM-based speech synthesis, SpeechProsody
Adrian Leemann, Beat Siebenhaar (2010), Statistical modeling of F0 and timing of Swiss German dialects, SpeechProsody
Pärtel Lippus (2010), Variation in vowel quality as a feature of Estonian quantity, SpeechProsody
M. Sri Harish Reddy, Bayya Yegnanarayana (2010), Incorporation of excitation source and duration variations in speech synthesized at different speaking rates, SpeechProsody
Alexsandro R. Meireles, João Paulo Tozetti, Rogério R. Borges (2010), Speech rate and rhythmic variation in Brazilian Portuguese, SpeechProsody
Hansjörg Mixdorff, Bistra Andreeva, Jacques Koreman (2010), Quantitative modeling of Norwegian tonal accents in different focus conditions, SpeechProsody
Peggy Pik-Ki Mok, Peggy Wai-Yi Wong (2010), Production of the merging tones in Hong Kong Cantonese: preliminary data on monosyllables, SpeechProsody
Rena Nemoto, Martine Adda-Decker, Jacques Durand (2010), Investigation of lexical F0 and duration patterns in French using large broadcast news speech corpora, SpeechProsody
Shu-chen Ou (2010), Identification and discrimination of word stress by Taiwanese EFL learners, SpeechProsody
Michael L. O'Dell, Tommi Nieminen, Liisa Mustanoja (2010), Assessing rhythmic differences with synchronous speech, SpeechProsody
Pilar Prieto, Maria del Mar Vanrell, Lluïsa Astruc, Elinor Payne, Brechtje Post (2010), Speech rhythm as durational marking of prosodic heads and edges. Evidence from Catalan, English, and Spanish, SpeechProsody
Rosa Giordano, Leandro D'Anna (2010), A comparison of rhythm metrics in different speaking styles and in fifteen regional varieties of Italian, SpeechProsody
Christopher Sappok (2010), The quantitative organization of speech, SpeechProsody
Chilin Shih, Hsin-Yi Dora Lu (2010), Prosody transfer and suppression: stages of tone acquisition, SpeechProsody
Stavros Skopeteas, Caroline Féry (2010), Effect of narrow focus on tonal realization in Georgian, SpeechProsody
Anne Tortel, Daniel Hirst (2010), Rhythm metrics and the production of English L1/L2, SpeechProsody
Christiane Ulbrich (2010), Belfast intonation in L2 speech, SpeechProsody
Riikka Ullakonoja (2010), Pitch contours in Russian yes/no questions by Finns, SpeechProsody
Akira Utsugi, Masatoshi Koizumi, Reiko Mazuka (2010), The perception of non-native lexical pitch accent by speakers of 'accentless' Japanese dialects, SpeechProsody
Maria del Mar Vanrell, Pilar Prieto, Lluïsa Astruc, Elinor Payne, Brechtje Post (2010), Early acquisition of F0 alignment and scaling patterns in Catalan and Spanish, SpeechProsody
Petra Wagner (2010), Two sides of the same coin? Investigating iambic and trochaic timing and prominence in German poetry, SpeechProsody
Xia Wang, Aijun Li, Jia Sun, Yun Mai (2010), Prosodic analysis on English mild imperatives of Chinese EFL learners, SpeechProsody
Xia Wang, Aijun Li, Xiaoli Ji (2010), Perception and production of prominence distribution patterns of Chinese EFL learners, SpeechProsody
E-Chin Wu, Janice Fon (2010), The effect of Min proficiency on the realization of Mandarin tones in Mandarin-Min bilinguals, SpeechProsody
Tae-Jin Yoon (2010), Capturing inter-speaker invariance using statistical measures of rhythm, SpeechProsody
Jiahong Yuan, Yue Jiang, Ziang Song (2010), Perception of foreign accent in spontaneous L2 English speech, SpeechProsody
Vahideh Abolhasani Zadeh, Carlos Gussenhoven, Mahmood Bijankhan (2010), The position of clitics in Persian intonational structure, SpeechProsody
Victoria Zavyalova, Marina Polyanskaya (2010), English rhythmic structure and tone-units perception by the speakers of Chinese, SpeechProsody
Sabine Zerbian, Etienne Barnard (2010), Word-level prosody in Sotho-Tswana, SpeechProsody
Annie C. Gilbert, Victor J. Boucher, Boutheina Jemel (2010), Exploring the rhythmic segmentation of heard speech using evoked potentials, SpeechProsody
Greg Kochanski, Anastassia Loukina, Elinor Keane, Chilin Shih, Burton Rosner (2010), Long-range prosody prediction and rhythm, SpeechProsody
Cyrille Magne, Reyna L. Gordon, Swati Midha (2010), Influence of metrical expectancy on reading words: an ERP study, SpeechProsody
Laurence White, Lukas Wiget, Olesya Rauch, Sven L. Mattys (2010), Segmentation cues in spontaneous and read speech, SpeechProsody
Andreas Hilbert, Hansjörg Mixdorff, Hongwei Ding, Hartmut R. Pfitzinger, Oliver Jokisch (2010), Prosodic analysis of German produced by Russian and Chinese learners, SpeechProsody
Hussein Hussein, Si Wei, Hansjörg Mixdorff, Daniel Külls, Shu Gong, Guoping Hu (2010), Development of a computer-aided language learning system for Mandarin – tone recognition and pronunciation error detection, SpeechProsody
Florian Hönig, Anton Batliner, Karl Weilhammer, Elmar Nöth (2010), Automatic assessment of non-native prosody for English as L2, SpeechProsody
Philippe Martin (2010), Learning the prosodic structure of a foreign language with a pitch visualizer, SpeechProsody
Andrew Rosenberg, Julia Hirschberg, Kim Manis (2010), Perception of English prominence by native Mandarin Chinese speakers, SpeechProsody
Chilin Shih, Hsin-Yi Dora Lu, Lu Sun, Jui-Ting Huang, Jerry Packard (2010), An adaptive training program for tone acquisition, SpeechProsody
Joseph Tepperman, Theban Stanley, Kadri Hacioglu, Bryan Pellom (2010), Testing suprasegmental English through parroting, SpeechProsody
Shuang Zhang, Kun Li, Wai-Kit Lo, Helen Meng (2010), Perception of English suprasegmental features by non-native Chinese learners, SpeechProsody
Jordi Adell, Antonio Bonafonte, David Escudero-Mancebo (2010), Modelling filled pauses prosody to synthesise disfluent speech, SpeechProsody
Jáchym Kolář, Yang Liu (2010), Comparing and combining modeling techniques for sentence segmentation of spoken Czech using textual and prosodic information, SpeechProsody
Valery A. Petrushin, Liliya I. Tsirulnik, Veronika Makarova (2010), Whispered speech prosody modeling for TTS synthesis, SpeechProsody
Alexander Schmitt, Tim Polzehl, Wolfgang Minker (2010), Modeling a-priori likelihoods for angry user turns with hidden Markov models, SpeechProsody
Hanna Silén, Elina Helander, Jani Nurminen, Moncef Gabbouj (2010), Analysis of duration prediction accuracy in HMM-based speech synthesis, SpeechProsody
Diana V. Dimitrova, Laurie A. Stowe, Gisela Redeker, John C. J. Hoeks (2010), Focus particles and prosody processing in Dutch: evidence from ERPs, SpeechProsody
Yong-cheol Lee, Yi Xu (2010), Phonetic realization of contrastive focus in Korean, SpeechProsody
Fang Liu (2010), Single vs. double focus in English statements and yes/no questions, SpeechProsody
Maria O'Reilly, Amelie Dorn, Ailbhe Ní Chasaide (2010), Focus in Donegal Irish (Gaelic) and Donegal English bilinguals, SpeechProsody
Ritva Torppa, Andrew Faulkner, Juhani Järvikivi, Martti Vainio (2010), Acquisition of focus by normal hearing and cochlear implanted children: the role of musical experience, SpeechProsody
Wing Li Wu, Yi Xu (2010), Prosodic focus in Hong Kong Cantonese without post-focus compression, SpeechProsody
Sabine Zerbian, Susanne Genzel, Frank Kügler (2010), Experimental work on prosodically-marked information structure in selected African languages (Afroasiatic and Niger-Congo), SpeechProsody
Juan María Garrido Almiñana (2010), A tool for automatic F0 stylisation, annotation and modelling of large corpora, SpeechProsody
Noam Amir, Hansjörg Mixdorff, Ofer Amir, Daniel Rochman, Gary M. Diamond, Hartmut R. Pfitzinger, Tami Levi-Isserlish, Shira Abramson (2010), Unresolved anger: prosodic analysis and classification of speech from a therapeutic setting, SpeechProsody
Sebastian Andersson, Kallirroi Georgila, David Traum, Matthew Aylett, Robert A. J. Clark (2010), Prediction and realisation of conversational characteristics by utilising spontaneous speech for unit selection, SpeechProsody
Nicolas Audibert, Véronique Aubergé, Albert Rilliard (2010), Prosodic correlates of acted vs. spontaneous discrimination of expressive speech: a pilot study, SpeechProsody
Hossein Behbood, Seyyed Ali Seyyedsalehi, Hamid Reza Tohidypour (2010), A new bidirectional neural network model for the acoustic-articulatory inversion mapping for speech recognition, SpeechProsody
Hossein Behbood, Seyyed Ali SeyyedSalehi, Hamid Reza Tohidypour (2010), A novel feature extraction for neural-based modes in acoustic-articulatory inversion mapping, SpeechProsody
Grégory Beller (2010), Expresso: transformation of expressivity in speech, SpeechProsody
Sylvain Le Beux, Christophe d'Alessandro, Albert Rilliard, Boris Doval (2010), Calliphony: a system for real-time gestural modification of intonation and rhythm, SpeechProsody
Claire Brierley, Eric Atwell (2010), Complex vowels as boundary correlates in a multi-speaker corpus of spontaneous English speech, SpeechProsody
Yu-Lun Chou, Chen-Yu Chiang, Yih-Ru Wang, Hsiu-Min Yu, Sin-Horng Chen (2010), Prosody labeling and modeling for Mandarin spontaneous speech, SpeechProsody
Donna Erickson (2010), Perception by Japanese, Korean and American listeners to a Korean speaker's recollection of past emotional events: some acoustic cues, SpeechProsody
Michele Gubian, Francesco Cangemi, Lou Boves (2010), Automatic and data driven pitch contour manipulation with functional data analysis, SpeechProsody
Jorge Gurlekian, Hansjörg Mixdorff, Diego Evin, Humberto Torres, Hartmut R. Pfitzinger (2010), Alignment of F0 model parameters with final and non-final accents in Argentinean Spanish, SpeechProsody
David Harwath, Mark Hasegawa-Johnson (2010), Phonetic landmark detection for automatic language identification, SpeechProsody
Jui-Ting Huang, Po-Sen Huang, Yoonsook Mo, Mark Hasegawa-Johnson, Jennifer Cole (2010), Prosody-dependent acoustic modeling using variable-parameter hidden Markov models, SpeechProsody
Hussein Hussein, Guntram Strecha, Rüdiger Hoffmann (2010), Resynthesis of prosodic information using the cepstrum vocoder, SpeechProsody
Tommi Jantunen, Markus Koskela, Jorma Laaksonen, Päivi Rainò (2010), Towards the automated visualization and analysis of signed language motion – method and linguistic issues, SpeechProsody
Doina Jitca, Vasile Apopei, Magdalena Jitca (2010), How can a functional perspective be used in intonation modelling?, SpeechProsody
Takayuki Kagomiya, Seiji Nakagawa (2010), An evaluation of bone-conducted ultrasonic hearing aid regarding perception of paralinguistic information, SpeechProsody
Caroline Kaufhold, Elmar Nöth (2010), Using prosodic features for predicting phrase boundaries, SpeechProsody
Tatsuya Kawahara, Zhi-Qiang Chang, Katsuya Takanashi (2010), Analysis on prosodic features of Japanese reactive tokens in poster conversations, SpeechProsody
Kornel Laskowski (2010), A frame-synchronous prosodic decoder for text-independent dialog act recognition, SpeechProsody
Bogdan Ludusan, Antonio Origlia, Francesco Cutugno (2010), Syllable classification using static matrices and prosodic features, SpeechProsody
Dang-Khoa Mac, Véronique Aubergé, Albert Rilliard, Eric Castelli (2010), Cross-cultural perception of Vietnamese audio-visual prosodic attitudes, SpeechProsody
Anna Margolis, Mari Ostendorf, Karen Livescu (2010), Cross-genre training for automatic prosody classification, SpeechProsody
Caroline Menezes, Donna Erickson, Clayton Franks (2010), Comparison between linguistic and affective perception of sad and happy – a cross-linguistic study, SpeechProsody
Nobuaki Minematsu (2010), A modulation-demodulation model of speech communication, SpeechProsody
Donata Moers, Petra Wagner, Bernd Möbius, Filip Müllers, Igor Jauk (2010), Integrating a fast speech corpus in unit selection speech synthesis: experiments on perception, segmentation, and duration prediction, SpeechProsody
Helena Moniz, Fernando Batista, Hugo Meinedo, Alberto Abad, Isabel Trancoso, Ana Isabel Mata, Nuno Mamede (2010), Prosodically-based automatic segmentation and punctuation, SpeechProsody
Carlos Monzo, Angel Calzada, Ignasi Iriondo, Joan Claudi Socoro (2010), Expressive speech style transformation: voice quality and prosody modification using a harmonic plus noise model, SpeechProsody
D. Neiberg, P. Laukka, G. Ananthakrishnan (2010), Classification of affective speech using normalized time-frequency cepstra, SpeechProsody
Raymond W. M. Ng, Cheung-Chi Leung, Tan Lee, Bin Ma, Haizhou Li (2010), An entropy-based approach for comparing prosodic properties in tonal and pitch accent languages, SpeechProsody
Nicolas Obin, Pierre Lanchantin, Mathieu Avanzi, Anne Lacheret-Dujour, Xavier Rodet (2010), Toward improved HMM-based speech synthesis using high-level syntactical features, SpeechProsody
Keiko Ochi, Keikichi Hirose, Nobuaki Minematsu (2010), Realization of prosodic focuses in corpus-based generation of fundamental frequency contours of Japanese based on the generation process model, SpeechProsody
Hiroki Oohashi, Tomoko Ohsuga, Yasuo Horiuchi, Hideaki Kikuchi, Akira Ichikawa (2010), Prosody, supporting real-time conversation, SpeechProsody
Antonio Origlia, Vincenzo Galatà, Bogdan Ludusan (2010), Automatic classification of emotions via global and local prosodic features on a multilingual emotional database, SpeechProsody
Tim Polzehl, Alexander Schmitt, Florian Metze (2010), Approaching multi-lingual emotion recognition from speech - on language dependency of acoustic/prosodic features for anger recognition, SpeechProsody
Heather Pon-Barry, Stuart Shieber (2010), Assessing self-awareness and transparency when classifying a speaker's level of certainty, SpeechProsody
Kishore Prahallad, E. Veera Raghavendra, Alan W. Black (2010), Semi-supervised learning of acoustic driven prosodic phrase breaks for text-to-speech systems, SpeechProsody
S. R. M. Prasanna, D. Govind, K. Sreenivasa Rao, Bayya Yegnanarayana (2010), Fast prosody modification using instants of significant excitation, SpeechProsody
Uwe D. Reichel, Raphael Winkelmann (2010), Removing micromelody from fundamental frequency contours, SpeechProsody
Sophie Roekhaut, Jean-Philippe Goldman, Anne Catherine Simon (2010), A model for varying speaking style in TTS systems, SpeechProsody
Jan Romportl (2010), Automatic prosodic phrase annotation in a corpus for speech synthesis, SpeechProsody
Susanne Schötz, Jonas Beskow, Gösta Bruce, Björn Granström, Joakim Gustafson (2010), Simulating intonation in regional varieties of Swedish, SpeechProsody
Dino Seppi, Anton Batliner, Stefan Steidl, Björn Schuller, Elmar Nöth (2010), Word accent and emotion, SpeechProsody
Mostafa Al Masum Shaikh, Antonio Rui Ferreira Rebordao, Keikichi Hirose (2010), Improving TTS synthesis for emotional expressivity by a prosodic parameterization of affect based on linguistic analysis, SpeechProsody
Stefanie Shattuck-Hufnagel, Pei Lin Ren, Elizabeth Tauscher (2010), Are torso movements during speech timed with intonational phrases?, SpeechProsody
Marie Tahon, Laurence Devillers (2010), Acoustic measures characterizing anger across corpora collected in artificial or natural context, SpeechProsody
Bogdan Vlasenko, Ronald Böck, Andreas Wendemuth (2010), Modeling affected user behavior during human-machine interaction, SpeechProsody
Agnieszka Wagner, Katarzyna Klessa (2010), F0 contour and segmental duration modeling using prosodic features, SpeechProsody
Agnieszka Wagner (2010), Acoustic cues for automatic determination of phrasing, SpeechProsody
Miaomiao Wang, Keikichi Hirose, Nobuaki Minematsu (2010), Generation of fundamental frequency contours of Mandarin in HMM-based speech synthesis using generation process model, SpeechProsody
Gabriel Webster, Sabine Buchholz, Javier Latorre (2010), Automatic feature selection from a large number of features for phone duration prediction, SpeechProsody
Yi Xu, Andrew Kelly (2010), Perception of anger and happiness from resynthesized speech with size-related manipulations, SpeechProsody
Li-chiung Yang (2010), Meaning and context: prosodic variation of interjections in conversational speech, SpeechProsody
Xiaojun Zou, Xiao Bao, Lidong Luo (2010), Integration of intonation in F0 trajectory prediction using MSD-HMMs, SpeechProsody
Benjamin Parrell, Sungbok Lee, Dani Byrd (2010), Evaluation of juncture strength using articulatory synthesis of prosodic gestures and functional data analysis, SpeechProsody
Martha E. Tyrone, Hosung Nam, Elliot Saltzman, Gaurav Mathur, Louis Goldstein (2010), Prosody and movement in American Sign Language: a task-dynamics approach, SpeechProsody
Meghan E. Armstrong (2010), Intonational encoding of pragmatic meaning in Puerto Rican Spanish interrogatives, SpeechProsody
Štefan Beňuš, Katalin Mády (2010), Effects of lexical stress and speech rate on the quantity and quality of Slovak vowels, SpeechProsody
Nicholas C. Henriksen (2010), Nuclear rises and final rises in Manchego peninsular Spanish yes/no questions, SpeechProsody
Izumi Takiguchi, Hajime Takeyasu, Mikio Giriko (2010), Effects of a dynamic F0 on the perceived vowel duration in Japanese, SpeechProsody
Maria del Mar Vanrell, Ignasi Mascaró, Francesc Torres-Tamarit, Pilar Prieto (2010), When intonation plays the main character: information- vs. confirmation-seeking questions in Majorcan Catalan, SpeechProsody
Jonathan Barnes, Nanette Veilleux, Alejna Brugos, Stefanie Shattuck-Hufnagel (2010), The effect of global F0 contour shape on the perception of tonal timing contrasts in American English intonation, SpeechProsody
Joan Borràs-Comes, Maria del Mar Vanrell, Pilar Prieto (2010), The role of pitch range in establishing intonational contrasts in Catalan, SpeechProsody
Ernst Dombrowski, Oliver Niebuhr (2010), Shaping phrase-final rising intonation in German, SpeechProsody
Mariapaola D'Imperio, Barbara Gili Fivela, Oliver Niebuhr (2010), Alignment perception of high intonational plateaux in Italian and German, SpeechProsody
Barbara Gili Fivela, Mariapaola D'Imperio (2010), High peaks versus high plateaux in the identification of two pitch accents in Pisa Italian, SpeechProsody
Caterina Petrone (2010), At the interface between phonetics and pragmatics: non-local F0 effects on the perception of Cosenza Italian tunes, SpeechProsody
Saandia Ali (2010), Analysis by synthesis of tonal alignment patterns in British English, SpeechProsody
Anja Arnhold, Martti Vainio, Antti Suni, Juhani Järvikivi (2010), Intonation of Finnish verbs, SpeechProsody
Mathieu Avanzi, Cédric Gendrot, Anne Lacheret-Dujour (2010), Is there a prosodic difference between left-dislocated and heavy subjects? evidence from spontaneous French, SpeechProsody
Stefan Baumann, Arndt Riester (2010), Annotating information status in spontaneous speech, SpeechProsody
Claire Beyssade, Barbara Hemforth, Jean-Marie Marandin, Cristel Portes (2010), Information focus in French, SpeechProsody
Lisa Brunetti, Mariapaola D'Imperio, Francesco Cangemi (2010), On the prosodic marking of contrast in Romance sentence topic: evidence from Neapolitan Italian, SpeechProsody
Geneviève Caelen-Haumont (2010), F0 prominences (melisms) in French: a deeper insight about morphophonology, SpeechProsody
Luciana Castro, Ben Serridge, João Antônio de Moraes, Myrian Freitas (2010), Characterizing variation in fundamental frequency contours of professional speaking styles, SpeechProsody
Luciana Castro, Myrian Freitas, João Antônio de Moraes, Ben Serridge (2010), Listeners' ability to identify professional speaking styles based on prosodic cues, SpeechProsody
Aoju Chen, Emilie Destruel (2010), Intonational encoding of focus in Toulousian French, SpeechProsody
Candise Chen, Min Wang, Hua Shu, Han Wu, Chu Chu Li (2010), Development of tone sensitivity in young Chinese children, SpeechProsody
Szu-wei Chen, Jane Tsay (2010), Phonetic realization of suffix vs. non-suffix morphemes in Taiwanese, SpeechProsody
Yu-Ying Chuang, Janice Fon (2010), The effect of prosodic prominence on the realizations of voiceless dental and retroflex sibilants in Taiwan Mandarin spontaneous speech, SpeechProsody
Jennifer Cole, Jose I. Hualde, Michael Blasingame, Yoonsook Mo (2010), Shifting Chicago vowels: prosody and sound change, SpeechProsody
Verònica Crespo-Sendra, Maria del Mar Vanrell, Pilar Prieto (2010), Information-seeking questions and incredulity questions: gradient or categorical contrast?, SpeechProsody
Nicole Dehé (2010), The timing of nuclear and prenuclear Icelandic pitch accents, SpeechProsody
Mariapaola D'Imperio, Amandine Michelas (2010), Embedded register levels and prosodic phrasing in French, SpeechProsody
David Escudero-Mancebo, Lourdes Aguilar (2010), Procedure for assessing the reliability of prosodic judgements using sp-TOBI labeling system, SpeechProsody
Letania Ferreira (2010), High initial tones and plateaux in Brazilian Portuguese: implications for stress in Portuguese and Spanish, SpeechProsody
Susanne Genzel, Frank Kügler (2010), The prosodic expression of contrast in Hindi, SpeechProsody
James German, Mariapaola D'Imperio (2010), Focus, phrase length, and the distribution of phrase-initial rises in French, SpeechProsody
Jean-Philippe Goldman, Antoine Auchlin, Sophie Roekhaut, Anne Catherine Simon, Mathieu Avanzi (2010), Prominence perception and accent detection in French: a corpus-based account, SpeechProsody
Anja Gollrad, Esther Sommerfeld, Frank Kügler (2010), Prosodic cue weighting in disambiguation: case ambiguity in German, SpeechProsody
Markus Greif (2010), Contrastive focus in Mandarin Chinese, SpeechProsody
Stella Gryllia, Frank Kügler (2010), What does prosody tell us about relative clause attachments in German?, SpeechProsody
Ran Han, Jeung-Yoon Choi (2010), Analysis of prosodic classes using voice source measurements, SpeechProsody
Nancy Hedberg, Juan M. Sosa, Emrah Görgülü, Morgan Mameni (2010), The prosody and meaning of wh-questions in American English, SpeechProsody
Hans Henrich Hock, Indranil Dutta (2010), Prosody vs. syntax: prosodic rebracketing of final vocatives in English, SpeechProsody
Hae-Sung Jeon, Francis Nolan (2010), Segmentation of the accentual phrase in Seoul Korean, SpeechProsody
Yuan Jia, Aijun Li, Ziyu Xiong (2010), A phonetic and phonological analysis of dual and multiple focuses in standard Chinese, SpeechProsody
Constantijn Kaland, Vincent J. van Heuven (2010), The structure-prosody interface of restrictive and appositive relative clauses in Dutch and German, SpeechProsody
Felicitas Kleber, Oliver Niebuhr (2010), Semantic-context effects on lexical stress and syllable prominence, SpeechProsody
Eneida de Goes Leal, Raquel Santana Santos (2010), Post-tonic syllables and prosodic boundaries in Brazilian Portuguese, SpeechProsody
Suk-Myung Lee, Jeung-Yoon Choi (2010), Analysis of emotion in speech using perceived and automatically extracted prosodic features, SpeechProsody
Heike Lehnert-LeHouillier, Joyce McDonough, Stephen McAleavey (2010), Prosodic strengthening in American English domain-initial vowels, SpeechProsody
Britta Lintfert, Antje Schweitzer, Lukasz Wolski, Bernd Möbius (2010), Quantifying developmental changes of prosodic categories, SpeechProsody
Conxita Lleó, Martin Rakow (2010), Sorting out the phonetics and phonology of intonation: typological and acquisition data, SpeechProsody
Céline De Looze, Daniel Hirst (2010), Integrating changes of register into automatic intonation analysis, SpeechProsody
Luciana Lucente, Plínio A. Barbosa (2010), The role of alignment and height in the perception of LH contours, SpeechProsody
Ana Isabel Mata, Ana Lúcia Santos (2010), On the intonation of confirmation-seeking requests in child-directed speech, SpeechProsody
Amandine Michelas, Mariapaola D'Imperio (2010), Durational cues and prosodic phrasing in French: evidence for the intermediate phrase, SpeechProsody
Yoonsook Mo, Jennifer Cole, Mark Hasegawa-Johnson (2010), Prosodic effects on temporal structure of monosyllabic CVC words in American English, SpeechProsody
Peggy Pik-Ki Mok, Peggy Wai-Yi Wong (2010), Perception of the merging tones in Hong Kong Cantonese: preliminary data on monosyllables, SpeechProsody
Katalin Mády, Felicitas Kleber (2010), Variation of pitch accent patterns in Hungarian, SpeechProsody
Irina Nesterenko, Stephane Rauzy, Roxane Bertrand (2010), Prosody in a corpus of French spontaneous speech: perception, annotation and prosody-syntax interaction, SpeechProsody
Elinor Payne, Brechtje Post, Lluïsa Astruc, Pilar Prieto, Maria del Mar Vanrell (2010), A cross-linguistic study of prosodic lengthening in child-directed speech, SpeechProsody
Santitham Prom-on, Yi Xu (2010), The qTA toolkit for prosody: learning underlying parameters of communicative functions through modeling, SpeechProsody
Christine Tanja Röhr, Stefan Baumann (2010), Prosodic marking of information status in German, SpeechProsody
Renata Savy, Miriam Voghera (2010), A corpus-based study on syntactic and phonetic prosodic phrasing boundaries in spontaneous Italian speech, SpeechProsody
Vered Silber-Varod (2010), Phonological aspects of hesitation disfluencies, SpeechProsody
Alexandra Vella (2010), Asking or not asking in Maltese, that is the question, SpeechProsody
Charlotte Wollermann, Ulrich Schade, Bernhard Fisseni, Bernhard Schröder (2010), Accentuation, uncertainty and exhaustivity – towards a model of pragmatic focus interpretation, SpeechProsody
Chunsheng Yang (2010), Prosodic marking of topic constructions in Mandarin Chinese, SpeechProsody
Chia-Hsin Yeh (2010), Comparison of phonetic naturalness between rising-falling and falling-rising tonal patterns in Taiwan Mandarin, SpeechProsody
Shani H. Abada, Karsten Steinhauer, John E. Drury, Shari R. Baum (2010), Age differences in electrophysiological correlates of cross-modal phrasal interpretation, SpeechProsody
Daniel Alves, Cristina Name (2010), Phonological phrase boundaries restrictions in lexical access by BP adult speakers, SpeechProsody
Bernadette Cardoso, César Reis (2010), The speech prosody of people with stuttering and developmental apraxia: the efficacy of an intervention program, SpeechProsody
Allison Blodgett, Melissa K. Fox, C. Anton Rytting, Alina Twist (2010), Non-contrastive voice quality characteristics of Northern Vietnamese tones, SpeechProsody
Eugene H. Buder, Anne S. Warlaumont, D. Kimbrough Oller, Lesya B. Chorna (2010), Dynamic indicators of mother-infant prosodic and illocutionary coordination, SpeechProsody
Jane Chandlee, Nanette Veilleux (2010), Gestural cues of discourse segmentation, SpeechProsody
Andy Christen, Didier Grandjean (2010), Temporal dynamics of amygdala and orbitofrontal responses to emotional prosody using intracerebral local field potentials in humans, SpeechProsody
Chinar Dara, Marc D. Pell (2010), Hemispheric contributions for processing pitch and speech rate cues to emotion: fMRI data, SpeechProsody
Jeremy Day-O'Connell (2010), “minor third, who?”: the intonation of the knock-knock joke, SpeechProsody
Heeyeon Y. Dennison, Amy J. Schafer (2010), Online construction of implicature through contrastive prosody, SpeechProsody
Janet Fletcher, Deborah Loakes (2010), Interpreting rising intonation in Australian English, SpeechProsody
Sven Grawunder, Bodo Winter (2010), Acoustic correlates of politeness: prosodic and voice quality measures in polite and informal speech of Korean and German speakers, SpeechProsody
Edward Holsinger, David Cheng-Huan Li, Elsi Kaiser, Dani Byrd (2010), Visual grouping and prosodic grouping: effects of spatial information on prosodic boundary strength, SpeechProsody
Carlos T. Ishi, Hiroshi Ishiguro, Norihiro Hagita (2010), Acoustic, electroglottographic and paralinguistic analyses of “rikimi” in expressive speech, SpeechProsody
Khalil Iskarous, Marianne Pouplier, Stefania Marin, Jonathan Harrington (2010), The interaction between prosodic boundaries and accent in the production of sibilants, SpeechProsody
Maciej Karpiński, Ewa Jarmołowicz-Nowikow (2010), Prosodic and gestural features of phrase-internal disfluencies in Polish spontaneous utterances, SpeechProsody
Heejin Kim, Mark Hasegawa-Johnson, Adrienne Perlman (2010), Acoustic cues to lexical stress in spastic dysarthria, SpeechProsody
Inyoung Kim (2010), Stressed and unstressed morphemes in Korean spontaneous speech, SpeechProsody
Inyoung Kim, Catherine Mathon, Georges Boulakia (2010), Rhetorical prosody in French courtroom discourse, SpeechProsody
Deok-Hee Kim-Dufor, Emmanuel Ferragne, Olivier Dufor, Corine Astésano, Jean-Luc Nespoulous (2010), Perception and comprehension of linguistic and affective prosody in children with Landau-Kleffner syndrome, SpeechProsody
Audrey Leclercq, Kathy Huet, Myriam Piccaluga, Bernard Harmegnies (2010), Assessment of prosody disturbances in stutterers by means of phonetic indices, SpeechProsody
Aijun Li, Rushen Shi, Wu Hua (2010), Prosodic cues to noun and verb categories in infant-directed Mandarin speech, SpeechProsody
Aveliny Mantovan Lima-Gregio, Plínio A. Barbosa (2010), Laryngealizations in cleft and non-cleft speech: acoustics and prosodic considerations, SpeechProsody
Zofia Malisz, Maciej Karpiński (2010), Multimodal aspects of positive and negative responses in Polish task-oriented dialogues, SpeechProsody
Philippe Martin (2010), Prosodic structure revisited: a cognitive approach - the example of French, SpeechProsody
João Antônio de Moraes, Albert Rilliard, Bruno Alberto de Oliveira Mota, Takaaki Shochi (2010), Multimodal perception and production of attitudinal meaning in Brazilian Portuguese, SpeechProsody
Oliver Niebuhr, Hartmut R. Pfitzinger (2010), On pitch-accent identification – the role of syllable duration and intensity, SpeechProsody
Benjamin Parrell, Louis Goldstein, Sungbok Lee, Dani Byrd (2010), Articulatory evidence for functional coupling of speech and non-speech motor tasks, SpeechProsody
Aniruddh D. Patel, Yi Xu, Bei Wang (2010), The role of F0 variation in the intelligibility of Mandarin sentences, SpeechProsody
Sona Patel, Klaus R. Scherer, Johan Sundberg, Eva Björkner (2010), Acoustic markers of emotions based on voice physiology, SpeechProsody
Marc D. Pell, Abhishek Jaywant, Laura Monetta, Sonja A. Kotz (2010), The contributions of prosody and semantic context in emotional speech processing, SpeechProsody
Marcela Perrone, Marion Dohen, Hélène Loevenbruck, Marc Sato, Cédric Pichat, Gaëtan Yvert, Monica Baciu (2010), An fMRI study of the perception of contrastive prosodic focus in French, SpeechProsody
Tea Pršir (2010), Reset inclination and prosodic parallelism in expressive speech, SpeechProsody
K. Sreenivasa Rao, Ramu Reddy, Sudhamay Maity, Shashidhar G. Koolagudi (2010), Characterization of emotions using the dynamics of prosodic features, SpeechProsody
Karine Rigaldie, Jean Luc Nespoulous, Nadine Vigouroux (2010), The effect of levodopa on speech in Parkinson's disease: musical scale study, SpeechProsody
Erich R. Round (2010), Tone height binarity and register in intonation: the case from Kayardild (Australian), SpeechProsody
Benjamin Roustan, Marion Dohen (2010), Co-production of contrastive prosodic focus and manual gestures: temporal coordination and effects on the acoustic and articulatory correlates of focus, SpeechProsody
Susanne Schötz, Gösta Bruce (2010), Phrase-initial pitch patterns in South Swedish, SpeechProsody
Chiu-yu Tseng, Zhao-yu Su, Lin-shan Lee (2010), Prosodic patterns of information structure in spoken discourse - a preliminary study of Mandarin spontaneous lecture vs. read speech, SpeechProsody
Soroush Vosoughi, Brandon C. Roy, Michael C. Frank, Deb Roy (2010), Effects of caregiver prosody on child language acquisition, SpeechProsody
Michael Wagner, Serena Crivellaro (2010), Relative prosodic boundary strength and prior bias in disambiguation, SpeechProsody
Michael Wagner, M. Breen, E. Flemming, Stefanie Shattuck-Hufnagel, E. Gibson (2010), Prosodic effects of discourse salience and association with focus, SpeechProsody
Chen-huei Wu, Chilin Shih (2010), Articulatory effort in different speaking rates, SpeechProsody
Nan Xu, Denis Burnham (2010), Tone hyperarticulation and intonation in Cantonese infant directed speech, SpeechProsody
Li-chiung Yang (2010), Harmony and tension in Mandarin Chinese prosody: constraints and opportunities of lexical tones in discourse markers, SpeechProsody
Margaret Zellers, Brechtje Post (2010), Aperiodicity at topic structure boundaries, SpeechProsody
Erin Cvejic, Jeesun Kim, Chris Davis (2010), It's all the same to me: prosodic discrimination across speakers and face areas, SpeechProsody
Eitan Globerson, Michal Lavidor, Ofer Golan, Liat Kishon-Rabin, Noam Amir (2010), Psychoacoustic abilities as predictors of vocal emotion recognition, SpeechProsody
Vikram Ramanarayanan, Dani Byrd, Louis Goldstein, Shrikanth Narayanan (2010), A joint acoustic-articulatory study of nasal spectral reduction in read versus spontaneous speaking styles, SpeechProsody
Takaaki Shochi, Gwenaëlle Gagnié, Albert Rilliard, Donna Erickson, Véronique Aubergé (2010), Learning effect of prosodic social affects for Japanese learners of French language, SpeechProsody
Nigel G. Ward, Alejandro Vega, David G. Novick (2010), Lexico-prosodic anomalies in dialog, SpeechProsody
Diana V. Dimitrova, Laurie A. Stowe, Gisela Redeker, John C. J. Hoeks (2010), ERP correlates of focus accentuation in Dutch, SpeechProsody
Simone Falk, Tamara Rathcke (2010), On the speech-to-song illusion: evidence from German, SpeechProsody
Jun Gao, Rushen Shi, Aijun Li (2010), Categorization of lexical tones in Mandarin-learning infants, SpeechProsody
Jelena Krivokapic (2010), Speech planning and prosodic phrase length, SpeechProsody
Carmen Kung, Dorothee J. Chwilla, Carlos Gussenhoven, Sara Bögels, Herbert Schriefers (2010), What did you say just now, bitterness or wife? an ERP study on the interaction between tone, intonation and context in Cantonese Chinese, SpeechProsody
Amandine Michelas, Mariapaola D'Imperio (2010), Accentual phrase boundaries and lexical access in French, SpeechProsody
Giovanni Abete, Francesco Cutugno, Bogdan Ludusan, Antonio Origlia (2010), Pitch behavior detection for automatic prominence recognition, SpeechProsody
Denis Arnold, Petra Wagner, Bernd Möbius (2010), The effect of priming on the correlations between prominence ratings and acoustic features, SpeechProsody
Mathieu Avanzi, Anne Catherine Simon, Jean-Philippe Goldman, Antoine Auchlin (2010), C-PROM: an annotated corpus for French prominence study, SpeechProsody
Mathieu Avanzi, Anne Lacheret-Dujour, Bernard Victorri (2010), A corpus-based learning method for prominence detection in spontaneous speech, SpeechProsody
Donna Erickson (2010), An articulatory account of rhythm, prominence, and phrasal organization, SpeechProsody
Gero Kunter (2010), Perception of prominence patterns in English nominal compounds, SpeechProsody
Britta Lintfert, Bernd Möbius (2010), Acquisition of syllabic prominence in German speaking children, SpeechProsody
Massimo Moneglia, Tommaso Raso, Maryualê Malvessi-Mittmann, Heliana Mello (2010), Challenging the perceptual relevance of prosodic breaks in multilingual spontaneous speech corpora: C-ORAL-BRASIL / C-ORAL-ROM, SpeechProsody
Philippe Martin (2010), Prominence detection without syllabic segmentation, SpeechProsody
Samer Al Moubayed, G. Ananthakrishnan, Laura Enflo (2010), Automatic prominence classification in Swedish, SpeechProsody
Andrew Rosenberg, Julia Hirschberg (2010), Production of English prominence by native Mandarin Chinese speakers, SpeechProsody
Tae-Jin Yoon (2010), Speaker consistency in the realization of prosodic prominence in the Boston University Radio Speech Corpus, SpeechProsody
Johan Liljencrants, Gunnar Fant, Anita Kruckenberg (2000), Subglottal pressure and prosody in Swedish, ICSLP
Kiyoshi Honda, Shinobu Masaki, Yasuhiro Shimada (2000), Observation of laryngeal control for voicing and pitch change by magnetic resonance imaging technique, ICSLP
Hiroya Fujisaki, Ryou Tomana, Shuichi Narusawa, Sumio Ohno, Changfu Wang (2000), Physiological mechanisms for fundamental frequency control in standard Chinese, ICSLP
René Carré (2000), On vocal tract asymmetry/symmetry, ICSLP
Olov Engwall (2000), Are static MRI measurements representative of dynamic speech? results from a comparative study using MRI, EPG and EMA, ICSLP
Shinan Lu, Lin He, Yufang Yang, Jianfen Cao (2000), Prosodic control in Chinese TTS system, ICSLP
Yuqing Gao, Raimo Bakis, Jing Huang, Bing Xiang (2000), Multistage coarticulation model combining articulatory, formant and cepstral features, ICSLP
Osamu Fujimura (2000), Rhythmic organization and signal characteristics of speech, ICSLP
Sven E. G. Öhman (2000), Oral culture in the 21st century: the case of speech processing, ICSLP
Jintao Jiang, Abeer Alwan, Lynne E. Bernstein, Patricia Keating, Ed Auer (2000), On the correlation between facial movements, tongue movements and speech acoustics, ICSLP
S. P. Whiteside, E. Rixon (2000), Coarticulation patterns in identical twins: an acoustic case study, ICSLP
Philip Hanna, Darryl Stewart, Ji Ming, F. Jack Smith (2000), Improved lexicon formation through removal of co-articulation and acoustic recognition errors, ICSLP
Anders Lindström, Anna Kasaty (2000), A two-level approach to the handling of foreign items in Swedish speech technology applications, ICSLP
Yasuharu Den, Herbert H. Clark (2000), Word repetitions in Japanese spontaneous speech, ICSLP
Allard Jongman, Corinne B. Moore (2000), The role of language experience in speaker and rate normalization processes, ICSLP
Achim F. Müller, Jianhua Tao, Rüdiger Hoffmann (2000), Data-driven importance analysis of linguistic and phonetic information, ICSLP
Hiroya Fujisaki, Katsuhiko Shirai, Shuji Doshita, Seiichi Nakagawa, Keikichi Hirose, Shuichi Itahashi, Tatsuya Kawahara, Sumio Ohno, Hideaki Kikuchi, Kenji Abe, Shinya Kiriyama (2000), Overview of an intelligent system for information retrieval based on human-machine dialogue through spoken language, ICSLP
Li-chiung Yang (2000), The expression and recognition of emotions through prosody, ICSLP
Marc Swerts, Miki Taniguchi, Yasuhiro Katagiri (2000), Prosodic marking of information status in Tokyo Japanese, ICSLP
Britta Wrede, Gernot A. Fink, Gerhard Sagerer (2000), Influence of duration on static and dynamic properties of German vowels in spontaneous speech, ICSLP
Bo Zheng, Bei Wang, Yufang Yang, Shinan Lu, Jianfen Cao (2000), The regular accent in Chinese sentences, ICSLP
Odile Mella, Dominique Fohr, Laurent Martin, Andreas Carlen (2000), A tool for the synchronization of speech and mouth shapes: LIPS, ICSLP
Mohamed-Zakaria Kurdi (2000), Semantic tree unification grammar: a new formalism for spoken language processing, ICSLP
Akira Kurematsu, Yousuke Shionoya (2000), Identification of utterance intention in Japanese spontaneous spoken dialogue by use of prosody and keyword information, ICSLP
Sherif Abdou, Michael Scordilis (2000), Improved speech understanding using dialogue expectation in sentence parsing, ICSLP
Helen M. Meng, Carmen Wai, Roberto Pieraccini (2000), The use of belief networks for mixed-initiative dialog modeling, ICSLP
Michael F. McTear, Susan Allen, Laura Clatworthy, Noelle Ellison, Colin Lavelle, Helen McCaffery (2000), Integrating flexibility into a structured dialogue model: some design considerations, ICSLP
Yasuhisa Niimi, Tomoki Oku, Takuya Nishimoto, Masahiro Araki (2000), A task-independent dialogue controller based on the extended frame-driven method, ICSLP
Wei Xu, Alex Rudnicky (2000), Language modeling for dialog system, ICSLP
Kallirroi Georgila, Nikos Fanotakis, George Kokkinakis (2000), Building stochastic language model networks based on simultaneous word/phrase clustering, ICSLP
Li-chiung Yang, Richard Esposito (2000), Prosody and topic structuring in spoken dialogue, ICSLP
Stéphane H. Maes (2000), Elements of conversational computing - a paradigm shift, ICSLP
Ludek Müller, Filip Jurcicek, Lubos Smidl (2000), Rejection and key-phrase spotting techniques using a mumble model in a Czech telephone dialog system, ICSLP
Tim Paek, Eric Horvitz, Eric Ringger (2000), Continuous listening for unconstrained spoken dialog, ICSLP
Stefanie Shriver, Alan W. Black, Ronald Rosenfeld (2000), Audio signals in speech interfaces, ICSLP
Péter Pál Boda (2000), Visualisation of spoken dialogues, ICSLP
Mary Zajicek (2000), The construction of speech output to support elderly visually impaired users starting to use the internet, ICSLP
Kazuyuki Takagi, Rei Oguro, Kazuhiko Ozeki (2000), Effects of word string language models on noisy broadcast news speech recognition, ICSLP
Xiaoqiang Luo, Martin Franz (2000), Semantic tokenization of verbalized numbers in language modeling, ICSLP
Kazuomi Kato, Hiroaki Nanjo, Tatsuya Kawahara (2000), Automatic transcription of lecture speech using topic-independent language modeling, ICSLP
Rocio Guillén, Randal Erman (2000), Extending grammars based on similar-word recognition, ICSLP
E. W. D. Whittaker, P. C. Woodland (2000), Particle-based language modelling, ICSLP
W. N. Choi, Y. W. Wong, Tan Lee, P. C. Ching (2000), Lexical tree decoding with a class-based language model for Chinese speech recognition, ICSLP
K. Visweswariah, H. Printz, M. Picheny (2000), Impact of bucketing on performance of linearly interpolated language models, ICSLP
Shuwu Zhang, Hirofumi Yamamoto, Yoshinori Sagisaka (2000), An embedded knowledge integration for hybrid language modelling, ICSLP
Lucian Galescu, James Allen (2000), Hierarchical statistical language models: experiments on in-domain adaptation, ICSLP
Hirofumi Yamamoto, Kouichi Tanigaki, Yoshinori Sagisaka (2000), A language model for conversational speech recognition using information designed for speech translation, ICSLP
Bob Carpenter, Sol Lerner, Roberto Pieraccini (2000), Optimizing BNF grammars through source transformations, ICSLP
Jian Wu, Fang Zheng (2000), On enhancing Katz-smoothing based back-off language model, ICSLP
Wei Xu, Alex Rudnicky (2000), Can artificial neural networks learn language models?, ICSLP
Guergana Savova, Michael Schonwetter, Sergey Pakhomov (2000), Improving language model perplexity and recognition accuracy for medical dictations via within-domain interpolation with literal and semi-literal corpora, ICSLP
Karl Weilhammer, Günther Ruske (2000), Placing structuring elements in a word sequence for generating new statistical language models, ICSLP
Yannick Estève, Frédéric Béchet, Renato de Mori (2000), Dynamic selection of language models in a dialogue system, ICSLP
Magne H. Johnsen, Trym Holter, Torbjørn Svendsen, Erik Harborg (2000), Stochastic modeling of semantic content for use in a spoken dialogue system, ICSLP
Tomio Takara, Eiji Nagaki (2000), Spoken word recognition using the artificial evolution of a set of vocabulary, ICSLP
Eric Horvitz, Tim Paek (2000), DeepListener: harnessing expected utility to guide clarification dialog in spoken language systems, ICSLP
Yunbin Deng, Bo Xu, Taiyi Huang (2000), Chinese spoken language understanding across domain, ICSLP
Sven C. Martin, Andreas Kellner, Thomas Portele (2000), Interpolation of stochastic grammar and word bigram models in natural language understanding, ICSLP
Satoru Kogure, Seiichi Nakagawa (2000), A portable development tool for spoken dialogue systems, ICSLP
Yi-Chung Lin, Huei-Ming Wang (2000), Error-tolerant language understanding for spoken dialogue systems, ICSLP
Akinori Ito, Chiori Hori, Masaharu Katoh, Masaki Kohda (2000), Language modeling by stochastic dependency grammar for Japanese speech recognition, ICSLP
Ruiqiang Zhang, Ezra Black, Andrew Finch, Yoshinori Sagisaka (2000), A tagger-aided language model with a stack decoder, ICSLP
Julia Hirschberg, Diane Litman, Marc Swerts (2000), Generalizing prosodic prediction of speech recognition errors, ICSLP
Jerome R. Bellegarda, Kim E. A. Silverman (2000), Toward unconstrained command and control: data-driven semantic inference, ICSLP
Ken Hanazawa, Shinsuke Sakai (2000), Continuous speech recognition with parse filtering, ICSLP
Martine Adda-Decker, Gilles Adda, Lori Lamel (2000), Investigating text normalization and pronunciation variants for German broadcast transcription, ICSLP
Mirjam Wester, Eric Fosler-Lussier (2000), A comparison of data-derived and knowledge-based modeling of pronunciation variation, ICSLP
Judith M. Kessens, Helmer Strik, Catia Cucchiarini (2000), A bottom-up method for obtaining information about pronunciation variation, ICSLP
Jiyong Zhang, Fang Zheng, Mingxing Xu, Ditang Fang (2000), Semi-continuous segmental probability modeling for continuous speech recognition, ICSLP
Christos A. Antoniou, T. Jeff Reynolds (2000), Acoustic modelling using modular/ensemble combinations of heterogeneous neural networks, ICSLP
Hsiao-Wuen Hon, Shankar Kumar, Kuansan Wang (2000), Unifying HMM and phone-pair segment models, ICSLP
Ming Li, Tiecheng Yu (2000), Multi-group mixture weight HMM, ICSLP
Tetsuro Kitazoe, Tomoyuki Ichiki, Makoto Funamori (2000), Application of pattern recognition neural network model to hearing system for continuous speech, ICSLP
Nathan Smith, Mahesan Niranjan (2000), Data-dependent kernels in SVM classification of speech patterns, ICSLP
S. Umesh, Richard C. Rose, S. Parthasarathy (2000), Exploiting frequency-scaling invariance properties of the scale transform for automatic speech recognition, ICSLP
Masahiro Fujimoto, Jun Ogata, Yasuo Ariki (2000), Large vocabulary continuous speech recognition under real environments using adaptive sub-band spectral subtraction, ICSLP
Liang Gu, Kenneth Rose (2000), Perceptual harmonic cepstral coefficients as the front-end for speech recognition, ICSLP
Yik-Cheung Tam, Brian Mak (2000), Optimization of sub-band weights using simulated noisy speech in multi-band speech recognition, ICSLP
Robert Faltlhauser, Thilo Pfau, Günther Ruske (2000), On the use of speaking rate as a generalized feature to improve decision trees, ICSLP
Jun Toyama, Masaru Shimbo (2000), Syllable recognition using glides based on a non-linear transformation, ICSLP
Kemal Sönmez, Madelaine Plauché, Elizabeth Shriberg, Horacio Franco (2000), Consonant discrimination in elicited and spontaneous speech: a case for signal-adaptive front ends in ASR, ICSLP
Khalid Daoudi, Dominique Fohr, Christophe Antoine (2000), A new approach for multi-band speech recognition based on probabilistic graphical models, ICSLP
Hervé Glotin, Frédéric Berthommier (2000), Test of several external posterior weighting functions for multiband full combination ASR, ICSLP
Kanji Okada, Takayuki Arai, Noburu Kanederu, Yasunori Momomura, Yuji Murahara (2000), Using the modulation wavelet transform for feature extraction in automatic speech recognition, ICSLP
Qifeng Zhu, Abeer Alwan (2000), AM-demodulation of speech spectra and its application to noise robust speech recognition, ICSLP
Astrid Hagen, Andrew Morris (2000), Comparison of HMM experts with MLP experts in the full combination multi-band approach to robust ASR, ICSLP
Astrid Hagen, Hervé Bourlard (2000), Using multiple time scales in the framework of multi-stream speech recognition, ICSLP
Hua Yu, Alex Waibel (2000), Streamlining the front end of a speech recognizer, ICSLP
Bhiksha Raj, Michael L. Seltzer, Richard M. Stern (2000), Reconstruction of damaged spectrographic features for robust speech recognition, ICSLP
Janienke Sturm, Hans Kamperman, Lou Boves, Els den Os (2000), Impact of speaking style and speaking task on acoustic models, ICSLP
Shubha Kadambe, Ron Burns (2000), Encoded speech recognition accuracy improvement in adverse environments by enhancing formant spectral bands, ICSLP
Jon Barker, Ljubomir Josifovski, Martin Cooke, Phil Green (2000), Soft decisions in missing data techniques for robust automatic speech recognition, ICSLP
Jian Liu, Tiecheng Yu (2000), New tone recognition methods for Chinese continuous speech, ICSLP
Bo Zhang, Gang Peng, William S.-Y. Wang (2000), Reliable bands guided similarity measure for noise-robust speech recognition, ICSLP
Tsuneo Nitta, Masashi Takigawa, Takashi Fukuda (2000), A novel feature extraction using multiple acoustic feature planes for HMM-based speech recognition, ICSLP
Fang Zheng, Guoliang Zhang (2000), Integrating the energy information into MFCC, ICSLP
Omar Farooq, Sekharjit Datta (2000), Speaker independent phoneme recognition by MLP using wavelet features, ICSLP
Laurent Couvreur, Christophe Couvreur, Christophe Ris (2000), A corpus-based approach for robust ASR in reverberant environments, ICSLP
Issam Bazzi, James R. Glass (2000), Modeling out-of-vocabulary words for robust speech recognition, ICSLP
Bojana Gajic, Richard C. Rose (2000), Hidden Markov model environmental compensation for automatic speech recognition on hand-held mobile devices, ICSLP
Andrew C. Morris, Ljubomir Josifovski, Hervé Bourlard, Martin Cooke, Phil Green (2000), A neural network for classification with incomplete data: application to robust ASR, ICSLP
Shigeki Matsuda, Mitsuru Nakai, Hiroshi Shimodaira, Shigeki Sagayama (2000), Feature-dependent allophone clustering, ICSLP
Qian Yang, Jean-Pierre Martens (2000), Data-driven lexical modeling of pronunciation variations for ASR, ICSLP
Dat Tran, Michael Wagner (2000), Fuzzy entropy hidden Markov models for speech recognition, ICSLP
Carl Quillen (2000), Adjacent node continuous-state HMM’s, ICSLP
Janienke Sturm, Eric Sanders (2000), Modelling phonetic context using head-body-tail models for connected digit recognition, ICSLP
Issam Bazzi, Dina Katabi (2000), Using support vector machines for spoken digit recognition, ICSLP
Jiping Sun, Xing Jing, Li Deng (2000), Data-driven model construction for continuous speech recognition using overlapping articulatory features, ICSLP
Marcel Vasilache (2000), Speech recognition using HMMs with quantized parameters, ICSLP
Yingyong Qi, Jack Xin (2000), A perception and PDE based nonlinear transformation for processing spoken words, ICSLP
Reinhard Blasig, Georg Rose, Carsten Meyer (2000), Training of isolated word recognizers with continuous speech, ICSLP
Shu-Chuan Tseng (2000), Repair patterns in spontaneous Chinese dialogs: morphemes, words, and phrases, ICSLP
Jianwu Dang, Kiyoshi Honda (2000), Improvement of a physiological articulatory model for synthesis of vowel sequences, ICSLP
Kunitoshi Motoki, Xavier Pelorson, Pierre Badin, Hiroki Matsuzaki (2000), Computation of 3-D vocal tract acoustics based on mode-matching technique, ICSLP
Lucie Ménard, Louis-Jean Boë (2000), Exploring vowel production strategies from infant to adult by means of articulatory inversion of formant data, ICSLP
Gavin Smith, Tony Robinson (2000), Segmentation of a speech waveform according to glottal open and closed phases using an autoregressive-HMM, ICSLP
Rosemary Orr, Bert Cranen, Felix de Jong, Lou Boves (2000), Comparison of inverse filtering of the flow signal and microphone signal, ICSLP
Markus R. Iseli, Abeer Alwan (2000), Inter- and intra-speaker variability of glottal flow derivative using the LF model, ICSLP
Philippe Blache, Daniel Hirst (2000), Multi-level annotation for spoken language corpora, ICSLP
Aijun Li, Fang Zheng, William Byrne, Pascale Fung, Terri Kamm, Yi Liu, Zhanjiang Song, Umar Ruhi, Veera Venkataramani, XiaoXia Chen (2000), CASS: a phonetically transcribed corpus of Mandarin spontaneous speech, ICSLP
Kazuhide Yamamoto, Eiichiro Sumita (2000), Multiple decision-tree strategy for input-error robustness: a simulation of tree combinations, ICSLP
Zheng Chen, Kai-Fu Lee, Ming-jing Li (2000), Discriminative training on language model, ICSLP
Jianfeng Gao, Mingjing Li, Kai-Fu Lee (2000), N-gram distribution based language model adaptation, ICSLP
Francisco Palou, P. Bravetti, O. Emam, V. Fischer, Eric Janke (2000), Towards a common phone alphabet for multilingual speech recognition, ICSLP
Robert Belvin, Ron Burns, Cheryl Hein (2000), What's next: a case study in the multidimensionality of a dialog system, ICSLP
Masanobu Higashida, Kumiko Ohmori (2000), A new dialogue control method based on human listening process to construct an interface for ascertaining a user's inputs, ICSLP
XianFang Wang, LiMin Du (2000), Spoken language understanding in a Chinese spoken dialogue system engine, ICSLP
Satya Dharanipragada, Martin Franz, J. Scott McCarley, K. Papineni, Salim Roukos, T. Ward, W.-J. Zhu (2000), Statistical methods for topic segmentation, ICSLP
Berlin Chen, Hsin-min Wang, Lin-shan Lee (2000), Retrieval of Mandarin broadcast news using spoken queries, ICSLP
John H. L. Hansen, Jay Plucienkowski, Stephen Gallant, Bryan Pellom, Wayne Ward (2000), CU-Move: robust speech processing for in-vehicle speech systems, ICSLP
Ji-Hwan Kim, Philip C. Woodland (2000), A rule-based named entity recognition system for speech input, ICSLP
Mohammad Reza Sadigh, Hamid Sheikhzadeh, M. R. Jahangir, Arash Farzan (2000), A rule-based approach to Farsi language text-to-phoneme conversion, ICSLP
Allard Jongman, Yue Wang, Joan Sereno (2000), Acoustic and perceptual properties of English fricatives, ICSLP
Stefanie Shattuck-Hufnagel, Nanette Veilleux (2000), The special phonological characteristics of monosyllabic function words in English, ICSLP
Miren Karmele López de Ipiña, Inés Torres, Lourdes Oñederra, Amparo Varona, Luis Javier Rodríguez (2000), Selection of sublexical units for continuous speech recognition of Basque, ICSLP
Madelaine C. Plauché, Kemal Sönmez (2000), Machine learning techniques for the identification of cues for stop place, ICSLP
Christina Widera (2000), Strategies of vowel reduction - a speaker-dependent phenomenon, ICSLP
Michelle A. Fox (2000), Syllable-final /s/ lenition in the LDC's CALLHOME Spanish corpus, ICSLP
Akira Kurematsu, Takeaki Nakazaki (2000), Meaning extraction based on frame representation for Japanese spoken dialogue, ICSLP
Johanneke Caspers (2000), Pitch accents, boundary tones and turn-taking in Dutch map task dialogues, ICSLP
Yoichi Yamashita, Michiyo Murai (2000), An annotation scheme of spoken dialogues with topic break indexes, ICSLP
Nanette Veilleux (2000), Application of the centering framework in spontaneous dialogues, ICSLP
Hiroki Mori, Hideki Kasuya (2000), Automatic lexicon generation and dialogue modeling for spontaneous speech, ICSLP
Maria Wolters, Hansjörg Mixdorff (2000), Evaluating radio news intonation - autosegmental versus superpositional modelling, ICSLP
Daniele Falavigna, Roberto Gretter, Marco Orlandi (2000), A mixed language model for a dialogue system over the telephone, ICSLP
Linda Bell, Joakim Gustafson (2000), Positive and negative user feedback in a spoken dialogue corpus, ICSLP
Anne Cutler, Mariëtte Koster (2000), Stress and lexical activation in Dutch, ICSLP
Safa Nasser Eldin, Hanna Abdel Nour, Rajouani Abdenbi (2000), Automatic modeling and implementation of intonation for the Arabic language in TTS systems, ICSLP
Venkata Ramana Rao Gadde (2000), Modeling word durations, ICSLP
Jennifer J. Venditti, Jan P. H. van Santen (2000), Japanese intonation synthesis using superposition and linear alignment models, ICSLP
Toshimitsu Minowa, Ryo Mochizuki, Hirofumi Nishimura (2000), Improving the naturalness of synthetic speech by utilizing the prosody of natural speech, ICSLP
Sin-Horng Chen, Chen-Chung Ho (2000), A hybrid statistical/RNN approach to prosody synthesis for Taiwanese TTS, ICSLP
Nobuaki Minematsu, Yukiko Fujisawa, Seiichi Nakagawa (2000), Performance comparison among HMM, DTW, and human abilities in terms of identifying stress patterns of word utterances, ICSLP
Juan Manuel Montero, Ricardo Córdoba, José A. Vallejo, Juana Gutiérrez-Arriola, Emilia Enríquez, Juan Manuel Pardo (2000), Restricted-domain female-voice synthesis in Spanish: from database design to ANN prosodic modeling, ICSLP
Xavier Fernández-Salgado, Eduardo R. Banga (2000), A hierarchical intonation model for synthesising F0 contours in Galician language, ICSLP
Ted H. Applebaum, Nick Kibre, Steve Pearson (2000), Features for F0 contour prediction, ICSLP
Zhenglai Gu, Hiroki Mori, Hideki Kasuya (2000), Prosodic variation of focused syllables of disyllabic word in Mandarin Chinese, ICSLP
Stephen M. Chu, Thomas S. Huang (2000), Automatic head gesture learning and synthesis from prosodic cues, ICSLP
Martti Vainio, Toomas Altosaar, Stefan Werner (2000), Measuring the importance of morphological information for Finnish speech synthesis, ICSLP
Oliver Jokisch, Hansjörg Mixdorff, Hans Kruschke, Ulrich Kordon (2000), Learning the parameters of quantitative prosody models, ICSLP
Shuichi Narusawa, Hiroya Fujisaki, Sumio Ohno (2000), A method for automatic extraction of parameters of the fundamental frequency contour, ICSLP
Tetsuro Kitazoe, Sung-Ill Kim, Yasunari Yoshitomi, Tatsuhiko Ikeda (2000), Recognition of emotional states using voice, face image and thermal image of face, ICSLP
Keiko Watanuki, Susumu Seki, Hideo Miyoshi (2000), Turn taking and multimodal information in two-people dialog, ICSLP
Hamid Reza Abutalebi, Mahmood Bijankhan (2000), Implementation of a text-to-speech system for Farsi language, ICSLP
Richard Huber, Anton Batliner, Jan Buckow, Elmar Nöth, Volker Warnke, Heinrich Niemann (2000), Recognition of emotion in a realistic dialogue scenario, ICSLP
Johanna Barry, Peter Blamey, Kathy Lee, Dilys Cheung (2000), Differentiation in tone production in Cantonese-speaking hearing-impaired children, ICSLP
Martine van Zundert, Jacques Terken (2000), Learning effects for phonetic properties of synthetic speech, ICSLP
Laura Mayfield Tomokiyo, Le Wang, Maxine Eskenazi (2000), An empirical study of the effectiveness of speech-recognition-based pronunciation training, ICSLP
Olivier Deroo, Christophe Ris, Sofie Gielen, Johan Vanparys (2000), Automatic detection of mispronounced phonemes for language learning tools, ICSLP
Horacio Meza Escalona, Ingrid Kirschning, Ofelia Cervantes Villagómez (2000), Estimation of duration models for phonemes in Mexican speech synthesis, ICSLP
Xiaoru Wu, Renhua Wang, Guoping Hu (2000), Special text processing based external descriptor rule, ICSLP
Zhenli Yu, Shangcui Zeng (2000), Articulatory synthesis using a vocal-tract model of variable length, ICSLP
Philippe Boula de Mareüil (2000), Linguistic-prosodic processing for text-to-speech synthesis in Italian, ICSLP
Matthias Eichner, Matthias Wolff, Rüdiger Hoffmann (2000), A unified approach for speech synthesis and speech recognition using stochastic Markov graphs, ICSLP
Andrew Breen, James Salter (2000), Using F0 within a phonologically motivated method of unit selection, ICSLP
Christophe J. Blouin, Paul C. Bagshaw (2000), Analysis of the degradation of French vowels induced by the TD-PSOLA algorithm, in text-to-speech context, ICSLP
Artur Janicki (2000), Automatic construction of acoustic inventory for the concatenative speech synthesis for Polish, ICSLP
Diane Hirschfeld, Matthias Wolff (2000), Universal and multilingual unit selection for DRESS, ICSLP
Davis Pan, Brian Heng, Shiufun Cheung, Ed Chang (2000), Improving speech synthesis for high intelligibility under adverse conditions, ICSLP
Nobuyuki Nishizawa, Nobuaki Minematsu, Keikichi Hirose (2000), Development of a formant-based analysis-synthesis system and generation of high quality liquid sounds of Japanese, ICSLP
Oliver Jokisch, Matthias Eichner (2000), Synthesizing and evaluating an artificial language: Klingon, ICSLP
Craig Olinsky, Alan W. Black (2000), Non-standard word and homograph resolution for Asian language text analysis, ICSLP
Zhang Sen, Katsuhiko Shirai (2000), Re-estimation of LPC coefficients in the sense of l∞ criterion, ICSLP
Sung-Kyo Jung, Yong-Soo Choi, Young-Cheol Park, Dae-Hee Youn (2000), An efficient codebook search algorithm for EVRC, ICSLP
Jong-Kuk Kim, Jeong-Jin Kim, Myung-Jin Bae (2000), The reduction of the search time by the pre-determination of the grid bit in the G.723.1 MP-MLQ, ICSLP
Sebastian Möller, Hervé Bourlard (2000), Real-time telephone transmission simulation for speech recognizer and dialogue system evaluation and improvement, ICSLP
Rathinavelu Chengalvarayan, David L. Thomson (2000), HMM-based echo and announcement modeling approaches for noise suppression avoiding the problem of false triggers, ICSLP
Fangxin Chen (2000), Speaker information enhancement, ICSLP
Hans Dolfing (2000), Exhaustive search for lower-bound error-rates in vocal tract length normalization, ICSLP
Dusan Macho, Climent Nadeu (2000), Use of voicing information to improve the robustness of the spectral parameter set, ICSLP
Kaisheng Yao, Bertram E. Shi, Satoshi Nakamura, Zhigang Cao (2000), Residual noise compensation by a sequential EM algorithm for robust speech recognition in nonstationary noise, ICSLP
Hui Ye, Pascale Fung, Taiyi Huang (2000), Principal mixture speaker adaptation for improved continuous speech recognition, ICSLP
Toomas Altosaar, Martti Vainio (2000), Reduced impedance mismatch in speech database access, ICSLP
Jiapeng Tian, Jouji Miwa (2000), Internet training system for listening and pronunciation of Chinese stop consonants, ICSLP
Carlos Toshinori Ishi, Keikichi Hirose, Nobuaki Minematsu (2000), Identification of Japanese double-mora phonemes considering speaking rate for the use in CALL systems, ICSLP
Roy D. Patterson, Stefan Uppenkamp, Dennis Norris, William Marslen-Wilson, Ingrid Johnsrude, Emma Williams (2000), Phonological processing in the auditory system: a new class of stimuli and advances in fMRI techniques, ICSLP
Itaru F. Tatsumi, Michio Senda, Kenji Ishii, Masahiro Mishina, Masashi Oyama, Hinako Toyama, Keiichi Oda, Masayuki Tanaka, Yasuyuki Gondo (2000), Brain regions responsible for word retrieval, speech production and deficient word fluency in elderly people: a PET activation study, ICSLP
Paavo Alku, Hannu Tiitinen, Kalle J. Palomäki, Päivi Sivonen (2000), MEG-measurements of brain activity reveal the link between human speech production and perception, ICSLP
Karalyn Patterson, Matthew A. Lambon Ralph, Helen Bird, John R. Hodges, James L. McClelland (2000), Normal and impaired processing in quasi-regular domains of language: the case of English past-tense verbs, ICSLP
Nadine Martin, Eleanor M. Saffran, Gary S. Dell, Myrna F. Schwartz, Prahlad Gupta (2000), Neuropsychological and computational evidence for a model of lexical processing, verbal short-term memory and learning, ICSLP
Takao Fushimi, Mutsuo Ijuin, Naoko Sakuma, Masayuki Tanaka, Tadahisa Kondo, Shigeaki Amano, Karalyn Patterson, Itaru F. Tatsumi (2000), Normal and impaired reading of Japanese kanji and kana, ICSLP
Mutsuo Ijuin, Takao Fushimi, Karalyn Patterson, Naoko Sakuma, Masayuki Tanaka, Itaru Tatsumi, Tadahisa Kondo, Shigeaki Amano (2000), A connectionist approach to naming disorders of Japanese in dyslexic patients, ICSLP
Taeko N. Wydell, Takako Shinkai (2000), Impaired pronunciations of kanji words by Japanese CVA patients, ICSLP
Akira Uno, M. Kaneko, N. Haruhara, M. Kaga (2000), Disability of phonological versus visual information processes in Japanese dyslexic children, ICSLP
Xiaolin Zhou, Yanxuan Qu (2000), Lexical tone in the spoken word recognition of Chinese, ICSLP
Xiaolin Zhou, Jie Zhuang (2000), Lexical tone in the speech production of Chinese words, ICSLP
Yu Hu, Qin-Feng Liu, Ren-Hua Wang (2000), Prosody generation in Chinese synthesis using the template of quantified prosodic unit and base intonation contour, ICSLP
Yiqiang Chen, Wen Gao, Tingshao Zhu, Jiyong Ma (2000), Multi-strategy data mining on Mandarin prosodic patterns, ICSLP
Werner Verhelst, Dirk van Compernolle, Patrick Wambacq (2000), A unified view on synchronized overlap-add methods for prosodic modifications of speech, ICSLP
Chilin Shih, Greg P. Kochanski (2000), Chinese tone modeling with Stem-ML, ICSLP
Colin W. Wightman, Ann K. Syrdal, Georg Stemmer, Alistair Conkie, Mark Beutnagel (2000), Perceptually based automatic prosody labeling and prosodically enriched unit selection improve concatenative text-to-speech synthesis, ICSLP
Achim F. Müller, Jianhua Tao, Rüdiger Hoffmann (2000), Data-driven importance analysis of linguistic and phonetic information, ICSLP
Zhiqiang Li, Degif Petros Banksira (2000), Tonal structure of yes-no question intonation in Chaha, ICSLP
Chao Wang, Stephanie Seneff (2000), Improved tone recognition by normalizing for coarticulation and intonation effects, ICSLP
Jin-Song Zhang, Satoshi Nakamura, Keikichi Hirose (2000), Discriminating Chinese lexical tones by anchoring F0 features, ICSLP
Carlos Gussenhoven, Aoju Chen (2000), Universal and language-specific effects in the perception of question intonation, ICSLP
Chiu-Yu Tseng, Da-De Chen (2000), The interplay and interaction between prosody and syntax: evidence from Mandarin Chinese, ICSLP
Hansjörg Mixdorff, Hiroya Fujisaki (2000), A quantitative description of German prosody offering symbolic labels as a by-product, ICSLP
Roni Rosenfeld, Xiaojin Zhu, Arthur Toth, Stefanie Shriver, Kevin Lenzo, Alan W. Black (2000), Towards a universal speech interface, ICSLP
Dale Russell (2000), A domain model centered approach to spoken language dialog systems, ICSLP
Georges Fafiotte, Jian-She Zhai (2000), From multilingual multimodal spoken language acquisition towards on-line assistance to intermittent human interpreting: SIM*, a versatile environment for SLP, ICSLP
Matthias Denecke (2000), Informational characterization of dialogue states, ICSLP
Kenji Abe, Kazushige Kurokawa, Kazunari Taketa, Sumio Ohno, Hiroya Fujisaki (2000), A new method for dialogue management in an intelligent system for information retrieval, ICSLP
Esther Levin, Shrikanth Narayanan, Roberto Pieraccini, Konstantin Biatov, E. Bocchieri, Giuseppe Di Fabbrizio, Wieland Eckert, S. Lee, A. Pokrovsky, Mazin Rahim, P. Ruscitti, M. Walker (2000), The AT&T-DARPA communicator mixed-initiative spoken dialog system, ICSLP
Srinivas Bangalore, Michael Johnston (2000), Integrating multimodal language processing with speech recognition, ICSLP
Alexander I. Rudnicky, Christina Bennett, Alan W. Black, Ananlada Chotimongkol, Kevin Lenzo, Alice Oh, Rita Singh (2000), Task and domain specific modelling in the Carnegie Mellon communicator system, ICSLP
Joakim Gustafson, Linda Bell, Jonas Beskow, Johan Boye, Rolf Carlson, Jens Edlund, Björn Granström, David House, Mats Wirén (2000), Adapt - a multimodal conversational dialogue system in an apartment domain, ICSLP
Kuansan Wang (2000), Implementation of a multimodal dialog system using extended markup languages, ICSLP
Stephanie Seneff, Chian Chuu, D. Scott Cyphers (2000), ORION: from on-line interaction to off-line delegation, ICSLP
Lei Duan, Alexander Franz, Keiko Horiguchi (2000), Practical spoken language translation using compiled feature structure grammars, ICSLP
Helen Meng, Shuk Fong Chan, Yee Fong Wong, Tien Ying Fung, Wai Ching Tsui, Tin Hang Lo, Cheong Chat Chan, Ke Chen, Lan Wang, Ting Yao Wu, Xiaolong Li, Tan Lee, Wing Nin Choi, Yiu Wing Wong, P. C. Ching, Huisheng Chi (2000), ISIS: A multilingual spoken dialog system developed with CORBA and KQML agents, ICSLP
Jun-Ichi Hirasawa, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa (2000), New feature parameters for detecting misunderstandings in a spoken dialogue system, ICSLP
Parham Mokhtari, Frantz Clermont, Kazuyo Tanaka (2000), Toward an acoustic-articulatory model of inter-speaker variability, ICSLP
Pascal Perrier, Joseph Perkell, Yohan Payan, Majid Zandipour, Frank Guenther, Ali Khalighi (2000), Degrees of freedom of tongue movements in speech may be constrained by biomechanics, ICSLP
Béatrice Vaxelaire, Rudolph Sock, Pascal Perrier (2000), Gestural overlap, place of articulation and speech rate - an x-ray investigation, ICSLP
Masaaki Honda, Akinori Fujino (2000), Articulatory compensation and adaptation for unexpected palate shape perturbation, ICSLP
Takuya Niikawa, Masafumi Matsumura, Takashi Tachimura, Takeshi Wada (2000), Modeling of a speech production system based on MRI measurement of three-dimensional vocal tract shapes during fricative consonant phonation, ICSLP
Slim Ouni, Yves Laprie (2000), Improving acoustic-to-articulatory inversion by using hypercube codebooks, ICSLP
Wael M. Hamza, Mohsen A. Rashwan (2000), Concatenative Arabic speech synthesis using large speech database, ICSLP
Dong Chen, Jingming Kuang, Yan Zhang (2000), A new speech classifier based on Yinyang compensatory soft computing theory, ICSLP
Sebastian Möller, Ute Jekosch, Alexander Raake (2000), New models predicting conversational effects of telephone transmission on speech communication quality, ICSLP
Jinyu Li, Xin Luo, Ren-Hua Wang (2000), A novel search algorithm for LSF VQ, ICSLP
Stéphane H. Maes, Dan Chazan, Gilad Cohen, Ron Hoory (2000), Conversational networking: conversational protocols for transport, coding, and control, ICSLP
Hiroshi Ohmura, Akira Sasou, Kazuyo Tanaka (2000), A low bit rate speech coding method using a formant-articulatory parameter nomogram, ICSLP
Ning Li, Derek J. Molyneux, Meau Shin Ho, B. M. G. Cheetham (2000), Variable bit-rate sinusoidal transform coding using variable order spectral estimation, ICSLP
Yong-Soo Choi, Sueng-Kyun Ryu, Young-Cheol Park, Dae-Hee Youn (2000), Efficient harmonic-CELP based hybrid coding of speech at low bit rates, ICSLP
Jesper Jensen, John H. L. Hansen (2000), Speech enhancement based on a constrained sinusoidal model, ICSLP
Sang-Wook Park, Seung-Kyun Ryu, Young-Cheol Park, Dae-Hee Youn (2000), A bark coherence function for perceived speech quality estimation, ICSLP
Jinyu Kiang, Kun Deng, Ronghuai Huang (2000), A high-efficiency scheme for secure speech transmission using spatiotemporal chaos synchronization, ICSLP
Leandro Rodríguez Liñares, Carmen García Mateo (2000), Application of speaker authentication technology to a telephone dialogue system, ICSLP
Michel Dutat, Ivan Magrin-Chagnolleau, Frédéric Bimbot (2000), Language recognition using time-frequency principal component analysis and acoustic modeling, ICSLP
Chularat Tanprasert, Varin Achariyakulporn (2000), Comparative study of GMM, DTW, and ANN on Thai speaker identification system, ICSLP
Ludwig Schwardt, Johan du Preez (2000), Efficient mixed-order hidden Markov model inference, ICSLP
Olivier Thyes, Roland Kuhn, Patrick Nguyen, Jean-Claude Junqua (2000), Speaker identification and verification using eigenvoices, ICSLP
Arun C. Surendran, Chin-Hui Lee (2000), A priori threshold selection for fixed vocabulary speaker verification systems, ICSLP
Qin Jin, Alex Waibel (2000), Application of LDA to speaker recognition, ICSLP
Ludwig Schwardt, Johan du Preez (2000), Automatic language identification using mixed-order HMMs and untranscribed corpora, ICSLP
Johan Lindberg, Mats Blomberg (2000), On the potential threat of using large speech corpora for impostor selection in speaker verification, ICSLP
J. Ortega-Garcia, J. G. Rodriguez, D. T. Merino (2000), Phonetic consistency in Spanish for pin-based speaker verification system, ICSLP
Zhimin Liu, Xihong Wu, Bin Zhen, Huisheng Chi (2000), An auditory feature extraction method based on forward-masking and its application in robust speaker identification and speech recognition, ICSLP
S. Douglas Peters, Matthieu Hébert, Daniel Boies (2000), Transition-oriented hidden Markov models for speaker verification, ICSLP
Pang Kuen Tsoi, Pascale Fung (2000), An LLR-based technique for frame selection for GMM-based text-independent speaker identification, ICSLP
Jiyong Ma, Wen Gao (2000), Robust speaker recognition based on high order cumulant, ICSLP
Luo Si, Qi Xiu Hu (2000), Two-stage speaker identification system based on VQ and NBDGMM, ICSLP
Johnny Mariethoz, Johan Lindberg, Frédéric Bimbot (2000), A MAP approach, with synchronous decoding and unit-based normalization for text-dependent speaker verification, ICSLP
Zhibin Pan, Koji Kotani, Tadahiro Ohmi (2000), A fast search method of speaker identification for large population using pre-selection and hierarchical matching, ICSLP
Lan Wang, Ke Chen, Huisheng Chi (2000), Optimal fusion of diverse feature sets for speaker identification: an alternative method, ICSLP
Upendra V. Chaudhari, Jiri Navrátil, Stéphane H. Maes, Ramesh Gopinath (2000), Transformation enhanced multi-grained modeling for text-independent speaker recognition, ICSLP
Takashi Masuko, Keiichi Tokuda, Takao Kobayashi (2000), Imposture using synthetic speech against speaker verification based on spectrum and pitch, ICSLP
Shahla Parveen, Abdul Qadeer, Phil Green (2000), Speaker recognition with recurrent neural networks, ICSLP
Yoshiroh Itoh, Jun Toyama, Masaru Shimbo (2000), Speaker feature extraction from pitch information based on spectral subtraction for speaker identification, ICSLP
Wei-Ho Tsai, Chiwei Che, Wen-Whei Chang (2000), Text-independent speaker identification using Gaussian mixture bigram models, ICSLP
Hassan Ezzaidi, Jean Rouat (2000), Comparison of MFCC and pitch synchronous AM, FM parameters for speaker identification, ICSLP
Marcos Faúndez-Zanuy, Adam Slupinski (2000), Speaker verification in mismatch training and testing conditions, ICSLP
Toshiaki Uchibe, Shingo Kuroiwa, Norio Higuchi (2000), Determination of threshold for speaker verification using speaker adaptation gain in likelihood during training, ICSLP
Mingkuan Liu, Bo Xu (2000), Accent-specific Mandarin adaptation based on pronunciation modeling technology, ICSLP
Hyun Bok Lee (2000), In search of paralinguistic features, ICSLP
Gunnar Fant, Anita Kruckenberg (2000), A prominence based model of Swedish intonation, ICSLP
Hideki Kasuya, Masanori Yoshizawa, Kikuo Maekawa (2000), Roles of voice source dynamics as a conveyer of paralinguistic features, ICSLP
Kikuo Maekawa, Takayuki Kagomiya (2000), Influence of paralinguistic information on segmental articulation, ICSLP
Sumio Ohno, Yoshimitsu Sugiyama, Hiroya Fujisaki (2000), Analysis and modeling of the effect of paralinguistic information upon the local speech rate, ICSLP
Jianfen Cao (2000), Rhythm of spoken Chinese - linguistic and paralinguistic evidences -, ICSLP
Sanae Eda (2000), Identification and discrimination of syntactically and pragmatically contrasting intonation patterns by native and non-native speakers of standard Japanese, ICSLP
Donna Erickson, Arthur Abramson, Kikuo Maekawa, Tokihiko Kaburagi (2000), Articulatory characteristics of emotional utterances in spoken English, ICSLP
Keikichi Hirose, Nobuaki Minematsu, Hiromichi Kawanami (2000), Analytical and perceptual study on the role of acoustic features in realizing emotional speech, ICSLP
Sylvie J. L. Mozziconacci, Dik J. Hermes (2000), Expression of emotion and attitude through temporal speech variations, ICSLP
Klaus R. Scherer (2000), A cross-cultural investigation of emotion inferences from voice and speech: implications for speech technology, ICSLP
Bong-Seok Kang, Chul-Hee Han, Sang-Tae Lee, Dae-Hee Youn, Chungyong Lee (2000), Speaker dependent emotion recognition using speech signals, ICSLP
Edmilson S. Morais, Paul Taylor, Fábio Violaro (2000), Concatenative text-to-speech synthesis based on prototype waveform interpolation (a time frequency approach), ICSLP
Ren-Hua Wang, Zhongke Ma, Wei Li, Donglai Zhu (2000), A corpus-based Chinese speech synthesis with contextual dependent unit selection, ICSLP
Geert Coorman, Justin Fackrell, Peter Rutten, Bert Van Coile (2000), Segment selection in the L&H Realspeak laboratory TTS system, ICSLP
Ren-yuan Lyu, Zhen-hong Fu, Yuang-chin Chiang, Hui-mei Liu (2000), A Taiwanese (min-nan) text-to-speech (TTS) system based on automatically generated synthetic units, ICSLP
Masayuki Yamada, Yasuo Okutani, Toshiaki Fukada, Takashi Aso, Yasuhiro Komori (2000), Puretalk: a high quality Japanese text-to-speech system, ICSLP
Ka Man Law, Tan Lee (2000), Using cross-syllable units for Cantonese speech synthesis, ICSLP
Alan W. Black, Kevin A. Lenzo (2000), Limited domain synthesis, ICSLP
Christine H. Nakatani, Jennifer Chu-Carroll (2000), Coupling dialogue and prosody computation in spoken dialogue generation, ICSLP
Tomio Takara, Kazuto Izumi, Keiichi Funaki (2000), A study on the pitch pattern of a singing voice synthesis system based on the cepstral method, ICSLP
Steve Pearson, Roland Kuhn, Steven Fincke, Nick Kibre (2000), Automatic methods for lexical stress assignment and syllabification, ICSLP
Olga Goubanova, Paul Taylor (2000), Using Bayesian belief networks for model duration in text-to-speech systems, ICSLP
Diane Hirschfeld (2000), Comparing static and dynamic features for segmental cost function calculation in concatenative speech synthesis, ICSLP
Pratibha Jain, Hynek Hermansky (2000), Temporal patterns of critical-band spectrum for text-to-speech, ICSLP
Eric H. C. Choi, Jianming Song (2000), Successive cohort selection (SCS) for text-independent speaker verification, ICSLP
Dat Tran, Michael Wagner (2000), Fuzzy normalisation methods for speaker verification, ICSLP
Yong Gu, Hans Jongebloed, Dorota Iskra, Els den Os, Lou Boves (2000), Speaker verification in operational environments - monitoring for improved service operation, ICSLP
Larry P. Heck, Nikki Mirghafori (2000), On-line unsupervised adaptation in speaker verification, ICSLP
P. Sivakumaran, A. M. Ariyaeeinia, Jill A. Hewitt (2000), Multiple sub-band systems for speaker verification, ICSLP
Xiaoxing Liu, Baosheng Yuan, Yonghong Yan (2000), An orthogonal GMM based speaker verification system, ICSLP
Qin Jin, Alex Waibel (2000), A naive de-lambing method for speaker identification, ICSLP
Douglas A. Reynolds, R. Bob Dunn, Jack L. McLaughlin (2000), The Lincoln speaker recognition system: NIST eval2000, ICSLP
Aaron E. Rosenberg, S. Parthasarathy, Julia Hirschberg, Stephen Whittaker (2000), Foldering voicemail messages by caller using text independent speaker recognition, ICSLP
Claude Montacié, Marie-José Caraty (2000), Structural framework for combining speaker recognition methods, ICSLP
Walter D. Andrews, Joseph P. Campbell, Douglas A. Reynolds (2000), Bootstrapping for speaker recognition, ICSLP
Bin Zhen, Xihong Wu, Zhimin Liu, Huisheng Chi (2000), On the importance of components of the MFCC in speech and speaker recognition, ICSLP
Thomas F. Quatieri, R. Bob Dunn, Douglas A. Reynolds (2000), On the influence of rate, pitch, and spectrum on automatic speaker recognition performance, ICSLP
Remco Teunen, Ben Shahshahani, Larry Heck (2000), A model-based transformational approach to robust speaker recognition, ICSLP
Amanda Miller-Ockhuizen, Bonny E. Sands (2000), Contrastive lateral clicks and variation in click types, ICSLP
Tomoko Matsui, Masaki Naito, Yoshinori Sagisaka, Kozo Okuda, Satoshi Nakamura (2000), Analysis of acoustic models trained on a large-scale Japanese speech database, ICSLP
Mahmood Bijankhan (2000), Farsi vowel compensatory lengthening: an experimental approach, ICSLP
Yue Wang, Joan A. Sereno, Allard Jongman, Joy Hirsch (2000), Cortical reorganization associated with the acquisition of Mandarin tones by American learners: an fMRI study, ICSLP
S. P. Whiteside, R. A. Varley, T. Phillips, H. Garety (2000), The production of real and non-words in adult stutterers and non-stutterers: an acoustic study, ICSLP
Masaaki Shimizu, Masatake Dantsuji (2000), A new proposal of laryngeal features for the tonal system of Vietnamese, ICSLP
Hong Zhang, Bo Xu, Taiyi Huang (2000), How to choose training set for language modeling, ICSLP
Piero Cosi, John-Paul Hosom (2000), High performance "general purpose" phonetic recognition for Italian, ICSLP
Miren Karmele López de Ipiña, Inés Torres, Lourdes Oñederra, Amparo Varona, N. Ezeiza, M. Peñagarikano, M. Hernandez, Luis Javier Rodriguez (2000), First approach to the selection of lexical units for continuous speech recognition of Basque, ICSLP
David W. Gow Jr. (2000), Assimilation, ambiguity, and the feature parsing problem, ICSLP
Sachin S. Kajarekar, Hynek Hermansky (2000), Optimization of units for continuous-digit recognition task, ICSLP
Ioana Vasilescu, Francois Pellegrino, Jean-Marie Hombert (2000), Perceptual features for the identification of Romance languages, ICSLP
Dawn M. Behne, Peter E. Czigler, Kirk P. H. Sullivan (2000), Perception of Swedish vowel quantity: tracing late stages of development, ICSLP
Ananlada Chotimongkol, Alan W. Black (2000), Statistically trained orthographic to sound models for Thai, ICSLP
Janice Fon, Keith Johnson (2000), Speech timing patterning as an indicator of discourse and syntactic boundaries, ICSLP
Amalia Arvaniti, Georgios Tserdanelis (2000), On the phonetics of geminates: evidence from Cypriot Greek, ICSLP
Hanny den Ouden, Carel van Wijk, Marc Swerts (2000), A simple procedure to clarify the relation between text and prosody, ICSLP
Kimiko Tsukada (2000), Effects of consonantal voicing on English diphthongs: a comparison of L1 and L2 production, ICSLP
Nigel Ward (2000), The challenge of non-lexical speech sounds, ICSLP
Yousif A. El-Imam (2000), A method to synthesize Arabic from short phonetic, ICSLP
Mauricio C. Schramm, Luis Felipe R. Freitas, Adriano Zanuz, Dante Barone (2000), A Brazilian Portuguese language corpus development, ICSLP
C. Colin, Monique Radeau, Didier Demolin, A. Soquet (2000), Visual lipreading of voicing for French stop consonants, ICSLP
Yang Chen, Michael Robb (2000), Acoustic features of vowel production in Mandarin speakers of English, ICSLP
Robert Belvin, Ron Burns, Cheryl Hein (2000), Spoken language navigation systems for drivers, ICSLP
Fang Chen, Baozong Yuan (2000), An approach to intelligent Chinese dialogue system, ICSLP
Huei-Ming Wang, Yi-Chung Lin (2000), Goal-oriented table-driven design for dialogue manager, ICSLP
Alexandros Potamianos, Egbert Ammicht, Hong-Kwang J. Kuo (2000), Dialogue management in the Bell Labs communicator system, ICSLP
Jiang Han, Yong Wang (2000), Dialogue management based on a hierarchical task structure, ICSLP
Johanneke Caspers (2000), Melodic characteristics of backchannels in Dutch map task dialogues, ICSLP
Marc Swerts, Diane Litman, Julia Hirschberg (2000), Corrections in spoken dialogue systems, ICSLP
John Fry (2000), F0 correlates of topic and subject in spontaneous Japanese speech, ICSLP
Mutsuko Tomokiyo, Solange Hollard (2000), Specification of communicative acts of utterances based on dialogue corpus analysis, ICSLP
Hiroaki Noguchi, Yasuhiro Katagiri, Yasuharu Den (2000), An experimental verification of the prosodic/lexical effects on the occurrence of backchannels, ICSLP
Tsutomu Sato, John A. Maidment (2000), The acoustic characteristics of Japanese identical vowel sequences in connected speech, ICSLP
Shrikanth Narayanan, Giuseppe Di Fabbrizio, C. Kamm, James Hubbell, B. Buntschuh, P. Ruscitti, Jerry H. Wright (2000), Effects of dialog initiative and multi-modal presentation strategies on large directory information access, ICSLP
William Thompson, Harry Bliss (2000), A declarative framework for building compositional dialog modules, ICSLP
Kuansan Wang (2000), A plan-based dialog system with probabilistic inferences, ICSLP
Kazunori Komatani, Tatsuya Kawahara (2000), Generating effective confirmation and guidance using two-level confidence measures for dialogue systems, ICSLP
Nikko Ström, Stephanie Seneff (2000), Intelligent barge-in in conversational systems, ICSLP
Andrew Breen, Barry Eggleton, Gavin Churcher, Paul Deans, Simon Downey (2000), A system for the research into multi-modal man-machine communication within a virtual environment, ICSLP
Fabio Brugnara, Mauro Cettolo, Marcello Federico, Diego Giuliani (2000), Advances in automatic transcription of Italian broadcast news, ICSLP
Shui-Lung Chuang, Hsiao-Tieh Pu, Wen-Hsiang Lu, Lee-Feng Chien (2000), Live thesaurus construction for interactive voice-based web search, ICSLP
Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi (2000), Selecting TV news stories and newswire articles related to a target article of newswire using SVM, ICSLP
Kenney Ng (2000), Towards an integrated approach for spoken document retrieval, ICSLP
Beth Logan, Pedro Moreno, Jean-Manuel van Thong, Ed Whittaker (2000), An experimental study of an audio indexing system for the web, ICSLP
Rong Jin, Alex G. Hauptmann (2000), Title generation for spoken broadcast news using a training corpus, ICSLP
Manfred Weber, Thomas Kemp (2000), Evaluating different information retrieval algorithms on real-world data, ICSLP
Konstantinos Koumpis, Steve Renals (2000), Transcription and summarization of voicemail speech, ICSLP
W. C. Tsai, Y. C. Chu (2000), Robust rejection for embedded systems, ICSLP
Sharon Oviatt (2000), Multimodal signal processing in naturalistic noisy environments, ICSLP
Joyce Chai, Sylvie Levesque, Margorzata Budzikowska, Veronika Horvath, Nanda Kambhatla, Nicolas Nicolov, Wlodek Zadrozny (2000), A multi-modal dialog system for business transactions, ICSLP
Jiang Han, Yonghong Yan, Zhiwei Lin, Yong Wang, Jian Liu, Danjun Liu, Zhihui Wang (2000), Office message center - a spoken dialogue system, ICSLP
Noboru Miyazaki, Jun-ichi Hirasawa, Mikio Nakano, Kiyoaki Aikawa (2000), A new method for understanding sequences of utterances by multiple speakers, ICSLP
Hideaki Kikuchi, Katsuhiko Shirai (2000), Improvement of dialogue efficiency by dialogue control model according to performance of processes, ICSLP
C. Wang, D. Scott Cyphers, Xiaolong Mou, Joseph Polifroni, Stephanie Seneff, J. Yi, Victor Zue (2000), MUXING: a telephone-access Mandarin conversational system, ICSLP
Markku Turunen, Jaakko Hakulinen (2000), Jaspis - a framework for multilingual adaptive speech applications, ICSLP
Bryan Pellom, Wayne Ward, Sameer Pradhan (2000), The CU communicator: an architecture for dialogue systems, ICSLP
Vildan Bilici, Emiel Krahmer, Saskia te Riele, Raymond Veldhuis (2000), Preferred modalities in dialogue systems, ICSLP
Frédéric Béchet, Elisabeth den Os, Lou Boves, Jürgen Sienel (2000), Introduction to the IST-HLT project speech-driven multimodal automatic directory assistance (SMADA), ICSLP
Crusoe Mao, Tony Tuo, Danjun Liu (2000), Using HPSG to represent multi-modal grammar in multi-modal dialogue, ICSLP
Kohji Dohsaka, Norihito Yasuda, Noboru Miyazaki, Mikio Nakano, Kiyoaki Aikawa (2000), An efficient dialogue control method under system's limited knowledge, ICSLP
Ying Cheng, Anurag Gupta, Raymond Lee (2000), A distributed spoken user interface based on open agent architecture (OAA), ICSLP
Stephen M. Chu, Thomas S. Huang (2000), Bimodal speech recognition using coupled hidden Markov models, ICSLP
Jiyong Ma, Wen Gao (2000), A parallel multi-stream model for sign language recognition, ICSLP
Lionel Revéret, Gérard Bailly, Pierre Badin (2000), MOTHER: a new generation of talking heads providing a flexible articulatory control for video-realistic speech animation, ICSLP
Steve Minnis, Andrew Breen (2000), Modeling visual coarticulation in synthetic talking heads using a lip motion unit inventory with concatenative synthesis, ICSLP
Hua Wu, Taiyi Huang, Bo Xu (2000), A generation system for Chinese texts, ICSLP
Stephanie Seneff, Joseph Polifroni (2000), Formal and natural language generation in the Mercury conversational system, ICSLP
Takashi Saito, Masaharu Sakamoto (2000), A method of creating a new speaker's voicefont in a text-to-speech system, ICSLP
Jun Huang, Stephen Levinson, Mark Hasegawa-Johnson (2000), Signal approximation in Hilbert space and its application on articulatory speech synthesis, ICSLP
Nobuaki Minematsu, Seiichi Nakagawa (2000), Quality improvement of PSOLA analysis-synthesis using partial zero-phase conversion, ICSLP
Hanna Lindgren, Jessica Granberg (2000), A machine learning approach to Swedish word pronunciation, ICSLP
Takahiro Ohtsuka, Hideki Kasuya (2000), An improved speech analysis-synthesis algorithm based on the autoregressive with exogenous input speech production model, ICSLP
Kuo-Hwei Yuo, Tai-Hwei Hwang, Hsiao-Chuan Wang (2000), Combination of temporal trajectory filtering and projection measure for robust speaker identification, ICSLP
Yunxin Zhao, Xiao Zhang, Xiaodong He, Laura Schopp (2000), A combined adaptive and decision tree based speech separation technique for telemedicine applications, ICSLP
Olivier Bellot, Driss Matrouf, Teva Merlin, Jean-François Bonastre (2000), Additive and convolutional noises compensation for speaker recognition, ICSLP
Frédéric Beaugendre, Tom Claes, Hugo van Hamme (2000), Dialect adaptation for Mandarin Chinese speech recognition, ICSLP
Klaus R. Scherer, Tom Johnstone, Gudrun Klasmeyer, Thomas Bänziger (2000), Can automatic speaker verification be improved by training the algorithms on emotional speech?, ICSLP
Zhong-Hua Wang, Cheng Wu, David Lubensky (2000), New distance measures for text-independent speaker identification, ICSLP
Fengguang Zhao, Prabhu Raghavan, Sunil K. Gupta, Ziyi Lu, Wentao Gu (2000), Automatic speech recognition in Mandarin for embedded platforms, ICSLP
Husheng Li, Jia Liu, Runsheng Liu (2000), Confidence measure based unsupervised speaker adaptation, ICSLP
Javier Macías-Guarasa, Javier Ferreiros, José Colás, A. Gallardo-Antolín, Juan Manuel Pardo (2000), Improved variable preselection list length estimation using NNs in a large vocabulary telephone speech recognition system, ICSLP
Ascensión Gallardo-Antolín, Javier Ferreiros, Javier Macías-Guarasa, R. de Córdoba, Juan Manuel Pardo (2000), Incorporating multiple-HMM acoustic modeling in a modular large vocabulary speech recognition system in telephone environment, ICSLP
Janne Suontausta, Juha Häkkinen (2000), Decision tree based text-to-phoneme mapping for speech recognition, ICSLP
Jeff Meunier (2000), Reduced traceback matrix storage for small footprint model alignment, ICSLP
Claudio Vair, Luciano Fissore, Pietro Laface (2000), Dynamic adaptation of vocabulary independent HMMs to an application environment, ICSLP
Roberto Gemello, Loreta Moisa, Pietro Laface (2000), Synergy of spectral and perceptual features in multi-source connectionist speech recognition, ICSLP
Ramalingam Hariharan, Olli Viikki (2000), High performance connected digit recognition through gender-dependent acoustic modelling and vocal tract length normalisation, ICSLP
Ellen Eide, Benoît Maison, D. Kanevsky, P. Olsen, S. Chen, L. Mangu, M. Gales, Miroslav Novak, Ramesh Gopinath (2000), Transcription of broadcast news with a time constraint: IBM’s 10xRT HUB4 system, ICSLP
Geoffrey Zweig, Mukund Padmanabhan (2000), Exact alpha-beta computation in logarithmic space with application to MAP word graph construction, ICSLP
Kazumasa Yamamoto, Seiichi Nakagawa (2000), Relationship among speaking style, inter-phoneme's distance and speech recognition performance, ICSLP
Ruben San-Segundo, José Colás, Javier Ferreiros, Javier Macías-Guarasa, Juan Manuel Pardo (2000), Spanish recogniser of continuously spelled names over the telephone, ICSLP
Frank Seide, Nick J.C. Wang (2000), Two-stream modeling of Mandarin tones, ICSLP
Seyyed Ali Seyyed Salehi (2000), A neural network speech recognizer based on the both acoustic steady portions and transitions, ICSLP
Marc Hofmann, Manfred Lang (2000), Belief networks for a syntactic and semantic analysis of spoken utterances for speech understanding, ICSLP
Jiping Sun, Roberto Togneri, Li Deng (2000), A robust speech understanding system using conceptual relational grammar, ICSLP
Wai Lau, Tan Lee, Yiu Wing Wong, P. C. Ching (2000), Incorporating tone information into Cantonese large-vocabulary continuous speech recognition, ICSLP
Janez Kaiser, Bogomir Horvat, Zdravko Kacic (2000), A novel loss function for the overall risk criterion based discriminative training of HMM models, ICSLP
Mirjam Sepesy Maucec, Zdravko Kacic, Bogomir Horvat (2000), Looking for topic similarities of highly inflected languages for language model adaptation, ICSLP
David Janiszek, Frédéric Béchet, Renato De Mori (2000), Integrating MAP and linear transformation for language model adaptation, ICSLP
Beng Tiong Tan, Yong Gu, Trevor Thomas (2000), Utterance verification based speech recognition system, ICSLP
Rathinavelu Chengalvarayan (2000), Use of linear extrapolation based linear predictive cepstral features (LE-LPCC) for Tamil speech recognition, ICSLP
Yoshinori Atake, Toshio Irino, Hideki Kawahara, Jinlin Lu, Satoshi Nakamura, Kiyohiro Shikano (2000), Robust fundamental frequency estimation using instantaneous frequencies of harmonic components, ICSLP
Amparo Varona, Inés Torres, Miren Karmele López de Ipiña, Luis Javier Rodriguez (2000), Integrating different acoustic and syntactic language models in a continuous speech recognition system, ICSLP
Holger Schwenk, Jean-Luc Gauvain (2000), Combining multiple speech recognizers using voting and language model information, ICSLP
Keisuke Watanabe, Yasushi Ishikawa (2000), Dialogue management based on inferred behavioral goal - improving the accuracy of understanding by dialogue context -, ICSLP
Ralf Schlüter, Frank Wessel, Hermann Ney (2000), Speech recognition using context conditional word posterior probabilities, ICSLP
Hugo Meinedo, Joao P. Neto (2000), The use of syllable segmentation information in continuous speech recognition hybrid systems applied to the Portuguese language, ICSLP
Hugo Meinedo, Joao P. Neto (2000), Combination of acoustic models in continuous speech recognition hybrid systems, ICSLP
David A. van Leeuwen, Sander J. van Wijngaarden (2000), Automatic speech recognition of non-native speakers using consonant-vowel-consonant (CVC) words, ICSLP
Gang Zhao, Hong Xu (2000), Understanding Chinese in spoken dialogue systems, ICSLP
Frédéric Berthommier, Hervé Glotin, Emmanuel Tessier (2000), A front-end using the harmonicity cue for speech enhancement in loud noise, ICSLP
Qiru Zhou, Sergey Kosenko (2000), Lucent automatic speech recognition: a speech recognition engine for internet and telephony service applications, ICSLP
Todd A. Stephenson, Hervé Bourlard, Samy Bengio, Andrew C. Morris (2000), Automatic speech recognition using dynamic Bayesian networks with both acoustic and articulatory variables, ICSLP
Subrata Das, David Lubensky (2000), Towards robust telephony speech recognition in office and automobile environments, ICSLP
Hiroaki Kojima, Kazuyo Tanaka (2000), Extracting phonological chunks based on piecewise linear segment lattices, ICSLP
Lucian Galescu, James Allen (2000), Evaluating hierarchical hybrid statistical language models, ICSLP
Jun Ogata, Yasuo Ariki (2000), An efficient lexical tree search for large vocabulary continuous speech recognition, ICSLP
Bin Jia, Xiaoyan Zhu, Yupin Luo, Dongcheng Hu (2000), Reliability evaluation of speech recognition in acoustic modeling, ICSLP
Ching X. Xu (2000), Using GMM for voiced/voiceless segmentation and tone decision in Mandarin continuous speech recognition, ICSLP
Chi H. Yim, Oscar C. Au, Wanggen Wan, Cyan L. Keung, Carrson C. Fung (2000), Auditory spectrum based features (ASBF) for robust speech recognition, ICSLP
Eric Chang, Jianlai Zhou, Shuo Di, Chao Huang, Kai-Fu Lee (2000), Large vocabulary Mandarin speech recognition with different approaches in modeling tones, ICSLP
Kallirroi Georgila, Kyriakos Sgarbas, Nikos Fanotakis, George Kokkinakis (2000), Fast very large vocabulary recognition based on compact DAWG-structured language models, ICSLP
Robert Eklund (2000), Crosslinguistic disfluency modeling: a comparative analysis of Swedish and Tok Pisin human-human ATIS dialogues, ICSLP
Shiro Terashima, Kazuya Takeda, Fumitada Itakura (2000), Vector space representation of language probabilities through SVD of n-gram matrix, ICSLP
Yoshihide Kato, Shigeki Matsubara, Katsuhiko Toyama, Yasuyoshi Inagaki (2000), Spoken language parsing based on incremental disambiguation, ICSLP
Hiroshi Shimodaira, Yutaka Kato, Toshihiko Akae, Mitsuru Nakai, Shigeki Sagayama (2000), Jacobian adaptation of HMM with initial model selection for noisy speech recognition, ICSLP
Han Shu, Chuck Wooters, Owen Kimball, Thomas Colthurst, Fred Richardson, Spyros Matsoukas, Herbert Gish (2000), The BBN Byblos 2000 conversational Mandarin LVCSR system, ICSLP
Thomas Colthurst, Owen Kimball, Fred Richardson, Han Shu, Chuck Wooters, Rukmini Iyer, Herbert Gish (2000), The 2000 BBN Byblos LVCSR system, ICSLP
Langzhou Chen, Lori Lamel, Gilles Adda, Jean-Luc Gauvain (2000), Broadcast news transcription in Mandarin, ICSLP
Yang Li, Tong Zhang, Stephen E. Levinson (2000), Word concept model: a knowledge representation for dialogue agents, ICSLP
Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura (2000), Audio-visual speech recognition using MCE-based HMMs and model-dependent stream weights, ICSLP
Hiroaki Nanjo, Akinobu Lee, Tatsuya Kawahara (2000), Automatic diagnosis of recognition errors in large vocabulary continuous speech recognition systems, ICSLP
Yuang-Chin Chiang, Zhi-Siang Yang, Ren-Yuan Lyu (2000), Taiwanese corpus collection via continuous speech recognition tool, ICSLP
Baosheng Yuan, Qingwei Zhao, Qing Guo, Xiangdong Zhang, Zhiwei Lin (2000), Optimal maximum likelihood on phonetic decision tree acoustic model for LVCSR, ICSLP
Konstantin P. Markov, Satoshi Nakamura (2000), Frame level likelihood transformations for ASR and utterance verification, ICSLP
Timothy J. Hazen, Theresa Burianek, Joseph Polifroni, Stephanie Seneff (2000), Integrating recognition confidence scoring with language understanding and dialogue modeling, ICSLP
Yibiao Yu, Heming Zhao (2000), Speech recognition based on estimation of mutual information, ICSLP
Qing Guo, Yonghong Yan, Zhiwei Lin, Baosheng Yuan, Qingwei Zhao, Jian Liu (2000), Keyword spotting in auto-attendant system, ICSLP
Weimin Ren, Chengfa Wang, Wen Gao, Jinpei Xu (2000), A new approach for modeling OOV words, ICSLP
Rachida El Méliani, Douglas O'Shaughnessy (2000), Speech recognition using error spotting, ICSLP
Chung-Ho Yang, Ming-Shiun Hsieh (2000), Robust endpoint detection for in-car speech recognition, ICSLP
Jouji Miwa, Masaru Kumagai (2000), Internet speech analysis system using e-mail and web technology, ICSLP
Marco Loog, Reinhold Haeb-Umbach (2000), Multi-class linear dimension reduction by generalized Fisher criteria, ICSLP
Wendy J. Holmes (2000), Improving the representation of time structure in front-ends for automatic speech recognition, ICSLP
Katrin Kirchhoff (2000), Speech analysis by rule extraction from trained artificial neural networks, ICSLP
Jaishree Venugopal, Stephen A. Zahorian, Montri Karnjanadecha (2000), Minimum mean square error spectral peak envelope estimation for automatic vowel classification, ICSLP
Cyan L. Keung, Oscar C. Au, Chi H. Yim, Carrson C. Fung (2000), Probabilistic compensation of unreliable feature components for robust speech recognition, ICSLP
Congxiu Wang, Qihu Li, Guoying Zhao, Li Yin, Shuai Hao, Da Meng (2000), A new tone conversion method for Mandarin by an adaptive linear prediction analysis, ICSLP
Sharon Oviatt (2000), Multimodal interface research: a science without borders, ICSLP
K. G. Munhall, C. Kroos, T. Kuratate, J. Lucero, M. Pitermann, Eric Vatikiotis-Bateson, H. Yehia (2000), Studies of audiovisual speech perception using production-based animation, ICSLP
Chalapathi Neti, Giridharan Iyengar, Gerasimos Potamianos, A. Senior, Benoît Maison (2000), Perceptual interfaces for information interaction: joint processing of audio and visual information for human-computer interaction, ICSLP
Wen Gao, Jiyong Ma, Rui Wang, Hongxun Yao (2000), Towards robust lipreading, ICSLP
Satoshi Nakamura, Hidetoshi Ito, Kiyohiro Shikano (2000), Stream weight optimization of speech and lip image sequence for audio-visual speech recognition, ICSLP
Shinji Sako, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura (2000), HMM-based text-to-audio-visual speech synthesis, ICSLP
Jill Hewitt, Andi Bateman, Andrew Lambourne, A. Ariyaeeinia, P. Sivakumaran (2000), Real-time speech-generated subtitles: problems and solutions, ICSLP
Xuedong Huang, Alex Acero, C. Chelba, Li Deng, D. Duchene, Joshua Goodman, H. Hon, D. Jacoby, L. Jiang, R. Loynd, M. Mahajan, P. Mau, S. Meredith, S. Mughal, S. Neto, Mike Plumpe, K. Wang, Y. Wang (2000), MiPad: a next generation PDA prototype, ICSLP
Fei Huang, Jie Yang, Alex Waibel (2000), Dialogue management for multimodal user registration, ICSLP
Lynne E. Bernstein (2000), Segmental optical phonetics for human and machine speech processing, ICSLP
Umavasee Thathong, Somchai Jitapunkul, Visarut Ahkuputra, Ekkarit Maneenoi, Boonchai Thampanitchawong (2000), Classification of Thai consonant naming using Thai tone, ICSLP
Qi Li, Frank K. Soong, Olivier Siohan (2000), A high-performance auditory feature for robust speech recognition, ICSLP
Kun Xia, Carol Espy-Wilson (2000), A new strategy of formant tracking based on dynamic programming, ICSLP
Xugang Lu, Gang Li, Lipo Wang (2000), Dominant subspace analysis for auditory spectrum, ICSLP
Ilyas Potamitis, Nikos Fanotakis, George Kokkinakis (2000), Spectral and cepstral projection bases constructed by independent component analysis, ICSLP
Sacha Krstulovic (2000), Relating LPC modeling to a factor-based articulatory model, ICSLP
Michael L. Shire, Barry Y. Chen (2000), On data-derived temporal processing in speech feature extraction, ICSLP
George Saon, Mukund Padmanabhan (2000), Minimum Bayes error feature selection, ICSLP
Daniel P. W. Ellis, Jeff A. Bilmes (2000), Using mutual information to design feature combinations, ICSLP
Seungjin Choi, Heonseok Hong, Hervé Glotin, Frédéric Berthommier (2000), Multichannel signal separation for cocktail party speech recognition: a dynamic recurrent network, ICSLP
V. Kamakshi Prasad, Hema A. Murthy (2000), An automatic algorithm for segmenting and labelling a connected digit sequence, ICSLP
Hui Yan, Xuegong Zhang, Yanda Li, Liqin Shen, Weibin Zhu (2000), The signal reconstruction of speech by KPCA, ICSLP
Hiroshi Saruwatari, Satoshi Kurita, Kazuya Takeda, Fumitada Itakura, Kiyohiro Shikano (2000), Blind source separation based on subband ICA and beamforming, ICSLP
Claudio Estienne, Patricia Pelle (2000), A synchrony front-end using phase-locked-loop techniques, ICSLP
Javier Hernando (2000), On the use of filter-bank energies driven from the autocorrelation sequence for noisy speech recognition, ICSLP
Rens Bod (2000), Combining semantic and syntactic structure for language modeling, ICSLP
Joshua Goodman, Jianfeng Gao (2000), Language model size reduction by pruning and clustering, ICSLP
Jun Wu, Sanjeev Khudanpur (2000), Efficient training methods for maximum entropy language modeling, ICSLP
Sabine Deligne (2000), Statistical language modeling with a class based n-multigram model, ICSLP
Koichi Tanigaki, Hirofumi Yamamoto, Yoshinori Sagisaka (2000), A hierarchical language model incorporating class-dependent word models for OOV words recognition, ICSLP
Fang Zheng, Jian Wu, Wenhu Wu (2000), Input Chinese sentences using digits, ICSLP
Matt Richardson, Jeff Bilmes, Chris Diorio (2000), Hidden-articulator Markov models: performance improvements and robustness to noise, ICSLP
Eric D. Sandness, I. Lee Hetherington (2000), Keyword-based discriminative training of acoustic models, ICSLP
Vaibhava Goel, Shankar Kumar, William Byrne (2000), Segmental minimum Bayes-risk ASR voting strategies, ICSLP
Harriet J. Nock, Steve J. Young (2000), Loosely coupled HMMs for ASR, ICSLP
Katrin Weber, Samy Bengio, Hervé Bourlard (2000), HMM2- a novel approach to HMM emission probability estimation, ICSLP
Rita Singh, Bhiksha Raj, Richard M. Stern (2000), Structured redefinition of sound units by merging and splitting for improved speech recognition, ICSLP
V. Arsigny, Gérard Chollet, Guillaume Gravier, Marc Sigelle (2000), Speech modeling with state constrained Markov fields over frequency bands, ICSLP
Weibin Zhu, Liqin Shen, Xiaochuan Miu (2000), Duration modeling for Chinese synthesis from C-ToBI labeled corpus, ICSLP
Bei Wang, Bo Zheng, Shinan Lu, Jianfen Cao, Yufang Yang (2000), The pitch movement of word stress in Chinese, ICSLP
Michiko Watanabe, Carlos Toshinori Ishi (2000), The distribution of fillers in lectures in the Japanese language, ICSLP
Huhe Harnud, Yuling Zheng, Jiayou Chen (2000), Research on stress in bisyllabic words of Mongolian, ICSLP
Kazunori Imoto, Masatake Dantsuji, Tatsuya Kawahara (2000), Modelling of the perception of English sentence stress for computer-assisted language learning, ICSLP
Jeska Buhmann, Halewijn Vereecken, Justin Fackrell, Jean-Pierre Martens, Bert Van Coile (2000), Data driven intonation modelling of 6 languages, ICSLP
Laurent Blin, Mike Edgington (2000), Prosody prediction using a tree-structure similarity metric, ICSLP
Carlos Teixeira, Horacio Franco, Elizabeth Shriberg, Kristin Precoda, Kemal Sönmez (2000), Prosodic features for automatic text-independent evaluation of degree of nativeness for language learners, ICSLP
Nobuaki Minematsu, Seiichi Nakagawa (2000), Instantaneous estimation of prosodic pronunciation habits for Japanese students to learn English pronunciation, ICSLP
Jinfu Ni, Keikichi Hirose (2000), Synthesis of fundamental frequency contours of standard Chinese sentences from tone sandhi and focus conditions, ICSLP
Yiqing Zu, Xiaoxia Chan, Aijun Li, Wu Hua, Guohua Sun (2000), Syllable duration and its functions in standard Chinese discourse, ICSLP
Bleicke Holm, Gérard Bailly (2000), Generating prosody by superposing multi-parametric overlapping contours, ICSLP
Raymond Veldhuis (2000), Consistent pitch marking, ICSLP
Sun-Ah Jun, Sook-Hyang Lee, Keeho Kim, Yong-Ju Lee (2000), Labeler agreement in transcribing Korean intonation with K-ToBI, ICSLP
Yukiyoshi Hirose, Kazuhiko Ozeki, Kazuyuki Takagi (2000), Effectiveness of prosodic features in syntactic analysis of read Japanese sentences, ICSLP
Mieko Banno (2000), A study of F0 declination in Japanese: towards a discourse model of prosodic structure, ICSLP
Atsuhiro Sakurai, Nobuaki Minematsu, Keikichi Hirose (2000), Data-driven intonation modeling using a neural network and a command response model, ICSLP
Caglayan Erdem, Martin Holzapfel, Rüdiger Hoffmann (2000), Natural F0 contours with a new neural-network-hybrid approach, ICSLP
Justin Fackrell, Halewijn Vereecken, Jeska Buhmann, Jean-Pierre Martens, Bert Van Coile (2000), Prosodic variation with text type, ICSLP
Ann K. Syrdal, Julia McGory (2000), Inter-transcriber reliability of ToBI prosodic labeling, ICSLP
Greg P. Kochanski, Chilin Shih (2000), Stem-ML: language-independent prosody description, ICSLP
Minghui Dong, Kim Teng Lua (2000), Using prosody database in Chinese speech synthesis, ICSLP
Donna Erickson, Kikuo Maekawa, Michiko Hashi, Jianwu Dang (2000), Some articulatory and acoustic changes associated with emphasis in spoken English, ICSLP
Esther Janse, Anke Sennema, Anneke Slis (2000), Fast speech timing in Dutch: durational correlates of lexical stress and pitch accent, ICSLP
Makoto Hiroshige, Kantaro Suzuki, Kenji Araki, Koji Tochinai (2000), On perception of word-based local speech rate in Japanese without focusing attention, ICSLP
Atsuhiro Sakurai, Koji Iwano, Keikichi Hirose (2000), Modeling and generation of accentual phrase F0 contours based on discrete HMMs synchronized at mora-unit transitions, ICSLP
Philippa H. Louw, Justus C. Roux, Elizabeth C. Botha (2000), Synthesizing prosody for commands in a Xhosa TTS system, ICSLP
Costas Christogiannis, Yiannis Stavroulas, Yiannis Vamvakoulas, Theodora Varvarigou, Agatha Zappa, Chilin Shih, Amalia Arvaniti (2000), Design and implementation of a Greek text-to-speech system based on concatenative synthesis, ICSLP
Lauren Baptist, Stephanie Seneff (2000), GENESIS-II: a versatile system for language generation in conversational system applications, ICSLP
Eun-Kyoung Kim, Yung-Hwan Oh (2000), New analysis method for harmonic plus noise model based on time-domain periodicity score, ICSLP
Tomoki Toda, Jinlin Lu, Hiroshi Saruwatari, Kiyohiro Shikano (2000), STRAIGHT-based voice conversion algorithm based on Gaussian mixture model, ICSLP
Marion Libossek, Florian Schiel (2000), Syllable-based text-to-phoneme conversion for German, ICSLP
Horst-Udo Hain (2000), A hybrid approach for grapheme-to-phoneme conversion based on a combination of partial string matching and a neural network, ICSLP
Hans G. Tillmann, Hartmut R. Pfitzinger (2000), Parametric high definition (PHD) speech synthesis-by-analysis: the development of a fundamentally new system creating connected speech by modifying lexically-represented language units, ICSLP
Chul H. Kwon, Minkyu Lee, Joseph P. Olive (2000), A new synthesis algorithm using phase information for TTS systems, ICSLP
Johan Wouters, Michael W. Macon (2000), Unit fusion for concatenative speech synthesis, ICSLP
Kevin A. Lenzo, Alan W. Black (2000), Diphone collection and synthesis, ICSLP
Thomas Portele (2000), Natural language generation for spoken dialogue, ICSLP
Alistair Conkie, Mark C. Beutnagel, Ann K. Syrdal, Philip E. Brown (2000), Preselection of candidate units in a unit selection-based text-to-speech synthesis system, ICSLP
Kare Jean Jensen, Søren Riis (2000), Self-organizing letter code-book for text-to-phoneme neural network model, ICSLP
Jon R. W. Yi, James R. Glass, I. Lee Hetherington (2000), A flexible, scalable finite-state transducer architecture for corpus-based concatenative speech synthesis, ICSLP
Changfu Wang, Hiroya Fujisaki, Ryou Tomana, Sumio Ohno (2000), Analysis of fundamental frequency contours of standard Chinese in terms of the command-response model and its application to synthesis by rule of intonation, ICSLP
Toshio Hirai, Seiichi Tenpaku, Kiyohiro Shikano (2000), Manipulating speech pitch periods according to optimal insertion/deletion position in residual signal for intonation control in speech synthesis, ICSLP
Pradit Mittrapiyanuruk, Chatchawarn Hansakunbuntheung, Virongrong Tesprasit, Virach Sornlertlamvanich (2000), Improving naturalness of Thai text-to-speech synthesis by prosodic rule, ICSLP
Dawei Xu, Hiroki Mori, Hideki Kasuya (2000), Word-level F0 range in Mandarin Chinese and its application to inserting words into a sentence, ICSLP
Mitsuaki Isogai, Kimihito Tanaka, Satoshi Takano, Hideyuki Mizuno, Masanobu Abe, Sin’ya Nakajima (2000), A new Japanese TTS system based on speech-prosody database and speech modification, ICSLP
Ruben San-Segundo, Juan Manuel Montero, Ricardo de Córdoba, Juana Gutiérrez-Arriola (2000), Stress assignment in Spanish proper names, ICSLP
Zhengyu Niu, Peiqi Chai (2000), Segmentation of prosodic phrases for improving the naturalness of synthesized Mandarin Chinese speech, ICSLP
Xiaohu Liu, Douglas O'Shaughnessy (2000), Practical language modeling: an interpolating method, ICSLP
Gongjun Li, Na Dong, Toshiro Ishikawa (2000), Combination of different n-grams based on their different assumptions, ICSLP
Nobuo Kawaguchi, Shigeki Matsubara, Hiroyuki Iwa, Shoji Kajita, Kazuya Takeda, Fumitada Itakura, Yasuyoshi Inagaki (2000), Construction of speech corpus in moving car environment, ICSLP
Yue-Shi Lee, Hsin-Hsi Chen (2000), Parsing spoken dialogues, ICSLP
Børge Lindberg, Finn Tore Johansen, Narada Warakagoda, Gunnar Lehtinen, Zdravko Kacic, Andrej Zgank, Kjell Elenius, Giampiero Salvi (2000), A noise robust multilingual reference recogniser based on SPEECHDAT(II), ICSLP
Muhua Lv, Lianhong Cai (2000), The design and application of a speech database for Chinese TTS system, ICSLP
Rathinavelu Chengalvarayan (2000), Use of multiple classifiers for speech recognition in wireless CDMA network environments, ICSLP
Alexander Franz, Keiko Horiguchi, Lei Duan (2000), An imperative programming language for spoken language translation, ICSLP
Yumi Wakita, Kenji Matsui, Yoshinori Sagisaka (2000), Fine keyword clustering using a thesaurus and example sentences for speech translation, ICSLP
JunLan Feng, XianFang Wang, LiMin Du (2000), Data collection and processing in a Chinese spontaneous speech corpus IIS_CSS, ICSLP
Yasuyuki Aizawa, Shigeki Matsubara, Nobuo Kawaguchi, Katsuhiko Toyama, Yasuyoshi Inagaki (2000), Spoken language corpus for machine interpretation research, ICSLP
Jan van Santen, Michael Macon, Andrew Cronk, John-Paul Hosom, Alexander Kain, Vincent Pagel, Johan Wouters (2000), When will synthetic speech sound human: role of rules and data, ICSLP
Ann K. Syrdal, Colin W. Wightman, Alistair Conkie, Yannis Stylianou, Mark Beutnagel, Juergen Schroeter, Volker Strom, Ki-Seung Lee, Matthew J. Makashay (2000), Corpus-based techniques in the AT&T nextgen synthesis system, ICSLP
Nick Campbell (2000), Limitations to concatenative speech synthesis, ICSLP
Hisashi Kawai, Seiichi Yamamoto, Norio Higuchi, Tohru Shimizu (2000), A design method of speech corpus for text-to-speech synthesis taking account of prosody, ICSLP
Richard Sproat (2000), Corpus-based methods and hand-built methods, ICSLP
Michael A. Picheny (2000), Heredity and environment in speech recognition: the role of a priori information vs. data, ICSLP
Haruo Kubozono (2000), A constraint-based analysis of compound accent in Japanese, ICSLP
Naoto Iwahashi (2000), Language acquisition through a human-robot interface, ICSLP
Yoshinori Sagisaka, Hirofumi Yamamoto, Minoru Tsuzaki, Hiroaki Kato (2000), Rules, but what for? - rule description as efficient and robust abstraction of corpora and optimal fitting to applications -, ICSLP
Veronika Makarova (2000), Cross-linguistic aspects of intonation perception, ICSLP
Haruo Kubozono, Shosuke Haraguchi (2000), Visual information and the perception of prosody, ICSLP
Masato Akagi, Hironori Kitakaze (2000), Perception of synthesized singing voices with fine fluctuations in their fundamental frequency contours, ICSLP
Kalle J. Palomäki, Paavo Alku, Ville Mäkinen, Patrick May, Hannu Tiitinen (2000), Neuromagnetic study on localization of speech sounds, ICSLP
Yukiyoshi Hirose, Kazuhiko Kakehi (2000), Perception of identical vowel sequences in Japanese conversational speech, ICSLP
Santiago Fernández, Sergio Feijóo (2000), Acoustic cues to perception of vowel quality, ICSLP
Esther Klabbers, Raymond Veldhuis, Kim Koppen (2000), A solution to the reduction of concatenation artefacts in speech synthesis, ICSLP
Jhing-Fa Wang, Hsien-Chang Wang, Kin-Nan Lee, Chieh-Yi Huang (2000), Domain-unconstrained language understanding based on CKIP-auto tag, how-net, and ART, ICSLP
Chris Powell, Mary Zajicek, David Duce (2000), The generation of representations of word meanings from dictionaries, ICSLP
Po Chui Luk, Helen Meng, Filung Wang (2000), Grammar partitioning and parser composition for natural language understanding, ICSLP
Jennifer Lai, Omer Tsimhoni, Paul Green (2000), Comprehension of synthesized speech while driving and in the lab, ICSLP
Michael D. Tyler, Denis K. Burnham (2000), Orthographic influences on initial phoneme addition and deletion tasks: the effect of lexical status, ICSLP
Parham Zolfaghari, Yoshinori Atake, Kiyohiro Shikano, Hideki Kawahara (2000), Investigation of analysis and synthesis parameters of STRAIGHT by subjective evaluation, ICSLP
Andrew N. Pargellis, Alexandros Potamianos (2000), Cross-domain classification using generalized domain acts, ICSLP
Ganesh N. Ramaswamy, Jan Kleindienst (2000), Hierarchical feature-based translation for scalable natural language understanding, ICSLP
Alexandros Potamianos, Hong-Kwang J. Kuo (2000), Statistical recursive finite state machine parsing for speech understanding, ICSLP
Chaojun Liu, Yonghong Yan (2000), Speaker change detection using minimum message length criterion, ICSLP
Sadaoki Furui, Kikuo Maekawa, Hitoshi Isahara, Takahiro Shinozaki, Takashi Ohdaira (2000), Toward the realization of spontaneous speech recognition - introduction of a Japanese priority program and preliminary results -, ICSLP
Toshiyuki Takezawa, Fumiaki Sugaya, Masaki Naito, Seiichi Yamamoto (2000), A comparative study on acoustic and linguistic characteristics using speech from human-to-human and human-to-machine conversations, ICSLP
Néstor Becerra Yoma (2000), Speaker dependent temporal constraints combined with speaker independent HMM for speech recognition in noise, ICSLP
Yoshihiro Ito, Hiroshi Matsumoto, Kazumasa Yamamoto (2000), Forward masking on a generalized logarithmic scale for robust speech recognition, ICSLP
Heidi Christensen, Børge Lindberg, Ove Andersen (2000), Noise robustness of heterogeneous features employing minimum classification error feature space transformations, ICSLP
Michael L. Seltzer, Bhiksha Raj, Richard M. Stern (2000), Classifier-based mask estimation for missing feature methods of robust speech recognition, ICSLP
Kris Hermus, Werner Verhelst, Patrick Wambacq (2000), Optimized subspace weighting for robust speech recognition in additive noise environments, ICSLP
Ji Ming, Peter Jancovic, Philip Hanna, Darryl Stewart, F. Jack Smith (2000), Robust feature selection using probabilistic union models, ICSLP
Ramalingam Hariharan, Imre Kiss, Olli Viikki, Jilei Tian (2000), Multi-resolution front-end for noise robust speech recognition, ICSLP
Douglas O'Shaughnessy, Marcel Gabrea (2000), Recognition of digit strings in noisy speech with limited resources, ICSLP
Keiichi Tajima, Donna Erickson, Kyoko Nagao (2000), Factors affecting native Japanese speakers' production of intrusive (epenthetic) vowels in English words, ICSLP
Imed Zitouni, Kamel Smaïli, Jean-Paul Haton (2000), Beyond the conventional statistical language models: the variable-length sequences approach, ICSLP
Yasushi Tsubota, Masatake Dantsuji, Tatsuya Kawahara (2000), Computer-assisted English vowel learning system for Japanese speakers using cross language formant structures, ICSLP
Trym Holter, Erik Harborg, Magne Hallstein Johnsen, Torbjörn Svendsen (2000), ASR-based subtitling of live TV-programs for the hearing impaired, ICSLP
Chung-Hsien Wu, Yu-Hsien Chiu, Chi-Shiang Guo (2000), Natural language processing for Taiwanese sign language to speech conversion, ICSLP
Jouji Miwa, Hiroshi Sasaki, Kazunori Tanno (2000), Japanese spoken language learning system using Java information technology, ICSLP
Helmer Strik, Catia Cucchiarini, Diana Binnenpoorte (2000), L2 pronunciation quality in read and spontaneous speech, ICSLP
Tomoko Kitamura, Keisuke Kinoshita, Takayuki Arai, Akiko Kusumoto, Yuji Murahara (2000), Designing modulation filters for improving speech intelligibility in reverberant environments, ICSLP
Lei Zhang, Jiqing Han, Chengguo Lv, Chengfa Wang (2000), An environment model-based robust speech recognition, ICSLP
Jaco Vermaak, Christophe Andrieu, Arnaud Doucet (2000), Particle filtering for non-stationary speech modelling and enhancement, ICSLP
Martin Graciarena (2000), Maximum likelihood noise HMM estimation in model-based robust speech recognition, ICSLP
Qingsheng Zeng, Douglas O'Shaughnessy (2000), Microphone array within a handset or face mask for speech enhancement, ICSLP
Chengfa Wang, Qiusheng Wang (2000), Embedding visually recognizable watermarks into digital audio signals, ICSLP
Mamoru Iwaki (2000), Auditory perception of amplitude modulated sinusoid using a pure tone and band-limited noises as modulation signals, ICSLP
Masoud Geravanchizadeh (2000), Spectral voice conversion based on unsupervised clustering of acoustic space, ICSLP
Hartmut R. Pfitzinger (2000), Removing hum from spoken language resources, ICSLP
Ingunn Amdal, Filipp Korkmazskiy, Arun C. Surendran (2000), Joint pronunciation modelling of non-native speakers using data-driven methods, ICSLP
Linda Bell, Robert Eklund, Joakim Gustafson (2000), A comparison of disfluency distribution in a unimodal and a multimodal speech interface, ICSLP
Yi Liu, Pascale Fung (2000), Modelling pronunciation variations in spontaneous Mandarin speech, ICSLP
Tadashi Suzuki, Jun Ishii, Kunio Nakajima (2000), A method of generating English pronunciation dictionary for Japanese English recognition systems, ICSLP
Hélène Bonneau-Maynard, L. Devillers (2000), A framework for evaluating contextual understanding, ICSLP
Yonggang Deng, Taiyi Huang, Bo Xu (2000), Towards high performance continuous Mandarin digit string recognition, ICSLP
Matthew Aylett (2000), Stochastic suprasegmentals: relationships between redundancy, prosodic structure and care of articulation in spontaneous speech, ICSLP
Masaharu Sakamoto, Takashi Saitoh (2000), An automatic pitch-marking method using wavelet transform, ICSLP
Keiichi Takamaru, Makoto Hiroshige, Kenji Araki, Koji Tochinai (2000), A proposal of a model to extract Japanese voluntary speech rate control, ICSLP
Veronika Makarova (2000), Acoustic characteristics of surprise in Russian questions, ICSLP
Yonggang Deng, Yang Cao, Bo Xu (2000), Neural network based integration of multiple confidence measures for OOV detection, ICSLP
Yi Xu, Xuejing Sun (2000), How fast can we really change pitch? Maximum speed of pitch change revisited, ICSLP
Esther Klabbers, Jan van Santen (2000), Predicting segmental durations for Dutch using the sums-of-products approach, ICSLP
Yang Cao, Taiyi Huang, Bo Xu, Chengrong Li (2000), A stochastic polynomial tone model for continuous Mandarin speech, ICSLP
Marcel Gabrea, Douglas O'Shaughnessy (2000), Detection of filled pauses in spontaneous conversational speech, ICSLP
Bertil Lyberg, Sonia Sangarig (2000), Some observations on different strategies for the timing of fundamental frequency events, ICSLP
Zhiyong Wu, Lianhong Cai, Tongchun Zhou (2000), Research on dynamic characters of Chinese pitch contours, ICSLP
Bing Zhao, Bo Xu (2000), Incorporating HMM-state sequence confusion for rapid MLLR adaptation to new speakers, ICSLP
Zhipeng Zhang, Sadaoki Furui (2000), An online incremental speaker adaptation method using speaker-clustered initial models, ICSLP
Guoqiang Li, Limin Du, Ziqiang Hou (2000), Prior parameter transformation for unsupervised speaker adaptation, ICSLP
Ruhi Sarikaya, John H. L. Hansen (2000), Improved Jacobian adaptation for fast acoustic model adaptation in noisy speech recognition, ICSLP
Keiko Fujita, Yoshio Ono, Yoshihisa Nakatoh (2000), A study of vocal tract length normalization with generation-dependent acoustic models, ICSLP
Shaojun Wang, Yunxin Zhao (2000), Optimal on-line Bayesian model selection for speaker adaptation, ICSLP
Bowen Zhou, John H. L. Hansen (2000), Unsupervised audio stream segmentation and clustering via the Bayesian information criterion, ICSLP
Satoru Tsuge, Toshiaki Fukada, Kenji Kita (2000), Frame-period adaptation for speaking rate robust speech recognition, ICSLP
C. Nieuwoudt, Elizabeth C. Botha (2000), Cross-language use of acoustic information for automatic speech recognition, ICSLP
Shoei Sato, Toru Imai, Hideki Tanaka, Akio Ando (2000), Selective training of HMMs by using two-stage clustering, ICSLP
Angel de la Torre, Dominique Fohr, Jean-Paul Haton (2000), Compensation of noise effects for robust speech recognition in car environments, ICSLP
Dong Kook Kim, Nam Soo Kim (2000), Bayesian speaker adaptation based on probabilistic principal component analysis, ICSLP
Wai Kat Liu, Pascale Fung (2000), MLLR-based accent model adaptation without accented data, ICSLP
Kuan-Ting Chen, Wen-Wei Liau, Hsin-Min Wang, Lin-Shan Lee (2000), Fast speaker adaptation using eigenspace-based maximum likelihood linear regression, ICSLP
Gerasimos Potamianos, Chalapathy Neti (2000), Stream confidence estimation for audio-visual speech recognition, ICSLP
Masahiko Komatsu, Won Tokuma, Shinichi Tokuma, Takayuki Arai (2000), The effect of reduced spectral information on Japanese consonant perception: comparison between L1 and L2 listeners, ICSLP
Valter Ciocca, Rani Aisha, Alex Francis, Lena Wong (2000), Can Cantonese children with cochlear implants perceive lexical tones?, ICSLP
Michael C. W. Yip (2000), Recognition of spoken words in the continuous speech: effects of transitional probability, ICSLP
Ariel Salomon, Carol Espy-Wilson (2000), Detection of speech landmarks using temporal cues, ICSLP
Takashi Otake, Anne Cutler (2000), A set of Japanese word cohorts rated for relative familiarity, ICSLP
Kimiko Yamakawa, Hiromitsu Miyazono, Ryoji Baba (2000), The phonetic value of the devocalized vowel in Japanese - in case of velar plosive, ICSLP
James M. McQueen, Anne Cutler, Dennis Norris (2000), Positive and negative influences of the lexicon on phonemic decision-making, ICSLP
Andrea Weber (2000), Phonotactic and acoustic cues for word segmentation in English, ICSLP
Esther Janse (2000), Intelligibility of time-compressed speech: three ways of time-compression, ICSLP
Hartmut Traunmüller (2000), Evidence for demodulation in speech perception, ICSLP
Jean-Luc Gauvain, Lori Lamel (2000), Fast decoding for indexation of broadcast data, ICSLP
Sheng Gao, Bo Xu, Hong Zhang, Bing Zhao, Chengrong Li, Taiyi Huang (2000), Update progress of Sinohear: advanced Mandarin LVCSR system at NLPR, ICSLP
Xavier L. Aubert, Reinhard Blasig (2000), Combined acoustic and linguistic look-ahead for one-pass time-synchronous decoding, ICSLP
Li Deng, Alex Acero, Mike Plumpe, Xuedong Huang (2000), Large-vocabulary speech recognition under adverse acoustic environments, ICSLP
Volker Fischer, S. J. Kunzmann (2000), Acoustic language model classes for a large vocabulary continuous speech recognizer, ICSLP
Franz Kummert, Gernot A. Fink, Gerhard Sagerer (2000), A hybrid speech recognizer combining HMMs and polynomial classification, ICSLP
Chao Huang, Eric Chang, Jianlai Zhou, Kai-Fu Lee (2000), Accent modeling based on pronunciation dictionary adaptation for large vocabulary Mandarin speech recognition, ICSLP
Jinzhong Zhang, Yingmin He, Renshu Yu (2000), A mixed and code excitation LPC vocoder at 1.76 kb/s, ICSLP
Minoru Kohata, Ikuya Mitsuya, Motoyuki Suzuki, Shozo Makino (2000), Efficient segment quantization of LSP parameters for very low bit speech coding, ICSLP
Carlos M. Ribeiro, Isabel M. Trancoso, Diamantino A. Caseiro (2000), Phonetic vocoder assessment, ICSLP
Hongtao Hu, Limin Du (2000), A new low bit rate speech coder based on intraframe waveform interpolation, ICSLP
Rathinavelu Chengalvarayan, David L. Thomson (2000), Discriminatively derived HMM-based announcement modeling approach for noise control avoiding the problem of false alarms, ICSLP
Juan M. Huerta, Richard M. Stern (2000), Instantaneous-distortion based weighted acoustic modeling for robust recognition of coded speech, ICSLP
Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma (2000), Adapting phonetic decision trees between languages for continuous speech recognition, ICSLP
Stephen Cox (2000), Speaker normalization in the MFCC domain, ICSLP
Reinhold Haeb-Umbach (2000), Data-driven phonetic regression class tree estimation for MLLR adaptation, ICSLP
Mohamed Afify, Olivier Siohan (2000), Constrained maximum likelihood linear regression for speaker adaptation, ICSLP
Woo-Yong Choi, Hyung Soon Kim (2000), Predictive speaker adaptation based on least squares method, ICSLP
Alex Acero, Li Deng, Trausti Kristjansson, Jerry Zhang (2000), HMM adaptation using vector Taylor series for noisy speech recognition, ICSLP
Dimitra Vergyri, Stavros Tsakalidis, William Byrne (2000), Minimum risk acoustic clustering for multilingual acoustic model combination, ICSLP
Sharon Oviatt (2000), Talking to thimble jellies: children's conversational speech with animated characters, ICSLP
Robert Rodman, David McAllister, Donald Bitzer, D. Chappell (2000), A high-resolution glottal pulse tracker, ICSLP
Paavo Alku, Jan G. Svec, Erkki Vilkman, Frantisek Sram (2000), Analysis of voice production in breathy, normal and pressed phonation by comparing inverse filtering and videokymography, ICSLP
Takayuki Ito, Hiroaki Gomi, Masaaki Honda (2000), Model of the mechanical linkage of the upper lip-jaw for the articulatory coordination, ICSLP
Masafumi Matsumura, Takuya Niikawa, Taku Torii, Hitoshi Yamasaki, Hisanaga Hara, Takashi Tachimura, Takeshi Wada (2000), Measurement of palatolingual contact pressure and tongue force using a force-sensor-mounted palatal plate, ICSLP
Olov Engwall (2000), A 3D tongue model based on MRI data, ICSLP
Jae-Hyun Bae, Heo-Jin Byeon, Yung-Hwan Oh (2000), Speech quality improvement in TTS system using ABS/OLA sinusoidal model, ICSLP
Marielle Bruyninckx, Bernard Harmegnies (2000), A study of palatal segments' production by Danish speakers, ICSLP
Bhuvana Ramabhadran, Yuqing Gao, Michael Picheny (2000), Dynamic selection of feature spaces for robust speech recognition, ICSLP
Santiago Fernández, Sergio Feijóo (2000), A probabilistic model of integration of acoustic cues in FV syllables, ICSLP
Jeff A. Bilmes, Katrin Kirchhoff (2000), Directed graphical models of classifier combination: application to phone recognition, ICSLP
E. E. Jan, Jaime Botella Ordinas, George Saon, Salim Roukos (2000), Real-time multilingual HMM training robust to channel variations, ICSLP
Sander J. van Wijngaarden, Herman J. M. Steeneken (2000), The intelligibility of German and English speech to Dutch listeners, ICSLP
Bin Zhen, Xihong Wu, Zhimin Liu, Huisheng Chi (2000), On the use of bandpass liftering in speaker recognition, ICSLP
René Carré, Liliane Sprenger-Charolles, Souhila Messaoud-Galusi, Willy Serniclaes (2000), On auditory-phonetic short-term transformation, ICSLP
James J. Hant, Abeer Alwan (2000), Predicting the perceptual confusion of synthetic plosive consonants in noise, ICSLP
Martha Larson, Daniel Willett, Joachim Köhler, Gerhard Rigoll (2000), Compound splitting and lexical unit recombination for improved performance of a speech recognition system for German parliamentary speeches, ICSLP
Martine van Zundert, Jacques Terken (2000), Learning and transfer of learning for synthetic speech, ICSLP
Yang Zhang, Patricia K. Kuhl, Toshiaki Imada, Paul Iverson, John Pruitt, Makoto Kotani, Erica Stevens (2000), Neural plasticity revealed in perceptual training of a Japanese adult listener to learn American /l-r/ contrast: a whole-head magnetoencephalography study, ICSLP
Akiyo Joto (2000), The effect of consonantal context and acoustic characteristics on the discrimination between the English vowel /i/ and /e/ by Japanese learners, ICSLP
Li Zhao, Wei Lu, Ye Jiang, Zhenyang Wu (2000), A study on emotional feature recognition in speech, ICSLP
Juan I. Godino-Llorente, Santiago Aguilera-Navarro, Pedro Gómez-Vilda (2000), LPC, LPCC and MFCC parameterisation applied to the detection of voice impairments, ICSLP
Benjamin K. T'sou, Tom B. Y. Lai (2000), A complementary approach to computer-aided transcription: synergy of statistical-based and knowledge discovery paradigms, ICSLP
Marie-José Caraty, Claude Montacié (2000), Teraspeech2000: a 10,000 speakers database, ICSLP
Laila Dybkjær, Niels Ole Bernsen (2000), The MATE workbench - a tool in support of spoken dialogue annotation and information extraction, ICSLP
Armelle Brun, David Langlois, Kamel Smaili, Jean-Paul Haton (2000), Discarding impossible events from statistical language models, ICSLP
Yves Lepage, Nicolas Auclerc, Satoshi Shirai (2000), A tool to build a treebank for conversational Chinese, ICSLP
Roland Auckenthaler, Michael Carey, John Mason (2000), Parameter reduction in a text-independent speaker verification system, ICSLP
Yong Gu, Trevor Thomas (2000), Advances on HMM-based text-dependent speaker verification, ICSLP
Robert Stapert, John S. Mason, Roland Auckenthaler (2000), Optimisation of GMM in speaker recognition, ICSLP
Ran D. Zilca, Yuval Bistritz (2000), Distance-based Gaussian mixture model for speaker recognition over the telephone, ICSLP
Jun-Hui Liu, Ke Chen (2000), Pruning abnormal data for better making a decision in speaker verification, ICSLP
Louis ten Bosch (2000), ASR, dialects, and acoustic/phonological distances, ICSLP
Masafumi Nishida, Yasuo Ariki (2000), Speaker verification by integrating dynamic and static features using subspace method, ICSLP
Se-Hyun Kim, Gil-Jin Jang, Yung-Hwan Oh (2000), Improvement of speaker recognition system by individual information weighting, ICSLP
Néstor Becerra Yoma, Tarciano Facco Pegoraro (2000), Speaker verification in noise using temporal constraints, ICSLP
Bogdan Sabac, Inge Gavat, Zica Valsan (2000), Speaker identification using discriminative features selection, ICSLP
Ivan Magrin-Chagnolleau, Guillaume Gravier, Mouhamadou Seck, Olivier Boeffard, R. Blouet, Frédéric Bimbot (2000), A further investigation on speech features for speaker characterization, ICSLP
Jyotsana Balleda, Hema A Murthy, T. Nagarajan (2000), Language identification from short segments of speech, ICSLP
Susanne Kronenberg, Franz Kummert (2000), Generation of utterances based on visual context information, ICSLP
Mazin Rahim, Roberto Pieraccini, Wieland Eckert, Esther Levin, Giuseppe Di Fabbrizio, Giuseppe Riccardi, Candy Kamm, Shrikanth Narayanan (2000), A spoken dialogue system for conference/workshop services, ICSLP
Gavin Churcher, Peter Wyard (2000), Developing robust, user-centred multimodal spoken language systems: the MUeSLI project, ICSLP
Magne H. Johnsen, Torbjørn Svendsen, Tore Amble, Trym Holter, Erik Harborg (2000), TABOR - a Norwegian spoken dialogue system for bus travel information, ICSLP
Yinfei Huang, Fang Zheng, Mingxing Xu, Pengju Yan, Wenhu Wu (2000), Language understanding component for Chinese dialogue system, ICSLP
Kazumi Aoyama, Izumi Hirano, Hideaki Kikuchi, Katsuhiko Shirai (2000), Designing a domain independent platform of spoken dialogue system, ICSLP
Qiru Zhou, Antoine Saad, Sherif Abdou (2000), An enhanced BLSTIP dialogue research platform, ICSLP
Weidong Qu, Katsuhiko Shirai (2000), Using machine learning method and subword unit representations for spoken document categorization, ICSLP
Litza Stark, Steve Whittaker, Julia Hirschberg (2000), ASR satisficing: the effects of ASR accuracy on speech retrieval, ICSLP
Hiromitsu Nishizaki, Seiichi Nakagawa (2000), A system for retrieving broadcast news speech documents using voice input keywords and similarity between words, ICSLP
Yu-Sheng Lai, Kuen-Lin Lee, Chung-Hsien Wu (2000), Intention extraction and semantic matching for internet FAQ retrieval using spoken language query, ICSLP
Robert J. van Vark, Jelle K. de Haan, Leon J. M. Rothkrantz (2000), A domain-independent model to improve spelling in a web environment, ICSLP
Seiichi Takao, Jun Ogata, Yasuo Ariki (2000), Expanded vector space model based on word space in cross media retrieval of news speech data, ICSLP
John H. L. Hansen, Bowen Zhou, Murat Akbacak, Ruhi Sarikaya, Bryan Pellom (2000), Audio stream phrase recognition for a national gallery of the spoken word: "one small step", ICSLP
Hideharu Nakajima, Yoshinori Sagisaka, Hirofumi Yamamoto (2000), Pronunciation variants description using recognition error modeling with phonetic derivation hypotheses, ICSLP
Wataru Tsukahara, Nigel Ward (2000), Evaluating responsiveness in spoken dialog systems, ICSLP
Nobuhiko Kitawaki, Futoshi Asano, Takeshi Yamada (2000), Characteristics of spoken language required for objective quality evaluation of echo cancellers, ICSLP
Fumiaki Sugaya, Toshiyuki Takezawa, Akio Yokoo, Yoshinori Sagisaka, Seiichi Yamamoto (2000), Evaluation of the ATR-matrix speech translation system with a pair comparison method between the system and humans, ICSLP
Ichiro Maruyama, Yoshiharu Abe, Terumasa Ehara, Katsuhiko Shirai (2000), An automatic timing detection method for superimposing closed captions of TV programs, ICSLP
Marcel Ogner, Zdravko Kacic (2000), Normalized time-frequency speech representation in articulation training systems, ICSLP
Shinichi Torihara, Katashi Nagao (2000), Semantic transcoding: making the handicapped and the aged free from their barriers in obtaining information on the web, ICSLP
Rathinavelu Chengalvarayan (2000), The use of nonlinear energy transformation for Tamil connected-digit speech recognition, ICSLP
Aimin Chen, Saeed Vaseghi (2000), State based sub-band Wiener filters for speech enhancement in car environments, ICSLP
Kris Hermus, Werner Verhelst, Patrick Wambacq, Philippe Lemmerling (2000), Total least squares based subband modelling for scalable speech representations with damped sinusoids, ICSLP
Joon-Hyuk Chang, Nam Soo Kim (2000), Speech enhancement: new approaches to soft decision, ICSLP
James Glass, Joseph Polifroni, Stephanie Seneff, Victor Zue (2000), Data collection and performance evaluation of spoken dialogue systems: the MIT experience, ICSLP
Lori Lamel, Sophie Rosset, Jean-Luc Gauvain (2000), Considerations in the design and evaluation of spoken language dialog systems, ICSLP
Martin Heckmann, Frédéric Berthommier, Christophe Savario, Kristian Kroschel (2000), Labeling audio-visual speech corpora and training an ANN/HMM audio-visual speech recognition system, ICSLP
Aijun Li, Maocan Lin, XiaoXia Chen, Yiqing Zu, Guohua Sun, Wu Hua, Zhigang Yin, Jingzhu Yan (2000), Speech corpus of Chinese discourse and the phonetic research, ICSLP
Jonathan G. Fiscus, George R. Doddington (2000), Results of the 1999 topic detection and tracking evaluation in Mandarin and English, ICSLP
Satoshi Nakamura, Keiko Watanuki, Toshiyuki Takezawa, Satoru Hayamizu (2000), Multimodal corpora for human-machine interaction research, ICSLP
David Pearce, Hans-Günter Hirsch (2000), The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, ICSLP
Hans-Günther Tillmann, Florian Schiel, Christoph Draxler, Phil Hoole (2000), The Bavarian archive for speech signals - serving the speech community, ICSLP
J. Bruce Millar (2000), The development of spoken language resources in oceania, ICSLP
Frank K. Soong, Eric A. Woudenberg (2000), Hands-free human-machine dialogue - corpora, technology and evaluation, ICSLP
Giuseppe Riccardi (2000), On-line learning of acoustic and lexical units for domain-independent ASR, ICSLP
Tomoyosi Akiba, Katsunobu Itou (2000), Semi-automatic language model acquisition without large corpora, ICSLP
Dijana Petrovska-Delacrétaz, Allen L. Gorin, Jerry H. Wright, Giuseppe Riccardi (2000), Detecting acoustic morphemes in lattices for spoken language understanding, ICSLP
Mitsunori Mizumachi, Masato Akagi, Satoshi Nakamura (2000), Design of robust subtractive beamformer for noisy speech recognition, ICSLP
Hamid Sheikhzadeh, Rassoul Amirfattahi (2000), Objective long-term assessment of speech quality changes in pre-lingual cochlear implant children, ICSLP
Elmar Nöth, Heinrich Niemann, Tino Haderlein, M. Decher, Uwe Eysholdt, F. Rosanowski, T. Wittenberg (2000), Automatic stuttering recognition using hidden Markov models, ICSLP
Deb Roy (2000), Grounded speech communication, ICSLP
Sun-Ah Jun, Mira Oh (2000), Acquisition of second language intonation, ICSLP
Man-hung Siu, Ka-Ming Wong, Man-Yan Ching, Mei-Sum Lau (2000), Computer-aided Mandarin pronunciation learning system, ICSLP
Michael McTear, Norma Conn, Nicola Phillips (2000), Speech recognition software: a tool for people with dyslexia, ICSLP
H. Timothy Bunnell, Debra M. Yarrington, James B. Polikoff (2000), STAR: articulation training for young children, ICSLP
Takayoshi Nakai, Keizo Ishida, Hisayoshi Suzuki (2000), Sound pressure distributions and propagation paths in the vocal tract with the pyriform fossa and the larynx, ICSLP
László Czap (2000), Lip representation by image ellipse, ICSLP
Rob J. J. H. van Son, Barbertje M. Streefkerk, Louis C. W. Pols (2000), An acoustic profile of speech efficiency, ICSLP
Helen M. Meng, W. K. Lo, Yuk Chi Li, P. C. Ching (2000), Multi-scale audio indexing for Chinese spoken document retrieval, ICSLP
Hagen Soltau, Alex Waibel (2000), Phone dependent modeling of hyperarticulated effects, ICSLP
Qing Guo, Yonghong Yan, Baosheng Yuan, Xiangdong Zhang, Ying Jia, Xiaoxing Liu (2000), Vocabulary-based acoustic model trim down and task adaptation, ICSLP
Willa S. Chen, Abeer Alwan (2000), Place of articulation cues for voiced and voiceless plosives and fricatives in syllable-initial position, ICSLP
Jingdong Chen, Kuldip K. Paliwal, Satoshi Nakamura (2000), A block cosine transform and its application in speech recognition, ICSLP
Jeih-Weih Hung, Hsin-Min Wang, Lin-Shan Lee (2000), Automatic metric-based speech segmentation for broadcast news via principal component analysis, ICSLP
Yuqing Gao, Yongxin Li, Michael Picheny (2000), Maximal rank likelihood as an optimization function for speech recognition, ICSLP
Yue Pan, Alex Waibel (2000), The effects of room acoustics on MFCC speech parameter, ICSLP
Mark Hasegawa-Johnson (2000), Time-frequency distribution of partial phonetic information measured using mutual information, ICSLP
Li Jiang, Xuedong Huang (2000), Subword-dependent speaker clustering for improved speech recognition, ICSLP
Chunhua Luo, Fang Zheng, Mingxing Xu (2000), An equivalent-class based MMI learning method for MGCPM, ICSLP
Alan A. Wrench, Korin Richmond (2000), Continuous speech recognition using articulatory data, ICSLP
Brian Mak, Yik-Cheung Tam (2000), Asynchrony with trained transition probabilities improves performance in multi-band speech recognition, ICSLP
Sunil Sivadas, Pratibha Jain, Hynek Hermansky (2000), Discriminative MLPs in HMM-based recognition of speech in cellular telephony, ICSLP
Toshiyuki Hanazawa, Jun Ishii, Yohei Okato, Kunio Nakajima (2000), Acoustic modeling for spontaneous speech recognition using syllable dependent models, ICSLP
Hui Jiang, Li Deng (2000), A robust training strategy against extraneous acoustic variations for spontaneous speech recognition, ICSLP
Darryl W. Purnell, Elizabeth C. Botha (2000), Improved performance and generalization of minimum classification error training for continuous speech recognition, ICSLP
Ying Jia, Yonghong Yan, Baosheng Yuan (2000), Dynamic threshold setting via Bayesian information criterion (BIC) in HMM training, ICSLP
Thomas Hain, Philip C. Woodland (2000), Modelling sub-phone insertions and deletions in continuous speech recognition, ICSLP
Carrson C. Fung, Oscar C. Au, Wanggen Wan, Chi H. Yim, Cyan L. Keung (2000), Improved acoustics modeling for speech recognition using transformation techniques, ICSLP
Liang Gu, Jayanth Nayak, Kenneth Rose (2000), Discriminative training of tied-mixture HMM by deterministic annealing, ICSLP
Hong-Kwang Jeff Kuo, Chin-Hui Lee (2000), Discriminative training in natural language call routing, ICSLP
Kazuyo Tanaka, Hiroaki Kojima (2000), A speech recognition method with a language-independent intermediate phonetic code, ICSLP
Fabrice Lefèvre (2000), Confidence measures based on the k-nn probability estimator, ICSLP
Niloy Mukherjee, Nitendra Rajput, L. Venkata Subramaniam, Ashish Verma (2000), On deriving a phoneme model for a new language, ICSLP
Tomonobu Saito, Kiyoshi Hashimoto (2000), Estimation of semantic case of Japanese dialogue by use of distance derived from statistics of dependency, ICSLP
Stephen Cox, Srinandan Dasmahapatra (2000), A semantically-based confidence measure for speech recognition, ICSLP
Aravind Ganapathiraju, Joseph Picone (2000), Support vector machines for automatic data cleanup, ICSLP
Yong Gu, Trevor Thomas (2000), Competition-based score analysis for utterance verification in name recognition, ICSLP
Yaxin Zhang (2000), Utterance verification/rejection for speaker-dependent and speaker-independent speech recognition, ICSLP
Valery A. Petrushin (2000), Emotion recognition in speech signal: experimental study, development, and application, ICSLP
Ren-yuan Lyu, Chi-yu Chen, Yuang-chin Chiang, Min-shung Liang (2000), A bi-lingual Mandarin/Taiwanese (Min-nan), large vocabulary, continuous speech recognition system based on the Tong-yong phonetic alphabet (TYPA), ICSLP
Ossama Emam, Jorge Gonzalez, Carsten Günther, Eric Janke, Siegfried Kunzmann, Giulio Maltese, Claire Waast-Richard (2000), A data-driven methodology for the production of multilingual conversational systems, ICSLP
Tzur Vaich, Arnon Cohen (2000), Multi-path, context dependent SC-HMM architectures for improved connected word recognition, ICSLP
Yoram Meron, Keikichi Hirose (2000), Robust recognition using multiple utterances, ICSLP
Piero Cosi, John-Paul Hosom, Fabio Tesser (2000), High performance Italian continuous "digit" recognition, ICSLP
Dominique Fohr, Odile Mella, Christophe Antoine (2000), The automatic speech recognition engine ESPERE: experiments on telephone speech, ICSLP
Imre Kiss (2000), A comparison of distributed and network speech recognition for mobile communication systems, ICSLP
Joe Frankel, Korin Richmond, Simon King, Paul Taylor (2000), An automatic speech recognition system using neural networks and linear dynamic models to recover and model articulatory traces, ICSLP
Khaldoun Shobaki, John-Paul Hosom, Ronald A. Cole (2000), The OGI kids' speech corpus and recognizers, ICSLP
Jian Wu, Fang Zheng (2000), Reducing time-synchronous beam search effort using stage based look-ahead and language model rank based pruning, ICSLP
Grace Chung (2000), A three-stage solution for flexible vocabulary speech understanding, ICSLP
Jon Barker, Martin Cooke, Daniel P. W. Ellis (2000), Decoding speech in the presence of other sound sources, ICSLP
Shi-Wook Lee, Keikichi Hirose, Nobuaki Minematsu (2000), Efficient search strategy in large vocabulary continuous speech recognition using prosodic boundary information, ICSLP
Ha-Jin Yu, Hoon Kim, Joon-Mo Hong, Min-Seong Kim, Jong-Seok Lee (2000), Large vocabulary Korean continuous speech recognition using a one-pass algorithm, ICSLP
Alexander Seward (2000), A tree-trellis n-best decoder for stochastic context-free grammars, ICSLP
Patrick Nguyen, Luca Rigazio, Jean-Claude Junqua (2000), EWAVES: an efficient decoding algorithm for lexical tree based speech recognition, ICSLP
Atsunori Ogawa, Yoshiaki Noda, Shoichi Matsunaga (2000), Novel two-pass search strategy using time-asynchronous shortest-first second-pass beam search, ICSLP
Yu-Chung Chan, Manhung Siu, Brian Mak (2000), Pruning of state-tying tree using Bayesian information criterion with multiple mixtures, ICSLP
Yuan-Fu Liao, Nick Wang, Max Huang, Hank Huang, Frank Seide (2000), Improvements of the Philips 2000 Taiwan Mandarin benchmark system, ICSLP
Christoph Neukirchen, Xavier Aubert, Hans Dolfing (2000), Extending the generation of word graphs for a cross-word m-gram decoder, ICSLP
Qingwei Zhao, Zhiwei Lin, Baosheng Yuan, Yonghong Yan (2000), Improvements in search algorithm for large vocabulary continuous speech recognition, ICSLP
Hua Yu, Takashi Tomokiyo, Zhirong Wang, Alex Waibel (2000), New developments in automatic meeting transcription, ICSLP
Jielin Pan, Baosheng Yuan, Yonghong Yan (2000), Effective vector quantization for a highly compact acoustic model for LVCSR, ICSLP
Hiroki Yamamoto, Toshiaki Fukada, Yasuhiro Komori (2000), Effective lexical tree search for large vocabulary continuous speech recognition, ICSLP
Chiori Hori, Sadaoki Furui (2000), Improvements in automatic speech summarization and evaluation methods, ICSLP
Shuangyu Chang, Lokendra Shastri, Steven Greenberg (2000), Automatic phonetic transcription of spontaneous speech (American English), ICSLP
Miroslav Novak, Michael Picheny (2000), Speed improvement of the tree-based time asynchronous search, ICSLP
Jing Huang, B. Kingsbury, L. Mangu, Mukund Padmanabhan, George Saon, Geoffrey Zweig (2000), Recent improvements in speech recognition performance on large vocabulary conversational speech (voicemail and switchboard), ICSLP
Lei He, Ditang Fang, Wenhu Wu (2000), Speaker normalization training and adaptation for speech recognition, ICSLP
Laura Mayfield Tomokiyo (2000), Lexical and acoustic modeling of non-native speech in LVCSR, ICSLP
Baojie Li, Keikichi Hirose, Nobuaki Minematsu (2000), Modeling phone correlation for speaker adaptive speech recognition, ICSLP
Henrik Botterweck (2000), Very fast adaptation for large vocabulary continuous speech recognition using eigenvoices, ICSLP
Chengyi Zheng, Yonghong Yan (2000), Efficiently using speaker adaptation data, ICSLP
Thilo Pfau, Robert Faltlhauser, Günther Ruske (2000), A combination of speaker normalization and speech rate normalization for automatic speech recognition, ICSLP
Tai-Hwei Hwang, Kuo-Hwei Yuo, Hsiao-Chuan Wang (2000), Speech model compensation with direct adaptation of cepstral variance to noisy environment, ICSLP
Ji Wu, Zuoying Wang (2000), Gaussian similarity analysis and its application in speaker adaptation, ICSLP
Nobuyasu Itoh, Masafumi Nishimura, Shinsuke Mori (2000), A method for style adaptation to spontaneous speech by using a semi-linear interpolation technique, ICSLP
Petra Geutner, Luis Arevalo, Joerg Breuninger (2000), VODIS - voice-operated driver information systems: a usability study on advanced speech technologies for car environments, ICSLP
Wu Chou, Qiru Zhou, Hong-Kwang Jeff Kuo, Antoine Saad, David Attwater, Peter Durston, Mark Farrell, Frank Scahill (2000), Natural language call steering for service applications, ICSLP
Jörg Hunsinger, Manfred Lang (2000), A single-stage top-down probabilistic approach towards understanding spoken and handwritten mathematical formulas, ICSLP
Prabhu Raghavan, Sunil K. Gupta (2000), Low complexity connected digit recognition for mobile applications, ICSLP
Jan Nouza (2000), Telephone speech recognition from large lists of Czech words, ICSLP
Duanpei Wu, X. Menendez-Pidal, L. Olorenshaw, R. Chen, M. Tanaka, M. Amador (2000), Speech and word detection algorithms for hands-free applications, ICSLP
Ashwin Rao, Bob Roth, Venkatesh Nagesha, Don McAllaster, Natalie Liberman, Larry Gillick (2000), Large vocabulary continuous speech recognition of read speech over cellular and landline networks, ICSLP
Seiichi Yamamoto (2000), Toward speech communications beyond language barrier - research of spoken language translation technologies at ATR -, ICSLP
Hervé Blanchon, Christian Boitet (2000), Speech translation for French within the C-STAR II consortium and future perspectives, ICSLP
Chengqing Zong, Yumi Wakita, Bo Xu, Zhenbiao Chen, Kenji Matsui (2000), Japanese-to-Chinese spoken language translation based on the simple expression, ICSLP
Srinivas Bangalore, Giuseppe Riccardi (2000), Finite-state models for lexical reordering in spoken language translation, ICSLP
Ralf Engel (2000), CHUNKY: an example based machine translation system for spoken dialogs, ICSLP
Gianni Lazzari (2000), Spoken translation: challenges and opportunities, ICSLP
Christian Boitet, Jean-Philippe Guilbaud (2000), Analysis into a formal task-oriented pivot without clear abstract - semantics is best handled as "usual" translation, ICSLP
Chengqing Zong, Taiyi Huang, Bo Xu (2000), An improved template-based approach to spoken language translation, ICSLP
Takao Watanabe, Akitoshi Okumura, Shinsuke Sakai, Kiyoshi Yamabana, Shinichi Doi, Ken Hanazawa (2000), An automatic interpretation system for travel conversation, ICSLP
Rainer Gruhn, Harald Singer, Hajime Tsukada, Masaki Naito, Atsushi Nishino, Atsushi Nakamura, Yoshinori Sagisaka, Satoshi Nakamura (2000), Cellular-phone based speech-to-speech translation system ATR-MATRIX, ICSLP
Nicole Beringer, Tsuyoshi Ito, Marcia Neff (2000), Generation of pronunciation rule sets for automatic segmentation of American English and Japanese, ICSLP
K. Samudravijaya, P. V. S. Rao, S. S. Agrawal (2000), Hindi speech database, ICSLP
Hsiao-Chuan Wang, Frank Seide, Chiu-Yu Tseng, Lin-Shan Lee (2000), MAT-2000 - design, collection, and validation of a Mandarin 2000-speaker telephone speech database, ICSLP
Kåre Sjölander, Jonas Beskow (2000), Wavesurfer - an open source speech tool, ICSLP
Nick Campbell, Toru Marumoto (2000), Automatic labelling of voice-quality in speech databases for synthesis, ICSLP
Joe Timoney, J. Brian Foley (2000), Speech quality evaluation based on AM-FM time-frequency representations, ICSLP
Tatsuya Kawahara, Akinobu Lee, Tetsunori Kobayashi, Kazuya Takeda, Nobuaki Minematsu, Shigeki Sagayama, Katsunobu Itou, Akinori Ito, Mikio Yamamoto, Atsushi Yamada, Takehito Utsuro, Kiyohiro Shikano (2000), Free software toolkit for Japanese large vocabulary continuous speech recognition, ICSLP
Qiang Huo, Bin Ma (2000), Robust speech recognition based on off-line elicitation of multiple priors and on-line adaptive prior fusion, ICSLP
William J.J. Roberts, Sadaoki Furui (2000), Robust speech recognition via modeling spectral coefficients with HMM's with complex Gaussian components, ICSLP
Mirjam Wester, Judith M. Kessens, Helmer Strik (2000), Pronunciation variation in ASR: which variation to model?, ICSLP
Xiaolong Mou, Victor Zue (2000), The use of dynamic reliability scoring in speech recognition, ICSLP
Javier Macías-Guarasa, Javier Ferreiros, Ruben San-Segundo, Juan Manuel Montero, Juan Manuel Pardo (2000), Acoustical and lexical based confidence measures for a very large vocabulary telephone speech hypothesis-verification system, ICSLP
Silke Goronzy, Krzysztof Marasek, Ralf Kompe, Andreas Haag (2000), Phone-duration-based confidence measures for embedded applications, ICSLP
Aravind Ganapathiraju, Jonathan Hamaker, Joseph Picone (2000), Hybrid SVM/HMM architectures for speech recognition, ICSLP
Koki Sasaki, Hui Jiang, Keikichi Hirose (2000), Rapid adaptation of n-gram language models using inter-word correlation for speech recognition, ICSLP
Gareth Moore, Steve Young (2000), Class-based language model adaptation using mixtures of word-class weights, ICSLP
Jiasong Sun, Xiaodong Cui, Zuoying Wang, Yang Liu (2000), A language model adaptation approach based on text classification, ICSLP
Grace Chung (2000), Automatically incorporating unknown words in JUPITER, ICSLP
Rathinavelu Chengalvarayan (2000), Look-ahead sequential feature vector normalization for noisy speech recognition, ICSLP
Naoto Iwahashi, Akihiko Kawasaki (2000), Speaker adaptation in noisy environments based on parameter estimation using uncertain data, ICSLP
Alex Acero, Steven Altschuler, Lani Wu (2000), Speech/noise separation using two microphones and a VQ model of speech signals, ICSLP
Michiel Bacchiani (2000), Using maximum likelihood linear regression for segment clustering and speaker identification, ICSLP
Tor André Myrvoll, Olivier Siohan, Chin-Hui Lee, Wu Chou (2000), Structural maximum a-posteriori linear regression for unsupervised speaker adaptation, ICSLP
Jen-Tzung Chien, Guo-Hong Liao (2000), Transformation-based Bayesian predictive classification for online environmental learning and robust speech recognition, ICSLP
Michael Pitz, Frank Wessel, Hermann Ney (2000), Improved MLLR speaker adaptation using confidence measures for conversational speech recognition, ICSLP
Rathinavelu Chengalvarayan (2000), Unified acoustic modeling for continuous speech recognition, ICSLP
Satya Dharanipragada, Mukund Padmanabhan (2000), A nonlinear unsupervised adaptation technique for speech recognition, ICSLP
Sam-Joo Doh, Richard M. Stern (2000), Using class weighting in inter-class MLLR, ICSLP
John-Paul Hosom, Ronald A. Cole (2000), Burst detection based on measurements of intensity discrimination, ICSLP
Javier Ferreiros López, Daniel P. W. Ellis (2000), Using acoustic condition clustering to improve acoustic change detection on broadcast news, ICSLP
Jon P. Nedel, Rita Singh, Richard M. Stern (2000), Phone transition acoustic modeling: application to speaker independent and spontaneous speech systems, ICSLP
Liqin Shen, Guokang Fu, Haixin Chai, Yong Qin (2000), The measurement of acoustic similarity and its applications, ICSLP
Sopae Yi, Hyung Soon Kim, One Good Lee (2000), Glottal parameters contributing to the perception of loud voices, ICSLP
Christoph Schillo, Gernot A. Fink, Franz Kummert (2000), Grapheme based speech recognition for large vocabularies, ICSLP
Jon P. Nedel, Rita Singh, Richard M. Stern (2000), Automatic subword unit refinement for spontaneous speech recognition via phone splitting, ICSLP
Takeshi Tarui (2000), Rhythm timing in Japanese English, ICSLP
Mamoru Iwaki (2000), A vocal tract area ratio estimation from spectral parameter extracted by STRAIGHT, ICSLP
Bhuvana Ramabhadran, Yuqing Gao (2000), Decision tree based rate of speech modeling for speech recognition, ICSLP
Mukund Padmanabhan (2000), Spectral peak tracking and its use in speech recognition, ICSLP
Yongxin Li, Yuqing Gao, Hakan Erdogan (2000), Weighted pairwise scatter to improve linear discriminant analysis, ICSLP
Jindrich Matousek, Josef Psutka (2000), ARTIC: a new Czech text-to-speech system using statistical approach to speech segment database construction, ICSLP
Wu Chou, Olivier Siohan, Tor André Myrvoll, Chin-Hui Lee (2000), Extended maximum a posterior linear regression (EMAPLR) model adaptation for speech recognition, ICSLP
Ekkarit Maneenoi, Somchai Jitapunkul, Visarut Ahkuputra, Umavasee Thathong, Boonchai Thampanitchawong, Sudaporn Luksaneeyanawin (2000), Thai monophthong recognition using continuous density hidden Markov model and LPC cepstral coefficients, ICSLP
Chung-Hsien Wu, Yeou-Jiunn Chen, Cher-Yao Yang (2000), Error recovery and sentence verification using statistical partial pattern tree for conversational speech, ICSLP
Andrew Wilson Howitt (2000), Vowel landmark detection, ICSLP
Carsten Meyer, Georg Rose (2000), Rival training: efficient use of data in discriminative training, ICSLP
Marilyn Y. Chen (2000), Nasal detection module for a knowledge-based speech recognition system, ICSLP
Jun Liu, Xiaoyan Zhu, Bin Jia (2000), Semi-continuous segmental probability model for speech signals, ICSLP
Ea-Ee Jan, Jaime Botella Ordinas (2000), Cross-domain robust acoustic training, ICSLP
Fan Wang, Fang Zheng, Wenhu Wu (2000), A c/v segmentation method for Mandarin speech based on multiscale fractal dimension, ICSLP
Xiaoxia Chen, Aijun Li, Guohua Sun, Wu Hua, Zhigang Yu (2000), An application of SAMPA-C for standard Chinese, ICSLP
Wenkai Lu, Xuegong Zhang, Yanda Li, Shen Liqin, Zhu Weibin (2000), Joint speech signal enhancement based on spectral subtraction and SVD filter, ICSLP
Sacha Krstulovic, Frédéric Bimbot (2000), Inverse lattice filtering of speech with adapted non-uniform delays, ICSLP
Hideki Kawahara, Yoshinori Atake, Parham Zolfaghari (2000), Accurate vocal event detection method based on a fixed-point analysis of mapping from time to weighted average group delay, ICSLP
Jun Huang, Mukund Padmanabhan (2000), Filterbank-based feature extraction for speech recognition and its application to voice mail transcription, ICSLP
Peter J. Murphy (2000), A cepstrum-based harmonics-to-noise ratio in voice signals, ICSLP
Xuejing Sun (2000), A pitch determination algorithm based on subharmonic-to-harmonic ratio, ICSLP
Jordi Solé i Casals, Enric Monte i Moreno, Christian Jutten, Anisse Taleb (2000), Source separation techniques applied to speech linear prediction, ICSLP
Masahide Sugiyama (2000), Model based voice decomposition method, ICSLP
Keiichi Funaki (2000), A time-varying complex speech analysis based on IV method, ICSLP
Parham Zolfaghari, Hideki Kawahara (2000), A sinusoidal model based on frequency-to-instantaneous frequency mapping, ICSLP
Omar Farooq, Sekharjit Datta (2000), Dynamic feature extraction by wavelet analysis, ICSLP
Montri Karnjanadecha, Stephen A. Zahorian (2000), An investigation of variable block length methods for calculation of spectral/temporal features for automatic speech recognition, ICSLP
Akira Sasou, Kazuyo Tanaka (2000), Glottal excitation modeling using HMM with application to robust analysis of speech signal, ICSLP
Laura Docío-Fernández, Carmen García-Mateo (2000), Automatic segmentation of speech based on hidden Markov models and acoustic features, ICSLP
Akira Kurematsu, Youichi Akegami, Susanne Burge, Susanne Jekat, Brigitte Lause, Victoria L. Maclaren, Daniela Oppermann, Tanja Schultz (2000), VERBMOBIL dialogues: multifaced analysis, ICSLP
Jin-Jie Zhang, Zhi-Gang Cao, Zheng-Xin Ma (2000), A computation-efficient parameter adaptation algorithm for the generalized spectral subtraction method, ICSLP
Masahiro Araki, Kiyoshi Ueda, Takuya Nishimoto, Yasuhisa Niimi (2000), A semantic tagging tool for spoken dialogue corpus, ICSLP
Aijun Li, Xiaoxia Chen, Guohua Sun, Wu Hua, Zhigang Yin, Yiqing Zu, Fang Zheng, Zhanjiang Song (2000), The phonetic labeling on read and spontaneous discourse corpora, ICSLP
Nicole Beringer, Florian Schiel (2000), The quality of multilingual automatic segmentation using German MAUS, ICSLP
Vlasta Radová, Josef Psutka (2000), UWB_S01 corpus - a Czech read-speech corpus, ICSLP
Giuseppe Di Fabbrizio, Shrikanth Narayanan (2000), Web-based monitoring, logging and reporting tools for multi-service multi-modal systems, ICSLP
Helmer Strik, Catia Cucchiarini, Judith M. Kessens (2000), Comparing the recognition performance of CSRs: in search of an adequate metric and statistical significance test, ICSLP
Alexander Raake (2000), Perceptual dimensions of speech sound quality in modern transmission systems, ICSLP
Michael I. Mandel, Daniel P. W. Ellis (2006), A probability model for interaural phase difference, SAPA
Guoping Li, Mark E. Lutman (2006), Sparseness and speech perception in noise, SAPA
Tomonori Izumitani, Kunio Kashino (2006), Frequency component restoration for music sounds using local probabilistic models with maximum entropy learning, SAPA
Hiroko Terasawa, Malcolm Slaney, Jonathan Berger (2006), A statistical model of timbre perception, SAPA
Steven Rennie, Peder Olsen, John Hershey, Trausti Kristjansson (2006), The Iroquois model: using temporal dynamics to separate speakers, SAPA
Ron J. Weiss, Daniel P. W. Ellis (2006), Estimating single-channel source separation masks: relevance vector machine classifiers vs. pitch-based masking, SAPA
Björn Schölling, Martin Heckmann, Frank Joublin, Christian Goerick (2006), Structuring time domain blind source separation algorithms for CASA integration, SAPA
Shun'ichi Yamamoto, Ryu Takeda, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean-Marc Valin, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno (2006), Leak energy based missing feature mask generation for ICA and GSS and its evaluation with simultaneous speech recognition, SAPA
Sourabh Ravindran, David V. Anderson, Malcolm Slaney (2006), Improving the noise-robustness of mel-frequency cepstral coefficients for speech processing, SAPA
Yoshitaka Nishimura, Mikio Nakano, Kazuhiro Nakadai, Hiroshi Tsujino, Mitsuru Ishizuka (2006), Speech recognition for a robot under its motor noises by selective application of missing feature theory and MLLR, SAPA
Jerome R. Bellegarda (2006), LSM-based feature extraction for concatenative speech synthesis, SAPA
Kentaro Ishizuka, Tomohiro Nakatani (2006), Study of noise robust voice activity detection based on periodic component to aperiodic component ratio, SAPA
Gaurav Bhatt, Akshita Gupta, Aditya Arora, Balasubramanian Raman (2018), Acoustic features fusion using attentive multi-channel deep architecture, CHiME
Christoph Boeddeker, Jens Heitkaemper, Joerg Schmalenstroeer, Lukas Drude, Jahn Heymann, Reinhold Haeb-Umbach (2018), Front-end processing for the CHiME-5 dinner party scenario, CHiME
Rama Doddipatla, Takehiko Kagoshima, Cong-Thanh Do, Petko Petkov, Catalin-Tudor Zorila, Euihyun Kim, Daichi Hayakawa, Hiroshi Fujimura, Yannis Stylianou (2018), The Toshiba entry to the CHiME 2018 Challenge, CHiME
Jun Du, Tian Gao, Lei Sun, Feng Ma, Yi Fang, Di-Yuan Liu, Qiang Zhang, Xiang Zhang, Hai-Kun Wang, Jia Pan, Jian-Qing Gao, Chin-Hui Lee, Jing-Dong Chen (2018), The USTC-iFlytek systems for CHiME-5 Challenge, CHiME
Sonal Joshi, Ashish Panda, Meet Soni, Rupayan Chakraborty, Sunilkumar Kopparapu, Nikhil Mohanan, Premanand Nayak, Rajbabu Velmurugan, Preeti Rao (2018), CHiME 2018 Workshop: Enhancing beamformed audio using time delay neural network denoising autoencoder, CHiME
Naoyuki Kanda, Rintaro Ikeshita, Shota Horiguchi, Yusuke Fujita, Kenji Nagamatsu, Xiaofei Wang, Vimal Manohar, Nelson Enrique Yalta Soplin, Matthew Maciejewski, Szu-Jui Chen, Aswin Shanmugam Subramanian, Ruizhi Li, Zhiqi Wang, Jason Naradowsky, L. Paola Garcia-Perera, Gregory Sell (2018), The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays, CHiME
Gil Keren, Jing Han, Björn Schuller (2018), Scaling speech enhancement in unseen environments with noise embeddings, CHiME
Siddharth Dalmia, Suyoun Kim, Florian Metze (2018), Situation informed end-to-end ASR for noisy environments, CHiME
Markus Kitza, Wilfried Michel, Christoph Boeddeker, Jens Heitkaemper, Tobias Menne, Ralf Schlüter, Hermann Ney, Joerg Schmalenstroeer, Lukas Drude, Jahn Heymann, Reinhold Haeb-Umbach (2018), The RWTH/UPB system combination for the CHiME 2018 Workshop, CHiME
Chenxing Li, Tieqiang Wang (2018), The ZTSpeech system for CHiME-5 Challenge: A far-field speech recognition system with front-end and robust back-end, CHiME
Yanhua Long, Renke He (2018), The SHNU system for the CHiME-5 Challenge, CHiME
Ivan Medennikov, Ivan Sorokin, Aleksei Romanenko, Dmitry Popov, Yuri Khokhlov, Tatiana Prisyach, Nikolay Malkovskiy, Vladimir Bataev, Sergei Astapov, Maxim Korenevsky, Alexander Zatvornitskiy (2018), The STC System for the CHiME 2018 Challenge, CHiME
Alim Misbullah (2018), Robust network structures for acoustic model on CHiME5 Challenge dataset, CHiME
Nikhil Mohanan, Premanand Nayak, Rajbabu Velmurugan, Preeti Rao, Sonal Joshi, Ashish Panda, Meet Soni, Rupayan Chakraborty, Sunilkumar Kopparapu (2018), NMF based front-end processing in multi-channel distant speech recognition, CHiME
Ankur Patil, Maddala V. Siva Krishna, Mehak Piplani, Pulikonda Aditya Sai, Hardik B. Sailor, Hemant A. Patil (2018), DA-IICT/IIITV system for the 5th CHiME 2018 Challenge, CHiME
Dan Qu, Cheng-Ran Liu, Xu-Kiu Yang, Wen-lin Zhang (2018), The NDSC transcription system for the 2018 CHiME-5 Challenge, CHiME
Sining Sun, Yangyang Shi, Ching-Feng Yeh, Suliang Bu, Mei-Yuh Hwang, Lei Xie (2018), Multiple beamformers with ROVER for the CHiME-5 Challenge, CHiME
Hannes Unterholzner, Lukas Pfeifenberger, Franz Pernkopf, Marco Matassoni, Alessio Brutti, Daniele Falavigna (2018), Channel-selection for distant-speech recognition on CHiME-5 dataset, CHiME
Feifei Xiong, Jisi Zhang, Bernd Meyer, Heidi Christensen, Jon Barker (2018), Channel selection using neural network posterior probability for speech recognition with distributed microphone arrays in everyday environments, CHiME
Zhiwei Zhao, Jian Wu, Lei Xie (2018), The NWPU System for CHiME-5 Challenge, CHiME
John H L Hansen (2018), Robust speaker diarization and recognition in naturalistic data streams: Challenges for multi-speaker tasks & learning spaces, CHiME
Florian Metze (2018), Open-domain audiovisual speech recognition and video summarization, CHiME
Sam Davies (2013), Automatic speech recognition in the BBC, SLAM
Hervé Bourlard, Marc Ferràs, Nikolaos Pappas, Andrei Popescu-Belis, Steve Renals, Fergus McInnes, Peter J. Bell, Sandy Ingram, Mael Guillemot (2013), Processing and linking audio events in large multimedia archives: the EU inEvent project, SLAM
Benjamin Elizalde, Mirco Ravanelli, Gerald Friedland (2013), Audio concept ranking for video event detection on user-generated content, SLAM
Diego Castán, Murat Akbacak (2013), Segmental-GMM approach based on acoustic concept segmentation, SLAM
Diego Castán, Alfonso Ortega, Antonio Miguel, Eduardo Lleida (2013), Broadcast news segmentation with factor analysis system, SLAM
Pierre Lanchantin, Peter J. Bell, Mark J. F. Gales, Thomas Hain, Xunying Liu, Yanhua Long, Jennifer Quinnell, Steve Renals, Oscar Saz, Matt S. Seigel, Pawel Swietojanski, Phil C. Woodland (2013), Automatic transcription of multi-genre media archives, SLAM
Christian Mohr, Christian Saam, Kevin Kilgour, Jonas Gehring, Sebastian Stüker, Alex Waibel (2013), Slightly supervised adaptation of acoustic models on captioned BBC weather forecasts, SLAM
Stefan Ziegler, Guillaume Gravier (2013), A framework for integrating heterogeneous sporadic knowledge sources into automatic speech recognition, SLAM
Olivier Galibert, Juliette Kahn (2013), The first official REPERE evaluation, SLAM
Hervé Bredin, Johann Poignant, Guillaume Fortier, Makarand Tapaswi, Viet-Bac Le, Anindya Roy, Claude Barras, Sophie Rosset, Achintya Sarkar, Qian Yang, Hua Gao, Alexis Mignon, Jakob Verbeek, Laurent Besacier, Georges Quénot, Hazim Kemal Ekenel, Rainer Stiefelhagen (2013), QCompere @ REPERE 2013, SLAM
Benoit Favre, Géraldine Damnati, Frederic Bechet, Meriem Bendris, Delphine Charlet, Rémi Auguste, Stéphane Ayache, Benjamin Bigot, Alexandre Deltei, Richard Dufour, Corinne Fredouille, Georges Linarès, Jean Martinet, Gregory Senay, Pierre Tirilly (2013), PERCOLI: a person identification system for the 2013 REPERE challenge, SLAM
Mohamed Hatmi, Christine Jacquin, Emmanuel Morin, Sylvain Meignier (2013), Named entity recognition in speech transcripts following an extended taxonomy, SLAM
Benjamin Bigot, Corinne Fredouille, Delphine Charlet (2013), Speaker role recognition on TV broadcast documents, SLAM
Houman Ghaemmaghami, David Dean, Sridha Sridharan (2013), Speaker attribution of Australian broadcast news data, SLAM
Carole Lailler, Grégor Dupuy, Mickael Rouvier, Sylvain Meignier (2013), Semi-supervised and unsupervised data extraction targeting speakers: from speaker roles to fame?, SLAM
Johann Poignant, Hervé Bredin, Laurent Besacier, Georges Quénot, Claude Barras (2013), Towards a better integration of written names for unsupervised speakers identification in videos, SLAM
Abhigyan Singh, Martha Larson (2013), Narrative-driven multimedia tagging and retrieval: investigating design and practice for speech-based mobile applications, SLAM
Larry Heck, Dilek Hakkani-Tür, Madhu Chinthakunta, Gokhan Tur, Rukmini Iyer, Partha Parthasarathy, Lisa Stifelman, Elizabeth Shriberg, Ashley Fidler (2013), Multi-modal conversational search and browse, SLAM
Korbinian Riedhammer, Martin Gropp, Tobias Bocklet, Florian Hönig, Elmar Nöth, Stefan Steidl (2013), LMELECTURES: a multimedia corpus of academic spoken English, SLAM
Eiríkur Rögnvaldsson (2008), Icelandic Language Technology Ten Years Later, SALTMIL
Heather Simpson, Christopher Cieri, Kazuaki Maeda, Kathryn Baker, Boyan Onyshkevych (2008), Human Language Technology Resources for Less Commonly Taught Languages: Lessons Learned Toward Creation of Basic Language Resources, SALTMIL
Karel Pala, Sonja Bosch, Christiane Fellbaum (2008), Building resources for African languages, SALTMIL
Francis M. Tyers, Jacques A. Pienaar (2008), Extracting bilingual word pairs from Wikipedia, SALTMIL
Iker Luengo, Eva Navas, Iñaki Sainz, Ibon Saratxaga, Jon Sanchez, Igor Odriozola, Juan J. Igarza, Inma Hernaez (2008), Building a Basque/Spanish bilingual database for speaker verification, SALTMIL
Attila Novák (2008), Language resources for Uralic minority languages, SALTMIL
Xulio Viejo, Roser Saurí, Ángel Neira (2008), Eslema. Towards a Corpus for Asturian, SALTMIL
Hemlata Tak, Jee-weon Jung, Jose Patino, Madhu Kamble, Massimiliano Todisco, Nicholas Evans (2021), End-to-end spectro-temporal graph attention networks for speaker verification anti-spoofing and speech deepfake detection, ASVSPOOF
Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi (2021), Multi-task Learning in Utterance-level and Segmental-level Spoof Detection, ASVSPOOF
Xingming Wang, Xiaoyi Qin, Tinglong Zhu, Chao Wang, Shilei Zhang, Ming Li (2021), The DKU-CMRI System for the ASVspoof 2021 Challenge: Vocoder based Replay Channel Response Estimation, ASVSPOOF
Wanying Ge, Jose Patino, Massimiliano Todisco, Nicholas Evans (2021), Raw Differentiable Architecture Search for Speech Deepfake and Spoofing Detection, ASVSPOOF
Rohan Kumar Das (2021), Known-unknown Data Augmentation Strategies for Detection of Logical Access, Physical Access and Speech Deepfake Attacks: ASVspoof 2021, ASVSPOOF
Sunghyun Yoon, Ha-Jin Yu (2021), Multiple-Point Input and Time-Inverted Speech Signal for The ASVspoof 2021 Challenge, ASVSPOOF
Yuan Lei, Xiao Huo, Yuzong Jiao, Yiu Kei Li (2021), Deep Metric Learning for Replay Attack Detection, ASVSPOOF
Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Héctor Delgado (2021), ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection, ASVSPOOF
Nicolas Müller, Franziska Dieckmann, Pavel Czempin, Roman Canals, Konstantin Böttinger, Jennifer Williams (2021), Speech is Silver, Silence is Golden: What do ASVspoof-trained Models Really Learn?, ASVSPOOF
Anton Tomilov, Aleksei Svishchev, Marina Volkova, Artem Chirkovskiy, Alexander Kondratev, Galina Lavrentyeva (2021), STC Antispoofing Systems for the ASVspoof2021 Challenge, ASVSPOOF
Joaquín Cáceres, Roberto Font, Teresa Grau, Javier Molina (2021), The Biometric Vox System for the ASVspoof 2021 Challenge, ASVSPOOF
Xinhui Chen, You Zhang, Ge Zhu, Zhiyao Duan (2021), UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021, ASVSPOOF
Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan (2021), Investigation on activation functions for robust end-to-end spoofing attack detection system, ASVSPOOF
Tianxiang Chen, Elie Khoury, Kedar Phatak, Ganesh Sivaraman (2021), Pindrop Labs' Submission to the ASVspoof 2021 Challenge, ASVSPOOF
Zhor Benhafid, Sid Ahmed Selouani, Mohammed Sidi Yakoub, Abderrahmane Amrouche (2021), LARIHS ASSERT Reassessment for Logical Access ASVspoof 2021 Challenge, ASVSPOOF
Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan (2021), CRIM's System Description for the ASVSpoof2021 Challenge, ASVSPOOF
Bernd J. Kröger (2007), Perspectives for articulatory speech synthesis, SSW
Oxana Govokhina, Gérard Bailly, Gaspard Breton (2007), Learning optimal audiovisual phasing for an HMM-based control model for facial animation, SSW
Peter Birkholz, Ingmar Steiner, Stefan Breuer (2007), Control concepts for articulatory speech synthesis, SSW
Alexander B. Kain, Qi Miao, Jan P. H. van Santen (2007), Spectral control in concatenative speech synthesis, SSW
Barry Kirkpatrick, Darragh O'Brien, Ronán Scaife (2007), Feature transformation applied to the detection of discontinuities in concatenated speech, SSW
Nick Campbell (2007), Towards conversational speech synthesis; lessons learned from the expressive speech processing project, SSW
Shinsuke Sakai, Jinfu Ni, Ranniery Maia, Keiichi Tokuda, Minoru Tsuzaki, Tomoki Toda, Hisashi Kawai, Satoshi Nakamura (2007), Communicative speech synthesis with XIMERA: a first step, SSW
Raul Fernandez, Bhuvana Ramabhadran (2007), Automatic exploration of corpus-specific properties for expressive text-to-speech: a case study in emphasis, SSW
Charlotte Wollermann, Eva Lasarcyk (2007), Modeling and perceiving of (un-)certainty in articulatory speech synthesis, SSW
Lijuan Wang, Min Chu, Yaya Peng, Yong Zhao, Frank K. Soong (2007), Perceptual annotation of expressive speech, SSW
Karl Schnell, Arild Lacroix (2007), Joint analysis of speech frames for synthesis based on lossy tube models, SSW
Connie R. Adsett, Yannick Marchand (2007), Are rule-based syllabification methods adequate for languages with low syllabic complexity? the case of Italian, SSW
Mark Huckvale, Kayoko Yanagisawa (2007), Spoken language conversion with accent morphing, SSW
Grazyna Demenko, Agnieszka Wagner, Matthias Jilka, Bernd Möbius (2007), Comparative investigation of peak alignment in Polish and German unit selection corpora, SSW
Katarzyna Klessa, Marcin Szymanski, Stefan Breuer, Grazyna Demenko (2007), Optimization of Polish segmental duration prediction with CART, SSW
Toshio Hirai, Junichi Yamagishi, Seiichi Tenpaku (2007), Utilization of an HMM-based feature generation module in 5 ms segment concatenative speech synthesis, SSW
Damien Lolive, Nelly Barbot, Olivier Boeffard (2007), Clustering algorithm for F0 curves based on hidden Markov models, SSW
Rohit Kumar, Rashmi Gangadharaiah, Sharath Rao, Kishore Prahallad, Carolyn P. Rosé, Alan W. Black (2007), Building a better Indian English voice using "more data", SSW
Marc Schröder, Anna Hunecke (2007), Creating German unit selection voices for the MARY TTS platform from the BITS corpora, SSW
Kumi Ohta, Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano (2007), Regression approaches to voice quality control based on one-to-many eigenvoice conversion, SSW
Daisuke Tani, Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano (2007), An evaluation of many-to-one voice conversion algorithms with pre-stored speaker data sets, SSW
Joao P. Cabral, Steve Renals, Korin Richmond, Junichi Yamagishi (2007), Towards an improved modeling of the glottal source in statistical parametric speech synthesis, SSW
Larbi Mesbahi, Vincent Barreaud, Olivier Boeffard (2007), GMM-based speech transformation systems under data reduction, SSW
Junichi Yamagishi, Takao Kobayashi, Steve Renals, Simon King, Heiga Zen, Tomoki Toda, Keiichi Tokuda (2007), Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV, SSW
Ranniery Maia, Tomoki Toda, Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda (2007), An excitation model for HMM-based speech synthesis based on residual modeling, SSW
Hui Liang, Yao Qian, Frank K. Soong (2007), An HMM-based bilingual (Mandarin-English) TTS, SSW
Justus C. Roux, Albert S. Visagie (2007), Data-driven approach to rapid prototyping Xhosa speech synthesis, SSW
Nobuaki Minematsu, Ryo Kuroiwa, Keikichi Hirose, Michiko Watanabe (2007), CRF-based statistical learning of Japanese accent sandhi for developing Japanese text-to-speech synthesis systems, SSW
Qinghua Sun, Keikichi Hirose, Nobuaki Minematsu (2007), Two-step generation of Mandarin F0 contours based on tone nucleus and superpositional models, SSW
Suphattharachai Chomphan, Takao Kobayashi (2007), Design of tree-based context clustering for an HMM-based Thai speech synthesis system, SSW
Arne Bachmann, Stefan Breuer (2007), Development of a BOSS unit selection module for tone languages, SSW
Alexander B. Kain, Jan P. H. van Santen (2007), Unit-selection text-to-speech synthesis using an asynchronous interpolation model, SSW
Ingo Hertrich, Hermann Ackermann (2007), Modelling voiceless speech segments by means of an additive procedure based on the computation of formant sinusoids, SSW
Arthur R. Toth, Alan W. Black (2007), Using articulatory position data in voice transformation, SSW
Anand Arokia Raj, Tanuja Sarkar, Satish Chandra Pammi, Santhosh Yuvaraj, Mohit Bansal, Kishore Prahallad, Alan W. Black (2007), Text processing for text-to-speech systems in Indian languages, SSW
Daniel Erro, Asunción Moreno, Antonio Bonafonte (2007), Flexible harmonic/stochastic speech synthesis, SSW
Jan Romportl, Jirí Kala (2007), Prosody modelling in Czech text-to-speech synthesis, SSW
Yong Zhao, Chengsuo Zhang, Frank K. Soong, Min Chu, Xi Xiao (2007), Measuring attribute dissimilarity with HMM KL-divergence for speech synthesis, SSW
Jonathan Chevelu, Nelly Barbot, Olivier Boeffard, Arnaud Delhay (2007), Lagrangian relaxation for optimal corpus design, SSW
Aleksandra Krul, Géraldine Damnati, François Yvon, Cédric Boidin, Thierry Moudenc (2007), Adaptive database reduction for domain specific speech synthesis, SSW
Jordi Adell, Antonio Bonafonte, David Escudero (2007), Statistical analysis of filled pauses' rhythm for disfluent speech synthesis, SSW
Wentao Gu, Tan Lee (2007), Quantitative analysis of F0 contours of emotional speech of Mandarin, SSW
Slava Shechtman (2007), Maximum-likelihood dynamic intonation model for concatenative text-to-speech system, SSW
Uwe D. Reichel (2007), Data-driven extraction of intonation contour classes, SSW
Taniya Mishra, Emily Tucker Prud'hommeaux, Jan P. H. van Santen (2007), Word accentuation prediction using a neural net classifier, SSW
Leonardo Badino, Robert A. J. Clark (2007), Issues of optionality in pitch accent placement, SSW
Matthew P. Aylett, Simon King (2007), Single speaker segmentation and inventory selection using dynamic time warping self organization and joint multigram mapping, SSW
Tanya Lambert, Norbert Braunschweiler, Sabine Buchholz (2007), How (not) to select your voice corpus: random selection vs. phonologically balanced, SSW
Lukas Latacz, Yuk On Kong, Werner Verhelst (2007), Unit selection synthesis using long non-uniform units and phonemic identity matching, SSW
Martin Gruber, Daniel Tihelka, Jindrich Matousek (2007), Evaluation of various unit types in the unit selection approach for the Czech language using the Festival system, SSW
Alan W. Black (2007), The Blizzard Challenge: evaluating corpus-based speech synthesis techniques, SSW
Donata Moers, Petra Wagner, Stefan Breuer (2007), Assessing the adequate treatment of fast speech in unit selection speech synthesis systems for the visually impaired, SSW
Maria Wolters, Pauline Campbell, Christine DePlacido, Amy Liddell, David Owens (2007), Making speech synthesis more accessible to older people, SSW
Heiga Zen, Takashi Nose, Junichi Yamagishi, Shinji Sako, Takashi Masuko, Alan W. Black, Keiichi Tokuda (2007), The HMM-based speech synthesis system (HTS) version 2.0, SSW
Christian Weiss, Luis C. Oliveira, Sergio Paulo, Carlos Mendes, Luis Figueira, Marco Vala, Pedro Sequeira, Ana Paiva, Thurid Vogt, Elisabeth Andre (2007), eCIRCUS: building voices for autonomous speaking agents, SSW
Martin Barbisch, Grzegorz Dogil, Bernd Möbius, Bettina Säuberlich, Antje Schweitzer (2007), Unit selection synthesis in the Smartweb project, SSW
Hanna Silen, Elina Helander, Konsta Koppinen, Moncef Gabbouj (2007), Building a Finnish unit selection TTS system, SSW
Yannick Marchand, Connie R. Adsett, Robert I. Damper (2007), Evaluating automatic syllabification algorithms for English, SSW
John Kominek, Tanja Schultz, Alan W. Black (2007), Voice building from insufficient data - classroom experiences with web-based language development tools, SSW
Peter Cahill, Jan Macek, Julie Carson-Berndsen (2007), SVM based feature extraction in speech synthesis, SSW
Yoshihiko Nankaku, Kenichi Nakamura, Tomoki Toda, Keiichi Tokuda (2007), Spectral conversion based on statistical models including time-sequence matching, SSW
Esther Klabbers, Taniya Mishra, Jan P. H. van Santen (2007), Analysis of affective speech recordings using the superpositional intonation model, SSW
Sylvain Le Beux, Albert Rilliard, Christophe d'Alessandro (2007), Calliphony: a real-time intonation controller for expressive speech synthesis, SSW
Shyamal Kumar Das Mandal, Asoke Kumar Datta (2007), Epoch synchronous non-overlap-add (ESNOLA) method-based concatenative speech synthesis system for Bangla, SSW
Chatchawarn Hansakunbuntheung, Hiroaki Kato, Yoshinori Sagisaka (2007), Syllable-based Thai duration model using multi-level linear regression and syllable accommodation, SSW
Xavier Gonzalvo, Joan Claudi Socoró, Ignasi Iriondo, Carlos Monzo, Elisa Martínez (2007), Linguistic and mixed excitation improvements on a HMM-based speech synthesis for Castilian Spanish, SSW
Tetyana Lyudovyk, Valentyna Robeiko (2007), Inventory of intonation contours for text-to-speech synthesis, SSW
H. Timothy Bunnell, Jason Lilley (2007), Analysis methods for assessing TTS intelligibility, SSW
Brian Langner, Alan W. Black (2007), Understandable production of massive synthesis, SSW
Charlotte van Hooijdonk, Edwin Commandeur, Reinier Cozijn, Emiel Krahmer, Erwin Marsi (2007), The online evaluation of speech synthesis using eye movements, SSW
Lian Zheng, Jianhua Tao, Zhengqi Wen, Rongxiu Zhong (2020), CASIA Voice Conversion System for the Voice Conversion Challenge 2020, VCCBC
Zhiba Su, Wendi He, Yang Sun (2020), The Ximalaya TTS System for Blizzard Challenge 2020, VCCBC
Yi Zhou, Xiaohai Tian, Xuehao Zhou, Mingyang Zhang, Grandee Lee, Rui Liu, Berrak Sisman, Haizhou Li (2020), NUS-HLT System for Blizzard Challenge 2020, VCCBC
Jian Lu, Zeru Lu, Ting He, Peng Zhang, Xinhui Hu, Xinkang Xu (2020), The RoyalFlush Synthesis System for Blizzard Challenge 2020, VCCBC
Laipeng He, Qiang Shi, Lang Wu, Jianqing Sun, Renke He, Yanhua Long, Jiaen Liang (2020), The SHNU System for Blizzard Challenge 2020, VCCBC
Qiao Tian, Zewang Zhang, Ling-Hui Chen, Heng Lu, Chengzhu Yu, Chao Weng, Dong Yu (2020), The Tencent speech synthesis system for Blizzard Challenge 2020, VCCBC
Jing-Xuan Zhang, Li-Juan Liu, Yan-Nian Chen, Ya-Jun Hu, Yuan Jiang, Zhen-Hua Ling, Li-Rong Dai (2020), Voice Conversion by Cascading Automatic Speech Recognition and Text-to-Speech Synthesis with Prosody Transfer, VCCBC
Wen-Chin Huang, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda (2020), The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS, VCCBC
Yang Song, Min Liang, Guilin Yang, Kun Xie, Jie Hao (2020), The OPPO System for the Blizzard Challenge 2020, VCCBC
Zhao Yi, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhen-Hua Ling, Tomoki Toda (2020), Voice Conversion Challenge 2020 - Intra-lingual semi-parallel and cross-lingual voice conversion, VCCBC
Huhao Fu, Yiben Zhang, Kailong Liu, Chao Liu (2020), The HITSZ TTS system for Blizzard Challenge 2020, VCCBC
Zexin Cai, Ming Li (2020), The Duke Entry for 2020 Blizzard Challenge, VCCBC
Li-Juan Liu, Yan-Nian Chen, Jing-Xuan Zhang, Yuan Jiang, Ya-Jun Hu, Zhen-Hua Ling, Li-Rong Dai (2020), Non-Parallel Voice Conversion with Autoregressive Conversion Model and Duration Adjustment, VCCBC
Xiao Zhou, Zhen-Hua Ling, Simon King (2020), The Blizzard Challenge 2020, VCCBC
Yitao Yang, Jinghui Zhong, Shehui Bu (2020), Submission from SCUT for Blizzard Challenge 2020, VCCBC
Tuan Vu Ho, Masato Akagi (2020), Non-parallel Voice Conversion based on Hierarchical Latent Embedding Vector Quantized Variational Autoencoder, VCCBC
Tao Wang, Jianhua Tao, Ruibo Fu, Zhengqi Wen, Chunyu Qiang (2020), The NLPR Speech Synthesis entry for Blizzard Challenge 2020, VCCBC
Beibei Hu, Zilong Bai, Qiang Li (2020), The Ajmide Text-To-Speech System for Blizzard Challenge 2020, VCCBC
Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Toda (2020), Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN, VCCBC
Fanbo Meng, Ruimin Wang, Peng Fang, Shuangyuan Zou, Wenjun Duan, Ming Zhou, Kai Liu, Wei Chen (2020), The Sogou System for Blizzard Challenge 2020, VCCBC
Haitong Zhang (2020), The NeteaseGames System for Voice Conversion Challenge 2020 with Vector-quantization Variational Autoencoder and WaveNet, VCCBC
Oriol Barbany, Milos Cernak (2020), FastVC: Fast Voice Conversion with non-parallel data, VCCBC
Xiaohai Tian, Zhichao Wang, Shan Yang, Xinyong Zhou, Hongqiang Du, Yi Zhou, Mingyang Zhang, Kun Zhou, Berrak Sisman, Lei Xie, Haizhou Li (2020), The NUS & NWPU system for Voice Conversion Challenge 2020, VCCBC
Rohan Kumar Das, Tomi Kinnunen, Wen-Chin Huang, Zhen-Hua Ling, Junichi Yamagishi, Zhao Yi, Xiaohai Tian, Tomoki Toda (2020), Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions, VCCBC
Wen-Chin Huang, Patrick Lumban Tobing, Yi-Chiao Wu, Kazuhiro Kobayashi, Tomoki Toda (2020), The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders, VCCBC
Victor P. da Costa, Ranniery Maia, Igor M. Quintanilha, Sergio L. Netto, Luiz W. P. Biscainho (2020), The UFRJ Entry for the Voice Conversion Challenge 2020, VCCBC
YuHuai Peng, Cheng-Hung Hu, Alexander Kang, Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, Hsin-Min Wang (2020), The Academia Sinica Systems of Voice Conversion for VCC2020, VCCBC
Hieu-Thi Luong, Junichi Yamagishi (2020), Latent linguistic embedding for cross-lingual text-to-speech and voice conversion, VCCBC
Qiuyue Ma, Ruolan Liu, Xue Wen, Chunhui Lu, Xiao Chen (2020), Submission from SRCB for Voice Conversion Challenge 2020, VCCBC
Tyler Vuong, Mark Lindsey, Yangyang Xia, Richard Stern (2022), L3DAS22: Exploring Loss Functions for 3D Speech Enhancement, L3DAS
Jisheng Bai, Siwei Huang, Yafei Jia, Mou Wang, Jianfeng Chen (2022), Cross-Stitch Network Based System for Sound Event Localization and Detection in L3DAS22 Challenge, L3DAS
Teck Kai Chan, Rohan Kumar Das (2022), Cross-Stitch Network with Adaptive Loss Weightage for Sound Event Localization and Detection, L3DAS
Heitor R. Guimaraes, Wesley Beccaro, Miguel A. Ramirez (2022), A Perceptual Loss Based Complex Neural Beamforming for Ambix 3D Speech Enhancement, L3DAS
Alicia Sagae, Baylor Wetzel, Andre Valente, W. Lewis Johnson (2009), Culture-driven response strategies for virtual human behavior in training systems, SLaTE
Brandon Yoshimoto, Ian McGraw, Stephanie Seneff (2009), Rainbow rummy: a web-based game for vocabulary acquisition using computer-directed speech, SLaTE
Preben Wik, Rebecca Hincks, Julia Hirschberg (2009), Responses to Ville: a virtual language teacher for Swedish, SLaTE
Joost van Doremalen, Helmer Strik, Catia Cucchiarini (2009), Utterance verification in language learning applications, SLaTE
Jared Bernstein, Masanori Suzuki, Jian Cheng, Ulrike Pado (2009), Evaluating diglossic aspects of an automated test of spoken modern standard Arabic, SLaTE
Ingunn Amdal, Magne H. Johnsen, Eivind Versvik (2009), Automatic evaluation of quantity contrast in non-native Norwegian speech, SLaTE
Klaus Zechner (2009), What did they actually say? agreement and disagreement among transcribers of non-native spontaneous speech responses in an English proficiency test, SLaTE
Pieter Müller, Febe de Wet, Christa van der Walt, Thomas Niesler (2009), Automatically assessing the oral proficiency of proficient L2 speakers, SLaTE
Jian Cheng, Brent Townshend (2009), A rule-based language model for reading recognition, SLaTE
Dean Luo, Nobuaki Minematsu, Yutaka Yamauchi, Keikichi Hirose (2009), Analysis and comparison of automatic language proficiency assessment between shadowed sentences and read sentences, SLaTE
Florian Hönig, Anton Batliner, Karl Weilhammer, Elmar Nöth (2009), Islands of failure: employing word accent information for pronunciation quality assessment of English L2 learners, SLaTE
Alissa M. Harrison, Wai-Kit Lo, Xiao-Jun Qian, Helen Meng (2009), Implementation of an extended recognition network for mispronunciation detection and diagnosis in computer-assisted pronunciation training, SLaTE
Sandra Kanters, Catia Cucchiarini, Helmer Strik (2009), The goodness of pronunciation algorithm: a detailed performance study, SLaTE
Preben Wik, David Lucas Escribano (2009), Say ‘aaaaa’ interactive vowel practice for second language learning, SLaTE
Helmer Strik, Frederik Cornillie, Jozef Colpaert, Joost van Doremalen, Catia Cucchiarini (2009), Developing a CALL system for practicing oral proficiency: how to design for speech technology, pedagogy and learners, SLaTE
Hansjörg Mixdorff, Daniel Külls, Hussein Hussein, Shu Gong, Guoping Hu, Si Wei (2009), Towards a computer-aided pronunciation training system for German learners of Mandarin, SLaTE
William Ricardo Rodríguez, Eduardo Lleida (2009), Formant estimation in children's speech and its application for a Spanish speech therapy tool, SLaTE
Natalia Cylwik, Agnieszka Wagner, Grazyna Demenko (2009), The EURONOUNCE corpus of non-native Polish for ASR-based pronunciation tutoring system, SLaTE
Hiroyuki Obari, Hiroaki Kojima, Machi Okumura, Masahiro Yoshikawa, Shuichi Itahashi (2009), Investigating the effectiveness of Prontest software to train English proficiency, SLaTE
Oscar Saz, Victoria Rodríguez, Eduardo Lleida, William Ricardo Rodríguez, Carlos Vaquero (2009), An experience with a Spanish second language learning tool in a multilingual environment, SLaTE
Lei Chen (2009), Audio quality issue for automatic speech assessment, SLaTE
Shizhen Wang, Patti Price, Yi-Hui Lee, Abeer Alwan (2009), Measuring children's phonemic awareness through blending tasks, SLaTE
Minh Duong, Jack Mostow (2009), Detecting prosody improvement in oral rereading, SLaTE
Alexander Gruenstein, Ian McGraw, Andrew Sutherland (2009), A self-transcribing speech corpus: collecting continuous speech with an online educational game, SLaTE
Yushi Xu, Anna Goldie, Stephanie Seneff (2009), Automatic question generation and answer judging: a q&a game for language learning, SLaTE
Julie Medero, Mari Ostendorf (2009), Analysis of vocabulary difficulty using Wiktionary, SLaTE
Juan Pino, Maxine Eskenazi (2009), Semi-automatic generation of cloze question distractors effect of students' L1, SLaTE
Luís Marujo, José Lopes, Nuno Mamede, Isabel Trancoso, Juan Pino, Maxine Eskenazi, Jorge Baptista, Céu Viana (2009), Porting REAP to European Portuguese, SLaTE
Grazyna Demenko, Agnieszka Wagner, Natalia Cylwik, Oliver Jokisch (2009), An audiovisual feedback system for acquiring L2 pronunciation and L2 prosody, SLaTE
Rebecca Hincks, Jens Edlund (2009), Using speech technology to promote increased pitch variation in oral presentations, SLaTE
Zöe Handley, Mike Sharples, Dave Moore (2009), Training novel phonemic contrasts: a comparison of identification and oddity discrimination training, SLaTE
Angela M. Wigmore, Gordon J. A. Hunter, Eckhard Pflügel, James Denholm-Price (2009), Talkmaths: a speech user interface for dictating mathematical expressions into electronic documents, SLaTE
Liu Liu, Jack Mostow, Gregory Aist (2009), Automated generation of example contexts for helping children learn vocabulary, SLaTE
Khe Chai Sim (2009), Improving phone verification using state-level posterior features and support vector machine for automatic mispronunciation detection, SLaTE
Masayuki Suzuki, Dean Luo, Nobuaki Minematsu, Keikichi Hirose (2009), Improved structure-based automatic estimation of pronunciation proficiency, SLaTE
Gregory Aist, Jack Mostow (2009), Predictable and educational spoken dialogues: pilot results, SLaTE
John Ingram, Hansjörg Mixdorff, Nahyun Kwon (2009), Voice morphing and the manipulation of intra-speaker and cross-speaker phonetic variation to create foreign accent continua: a perceptual study, SLaTE
Tatsuya Kawahara, Hongcui Wang, Yasushi Tsubota, Masatake Dantsuji (2009), Japanese CALL system based on dynamic question generation and error prediction for ASR, SLaTE
Dean Luo, Nobuaki Minematsu, Yutaka Yamauchi (2009), Development of a CALL system to enhance ESL/EFL learners' skills of shadowing and reading aloud, SLaTE
Michael Beilig (2009), Computer aided pronunciation training (CAPT) system "AZAR", SLaTE
Andre Valente, W. Lewis Johnson (2009), Spoken dialog systems for learning foreign language communication skills, SLaTE
Preben Wik (2009), Langofone - language learning in your pocket, SLaTE
Oscar Saz, William Ricardo Rodríguez, Eduardo Lleida, Carlos Vaquero (2009), COMUNICA: multilevel tools for Spanish CALL, SLaTE
Nobuaki Minematsu, Masayuki Suzuki (2009), Structure-based pronunciation assessment, SLaTE
Alexander Gruenstein, Ian McGraw, Andrew Sutherland (2009), Voice race and voice scatter: online educational games for collecting orthographically-labeled speech data, SLaTE
Luis Marujo, Jose Lopes, Nuno Mamede, Isabel Trancoso, Juan Pino, Maxine Eskenazi, Jorge Baptista, Ceu Viana (2009), REAP.PT, a tutoring system for teaching Portuguese, SLaTE
Yow-Bang Wang, Hsin-Min Wang, Lin-Shan Lee (2009), Virtual Chinese tutor (VCT) - a Chinese language pronunciation learning software, SLaTE
Florian Hönig, Anton Batliner, Karl Weilhammer, Elmar Nöth (2009), Automatic assessment of non-native prosody, SLaTE
Karl Weilhammer, Catharine Oertel, Robin Siegemund, Ricardo Sá, Anton Batliner, Florian Hönig, Elmar Nöth (2009), A spoken dialog system for learners of English, SLaTE
Maxine Eskenazi, Gary Pelton (2009), CLIMB LEVEL 4 - teaching English for aviation safety, SLaTE
Martine Grice, Simon Wehrle (2022), Prosody and Conversational Behaviour in Autism Spectrum Disorder, SpeechProsody
Janne Lorenzen, Simon Roessig, Stefan Baumann (2022), Information status and tonal context jointly modulate prosodic prominence relations in German, SpeechProsody
Martin Ho Kwan Ip, Alex de Carvalho, John Trueswell (2022), Prosody-to-Focus Mapping and Alternative Processing in Word Learning, SpeechProsody
Emmett Strickland, Anne Lacheret-Dujour, Candide Simard (2022), Prosody and cognitive accessibility in left-detached topics: lessons from Nigerian Pidgin, SpeechProsody
Na Hu, Aoju Chen, Fang Li, Hugo Quené, Ted Sanders (2022), A Trade-off Relationship between Lexical and Prosodic Means in Expressing Subjective and Objective Causality, SpeechProsody
Heiko Seeliger, Constantijn Kaland (2022), Boundary tones in German wh-questions and wh-exclamatives - a cluster-based approach, SpeechProsody
Yu Jin Song, Cynthia G. Clopper, Laura Wagner (2022), Children’s Use of Uptalk in Narratives, SpeechProsody
Simon Wehrle, Francesco Cangemi, Kai Vogeley, Martine Grice (2022), New evidence for melodic speech in Autism Spectrum Disorder, SpeechProsody
Heike Lehnert-LeHouillier, Steven Sandoval (2022), Conversational Correlates of Prosodic Entrainment in Youth with and without Autism Spectrum Disorder, SpeechProsody
Chloé Daigmorte, Jessica Tallet, Corine Astésano (2022), On the foundations of rhythm-based methods in Speech Therapy, SpeechProsody
Massimo Pettorino, Marta Maffia, Brigitte Bigi (2022), A diachronic study on Italian speech rhythm in Parkinson’s Disease, SpeechProsody
Janina Boecher, Kathryn Franich, Evan Usler (2022), Rhythm of Speech in People Who Do and Do Not Stutter - A Quantitative Analysis Using the Normalized Pairwise Variability Index, SpeechProsody
Vincent P. Martin, Brice Arnaud, Jean-Luc Rouas, Pierre Philip (2022), Does sleepiness influence reading pauses in hypersomniac patients?, SpeechProsody
Nigel Ward, Ambika Kirkland, Marcin Wlodarczak, Éva Székely (2022), Two Pragmatic Functions of Breathy Voice in American English Conversation, SpeechProsody
Cristel Portes, Uwe Reyle (2022), Combining syntax and prosody to signal information structure: the case of French, SpeechProsody
Mariia Pronina, Iris Hübscher, Ingrid Vilà-Giménez, Pilar Prieto (2022), Pragmatic prosody development from 3 to 8 years of age: A cross-sectional study in Catalan, SpeechProsody
Jiseung Kim, Anja Arnhold (2022), Prosodic focus marking in Canadian English, SpeechProsody
Jill Thorson, Jill Trumbell, Kimberly Nesbitt (2022), Expressing information status through prosody in the spontaneous speech of American English-speaking children, SpeechProsody
Ting Wang, Heng Ding (2022), Mandarin Disyllabic Word Imitation in Children with and without Autism Spectrum Disorder, SpeechProsody
Kiwako Ito, Elizabeth Kryszak, Teresa Ibanez (2022), Effect of Prosodic Emphasis on the Processing of Joint-Attention Cues in Children with ASD, SpeechProsody
Yi Lin, Chuoran Li, Qing Fan, Yueqi Chen, Jiaqi Zhang, Hongwei Ding (2022), Effects of sensory dominance and gender differences on impaired emotion perception in schizophrenic patients, SpeechProsody
Sunghye Cho, Galit Agmon, Sanjana Shellikeri, Katheryn Cousins, Sharon Ash, David Irwin, Meredith Spindler, Andres Deik Acosta Madiedo, Lauren Elman, Colin Quinn, Mark Liberman, Murray Grossman, Naomi Nevler (2022), Prosodic characteristics of prepausal words produced by patients with neurodegenerative disease, SpeechProsody
Joanna Kruyt, Štefan Beňuš, Catherine Faget, Christophe Lançon, Maud Champagne-Lavau (2022), Prosodic and lexical entrainment in adults with and without schizophrenia, SpeechProsody
Laurence White, Hannah Grimes (2022), Articulation rate in psychotherapeutic dialogues for depression: patients and therapists, SpeechProsody
Heete Sahkai, Eva Liina Asu, Pärtel Lippus (2022), Prosodic characteristics of canonical and non-canonical questions in Estonian, SpeechProsody
Claudia Crocco, Barbara Gili Fivela, Mariapaola D'Imperio (2022), Comparing prosody of Italian varieties and dialects: data from Neapolitan, SpeechProsody
Jiyoung Jang, Argyro Katsika (2022), The coordination of boundary tones with constriction gestures in Seoul Korean, an edge-prominence language, SpeechProsody
Billian Khalayi Otundo, Martine Grice (2022), Intonation in advice-giving in Kenyan English and Kiswahili, SpeechProsody
Ela Thurgood, Paul Olejarczuk (2022), The Effects of Intonation on the Sentence-Final Particle nyei in Iu-Mien, SpeechProsody
Io Valls-Ratés, Oliver Niebuhr, Pilar Prieto (2022), Unguided VR public-speaking training enhances your confidence - but does not improve your intonation, SpeechProsody
Clara Huttenlauch, Marie Hansen, Carola de Beer, Isabell Wartenburger, Sandra Hanne (2022), Individual variability in prosodic marking of locally ambiguous sentences, SpeechProsody
Farhat Jabeen (2022), Production and perception of Intonational Phrase boundaries in Urdu polar questions, SpeechProsody
Giuseppe Magistro, Claudia Crocco (2022), Rising declaratives in Veneto dialects, SpeechProsody
Julia Bongiorno, Sophie Herment (2022), High Rising Terminals in Dublin: forms, functions and gender, SpeechProsody
Martina Rossi, Kathrin Feindt, Margaret Zellers (2022), Individual variation in F0 marking of turn-taking in natural conversation in German and Swedish, SpeechProsody
Malek Al Hasan, Shakuntala Mahanta (2022), The Intonational Phonology of Syrian Arabic: A Preliminary Analysis, SpeechProsody
Saskia Wepner, Barbara Schuppler, Gernot Kubin (2022), How prosody affects ASR performance in conversational Austrian German, SpeechProsody
Si-Ioi Ng, Rui-Si Ma, Tan Lee, Raymond Kim-Wai Sum (2022), Acoustical Analysis of Speech Under Physical Stress in Relation to Physical Activities and Physical Literacy, SpeechProsody
Veronica P. Siqueira, Beatriz Raposo de Medeiros (2022), Synchronous speech and semantic incongruity: what do outliers tell us about it?, SpeechProsody
Flaviane Fernandes-Svartman, Larissa Berti, Marcus Martins, Beatriz R. Medeiros, Marcelo Queiroz (2022), Temporal prosodic cues for COVID-19 in Brazilian Portuguese speakers, SpeechProsody
Caroline Crouch, Argyro Katsika, Ioana Chitoran (2022), Georgian Syllables, Uncentered?, SpeechProsody
Christina Tånnander, David House, Jens Edlund (2022), Syllable duration as a proxy to latent prosodic features, SpeechProsody
Franka Zebe (2022), Durational consonant categories in Alemannic and Swiss Standard German across tempo and age, SpeechProsody
Ivy Mok, Lieke van Maastricht, Nuria Esteve-Gibert (2022), Do head gestures function as precursors for prosodic focus marking in the L2?, SpeechProsody
Marita Everhardt, Anastasios Sarampalis, Matt Coler, Deniz Baskent, Wander Lowie (2022), Interpretation of prosodically marked focus in cochlear implant-simulated speech by non-native listeners, SpeechProsody
Kexin Du, Sergey Avrutin, Aoju Chen (2022), Building bridges: The role of prosody in Mandarin-speaking adults' and children's anaphora resolution, SpeechProsody
Judit Gervain (2022), How the neural mechanisms of encoding prosody lay the foundations for early language development, SpeechProsody
Irene de La Cruz-Pavía (2022), The role of audio-visual phrasal prosody in bootstrapping the acquisition of word order, SpeechProsody
Zhenyang Xi, Yan Gu, Gabriella Vigliocco (2022), Speaking Rate in 3-4-Year-Old Children: Its Correlation with Gesture Rate and Word Learning, SpeechProsody
Tae Jin Yoon, Seunghee Ha, Jungmin So (2022), Developmental Patterns of Accentual Phrases in Korean Children’s Speech, SpeechProsody
Qianwen Guan, Yaru Wu, Ioana Chitoran (2022), A corpus-based study of /CR/ and /RC/ clusters in French: Prosodic and segmental effects, SpeechProsody
Johanna Cronenberg, Nicola Klingler, Felicitas Kleber, Michael Pucher (2022), On the role of asymmetry in prosodic change of consonant duration: Results from an agent-based model with two German varieties, SpeechProsody
Ray Huaute (2022), A Preliminary Intonation Model of Torres-Martinez Desert Cahuilla, SpeechProsody
Ronny Bujok, Antje Meyer, Hans Rutger Bosker (2022), Visible lexical stress cues on the face do not influence audiovisual speech perception, SpeechProsody
Laurence Bruggeman, Jenny Yu, Anne Cutler (2022), Listener adjustment of stress cue use to fit language vocabulary structure, SpeechProsody
Anna Bruggeman, Leonie Schade, Marcin Włodarczak, Petra Wagner (2022), Beware of the individual: Evaluating prominence perception in spontaneous speech, SpeechProsody
Soundess Azzabou-Kacem, Alice Turk (2022), Fine gradations of prosodic boundary strength can drive the assignment of prominence, SpeechProsody
Giulio Severijnen, Hans Rutger Bosker, James McQueen (2022), Acoustic correlates of Dutch lexical stress re-examined: Spectral tilt is not always more reliable than intensity, SpeechProsody
Carlos Gussenhoven, Wei-Rong Chen (2022), Segmental intonation in Zwara Berber voiceless stressed syllable rimes, SpeechProsody
Dafydd Gibbon (2022), Speech rhythms: learning to discriminate speech styles, SpeechProsody
Kathryn Franich, Hermann Keupdjio (2022), The Influence of Tone on the Alignment of Speech and Co-Speech Gesture, SpeechProsody
Raphael Werner, Jürgen Trouvain, Bernd Möbius (2022), Optionality and variability of speech pauses in read speech across languages and rates, SpeechProsody
Jinyu Li, Leonardo Lancia (2022), Effects of delayed auditory feedback interacting with prosodic structure, SpeechProsody
Laurence White, Sven Mattys, Sarah Knight, Tess Saunders, Laura Macbeath (2022), Temporal expectations and the interpretation of timing cues to word boundaries, SpeechProsody
Natalia Kuznetsova, Elena Markus (2022), Ongoing vowel shortening in vanishing Soikkola Ingrian: challenges for description, codification, and typology, SpeechProsody
Kakeru Yazawa, Mariko Kondo (2022), A Comparison of Rhythm Metrics for L2 Speech, SpeechProsody
Marie-Anne Morand, Melissa Bruno, Sandra Schwab, Stephan Schmid (2022), Syllable rate and speech rhythm in multiethnolectal Zurich German: a comparison of speaking styles, SpeechProsody
Zhiqiang Zhu, Peggy Pik Ki Mok (2022), Can speech rate transfer between languages? Evidence from Japanese and Mandarin Chinese, SpeechProsody
Chengxia Wang, Yi Xu, Jinsong Zhang (2022), The invalidity of rhythm class hypothesis, SpeechProsody
Gilbert Ambrazaitis, Johan Frid, David House (2022), Auditory vs. audiovisual prominence ratings of speech involving spontaneously produced head movements, SpeechProsody
Axel Barrault, James German, Pauline Welby (2022), Anticipatory marking of (non-corrective) contrastive focus by the Initial Rise in French, SpeechProsody
Amandine Michelas, Sophie Dufour (2022), Gradiency vs. categoricity: How French speakers perceive accentual information in their native language?, SpeechProsody
Maciej Karpiński, Ewa Jarmołowicz-Nowikow, Katarzyna Klessa (2022), High-pitched prominences in the speeches of male Polish members of parliament, SpeechProsody
Beata Łukaszewicz, Janina Mołczanow, Anna Łukaszewicz (2022), Pretonic Lengthening as the Lexical Stress Domain Extension, SpeechProsody
Fabian Santiago, Paolo Mairano, Bianca De Paolis (2022), The effects of prosodic prominence on the acquisition of L2 phonological features, SpeechProsody
Marina Kalashnikova, Cristina Naranjo (2022), Prosody in bilingual caregiver’s infant-directed speech: Cues for infants’ acquisition of their languages’ intonational structure, SpeechProsody
Ricardo Sousa, Susana Silva, Sónia Frota (2022), Early Prosodic Development predicts Lexical Development in typical and atypical language acquisition, SpeechProsody
Joanne Arciuli, Kate Philips, Benjamin Bailey, Alexandre Forndran, Adam Vogel, Kirrie Ballard (2022), Lexical Stress Matures Late in Typically Developing Children, SpeechProsody
Alex de Carvalho, Leticia Kolberg, John Trueswell, Anne Christophe (2022), Cross-linguistic evidence for the role of phrasal prosody in syntactic and lexical acquisition, SpeechProsody
Sara Munoz-Coego, Júlia Florit-Pons, Patrick Louis Rohrer, Ingrid Vilà-Giménez, Pilar Prieto (2022), The prosodic and gestural marking of the information status of referents in children’s narrative speech: A longitudinal study, SpeechProsody
Judit Gervain (2022), Word frequency and prosody bootstrap basic word order in prelexical infants, SpeechProsody
Chen Lan, Peggy Mok (2022), A preliminary study on the acquisition of Mandarin neutral tone by young heritage children, SpeechProsody
Wenwei Xu, Chunyu Ge, Wentao Gu, Peggy Mok (2022), A preliminary analysis on children’s phonation contrast in Kunshan Wu Chinese tones, SpeechProsody
Juraj Šimko, Adaeze Adigwe, Antti Suni, Martti Vainio (2022), A Hierarchical Predictive Processing Approach to Modelling Prosody, SpeechProsody
Li-Fang Lai, Janet G. van Hell, John Lipski (2022), The Role of Rhythm and Vowel Space in Speech Recognition, SpeechProsody
Veranika Mikhailava, John Blake, Evgeny Pyshkin, Natalia Bogach, Sergey Chernonog, Artyom Zhuikov, Maria Lesnichaya, Iurii Lezhenin, Roman Svechnikov (2022), Dynamic Assessment during Suprasegmental Training with Mobile CAPT, SpeechProsody
Hansjörg Mixdorff, Albert Rilliard, Philippe Boula De Mareüil (2022), Perceptual Identification of Speech Acts in Gallo-Romance Dialects: A Study Based on Prosody Re-synthesis, SpeechProsody
Mortaza Taheri-Ardali, Daniel Hirst (2022), Building a Persian-English OMProDat Database Read by Persian Speakers, SpeechProsody
Alex Peiró-Lilja, Guillermo Cámbara, Mireia Farrús, Jordi Luque (2022), Naturalness and Intelligibility Monitoring for Text-to-Speech Evaluation, SpeechProsody
Nicolas Ballier, Adrien Méli, Taylor Arnold, Alice Henderson (2022), Revisiting Paratone Prosodic Features with the EIIDA corpus, SpeechProsody
Tatiana Kachkovskaia, Alla Menshikova, Daniil Kocharov, Pavel Kholiavin, Anna Mamushina (2022), Social and situational factors of speaker variability in collaborative dialogues, SpeechProsody
Maureen de Seyssel, Guillaume Wisniewski, Emmanuel Dupoux, Bogdan Ludusan (2022), Investigating the usefulness of i-vectors for automatic language characterization, SpeechProsody
Rose Sloan, Adaeze Adigwe, Sahana Mohandoss, Julia Hirschberg (2022), Incorporating Prosodic Events in Text-to-Speech Synthesis, SpeechProsody
Mariana Julião, Alberto Abad, Helena Moniz (2022), Can Prosody Transfer Embeddings be Used for Prosody Assessment?, SpeechProsody
Jennifer Cole, Jeremy Steffman, Sam Tilsen (2022), Shape matters: Machine classification and listeners’ perceptual discrimination of American English intonational tunes, SpeechProsody
Juan Manuel Toro (2022), Using prosody to organize the signal: Sensitivities across species set the stage for prosodic bootstrapping, SpeechProsody
Sheng-Fu Wang (2022), Pre-boundary lengthening modulates predictability effects on durational variability in Taiwan Southern Min, SpeechProsody
Leendert Plug, Robert Lennon, Rachel Smith (2022), Schwa deletion and perceived tempo in English, SpeechProsody
Vered Silber-Varod, Ella Alfon, Noam Amir (2022), Perception of the strength of prosodic breaks in three conditions: Explicit pause, implicit pause, and no pause, SpeechProsody
Marcin Włodarczak, Mattias Heldner (2022), Contribution of voice quality to prediction of turn-taking events, SpeechProsody
Renata R. Passetti, Sandra Madureira, Plínio A. Barbosa (2022), Voice perception on a voice messaging app: implications for Forensic Phonetics, SpeechProsody
Hansjörg Mixdorff, Oliver Niebuhr (2022), The Effects of Fujisaki Model Parameter Manipulation on Perceived Charisma, SpeechProsody
Simon Roessig, Lena Pagel, Doris Mücke (2022), Speaking loudly reduces flexibility and variability in the prosodic marking of focus types, SpeechProsody
Jan Volín, Radek Skarnitzl (2022), The Impact of Prosodic Position on Post-Stress Rise in Three Genres of Czech, SpeechProsody
Caterina Petrone, Arina Antonenko, Sophie Dufour (2022), Does emotional prosody affect word recognition in French?, SpeechProsody
Plinio Barbosa (2022), The Acoustics of Pleasantness in Poetry Declamation in Two Varieties of Portuguese, SpeechProsody
Takuto Matsuda, Yoshiko Arimoto (2022), Acoustic discriminability of unconscious laughter and scream during game-play, SpeechProsody
Aini Li, Wei Lai, Jianjing Kuang (2022), How do listeners identify creak? The effects of pitch range, prosodic position and creak locality in Mandarin, SpeechProsody
Alexsandro Rodrigues Meireles, Hansjörg Mixdorff (2022), Acoustic Study of the Voice Quality of Brazilian Portuguese Stressed Vowels, SpeechProsody
Yaqian Huang (2022), Articulatory properties of period-doubled voice in Mandarin, SpeechProsody
Xinyue Li, Carlos Toshinori Ishi, Changzeng Fu, Ryoko Hayashi (2022), Prosodic and Voice Quality Analyses of Filled Pauses in Japanese Spontaneous Conversation by Chinese learners and Japanese Native Speakers, SpeechProsody
Guillermo Cámbara, Mireia Farrús, Jordi Luque (2022), Voice Quality and Pitch Features in Transformer-Based Speech Recognition, SpeechProsody
Bogdan Ludusan, Petra Wagner (2022), ha-HA-hha? Intensity and voice quality characteristics of laughter, SpeechProsody
Alice Crochiquia, Anders Eriksson, Plinio Barbosa, Sandra Madureira (2022), A perceptual and acoustic study of dubbed voices in an animated film, SpeechProsody
Chunyu Ge, Wenwei Xu, Wentao Gu, Peggy Mok (2022), An electroglottographic study of phonation types in tones of Suzhou Wu Chinese, SpeechProsody
Oliver Niebuhr (2022), Prosody in hate speech perception: A step towards understanding the role of implicit prosody, SpeechProsody
Yu Chen, Ting Wang, Hongwei Ding (2022), Effect of Age and Gender on Categorical Vocal Emotion Recognition in Mandarin Chinese, SpeechProsody
Sylvain Xia, Dominique Fourer, Liliana Audin-Garcia, Jean-Luc Rouas, Takaaki Shochi (2022), Speech Emotion Recognition using Time-frequency Random Circular Shift and Deep Neural Networks, SpeechProsody
Katalin Mády, Beáta Gyuris, Hans-Martin Gärtner, Anna Kohári, Ádám Szalontai, Uwe D. Reichel (2022), Perceived emotions in infant-directed narrative across time and speech acts, SpeechProsody
Donna Erickson, Albert Rilliard, Ela Thurgood, João Antônio de Moraes, Takaaki Shochi (2022), A Valence-Arousal-Dominance Study of American English Social Affective Expressions, SpeechProsody
Jiayong He, Jing Tang, Stella Gryllia, Aoju Chen (2022), Prosodic realization of politeness in the presence of non-prosodic cues in Mandarin Chinese, SpeechProsody
Emilie Marty, Roxane Bertrand, Caterina Petrone, James German (2022), Prosodic Correlates of Discourse Structure and Emotion in Discourse Markers that Preface Announcements of News, SpeechProsody
Sarah Ita Levitan, Julia Hirschberg (2022), Believe It or Not: Acoustic-Prosodic Cues to Trusting and Untrusting Speech in Interview Dialogues, SpeechProsody
Aitor Arronte Alvarez, Elsayed Issa, Mohammed Alshakhori (2022), Computational modeling of intonation patterns in Arabic emotional speech, SpeechProsody
Suzanne Verheul, Adriana Hartman, Roselinde Supheert, Aoju Chen (2022), Gender effects on perception of emotional speech- and visual-prosody in a second language: Emotion recognition in English-speaking films, SpeechProsody
Lisa Maria Tschinse, Ali Asadi, Anna Gutnyk, Oliver Niebuhr (2022), Keep on smiling...? An exploration of the gender-specific connections between smiling duration and perceived speaker attributes in business pitches, SpeechProsody
Huan Wei, Yifei He, Christina Kauschke, Mathias Scharinger, Ulrike Domahs (2022), An EEG-study on L2 categorization of emotional prosody in German, SpeechProsody
Nicole Holliday (2022), Kamala Harris, Maya Rudolph and the Prosody of Parody, SpeechProsody
Marisa Cruz, Jovana Pejovic, Catia Severino, Marina Vigario, Sónia Frota (2022), Auditory and visual cues in face-masked infant-directed speech, SpeechProsody
Peizhu Shang, Wendy Elvira-García, Xinyi Li (2022), Cue weighting differences in perception of Spanish sentence type between native listeners of Chinese and Spanish, SpeechProsody
Shinobu Mizuguchi, Koichi Tateishi (2022), Perception of Boundary and Prominence in Spontaneous Japanese: An RPT Study, SpeechProsody
Hae-Sung Jeon, Antje Heinrich (2022), Perception of Pitch Height and Prominence by Old and Young listeners, SpeechProsody
Yuanyuan Zhang, Hongwei Ding (2022), Asymmetry in L1 and L2 listeners’ use of prosody for PP-attachment disambiguation, SpeechProsody
Anindita Nath, Nigel Ward (2022), On the Predictability of the Prosody of Dialog Markers from the Prosody of the Local Context, SpeechProsody
Omnia Ibrahim, Ivan Yuen, Bistra Andreeva, Bernd Möbius (2022), The effect of predictability on German stop voicing is phonologically selective, SpeechProsody
Yike Yang, Si Chen (2022), Does prosody influence segments differently in Cantonese and Mandarin? A case study of the open vowel /a/, SpeechProsody
Antonio Benítez-Burraco, Wendy Elvira-García (2022), Human self-domestication and the evolution of prosody, SpeechProsody
Aliza Glasbergen-Plas, Stella Gryllia, Leticia Pablos Robles, Jenny Doetjes (2022), Scripted Simulated Dialogue: a new elicitation paradigm, SpeechProsody
Andrew Murphy, Irena Yanushevskaya, Ailbhe Ní Chasaide, Christer Gobl (2022), Affect Expression: Global and Local Control of Voice Source Parameters, SpeechProsody
Nadja Schauffler, Fabian Schubö, Toni Bernhart, Gunilla Eschenbach, Julia Koch, Sandra Richter, Gabriel Viehhauser, Thang Vu, Lorenz Wesemann, Jonas Kuhn (2022), Prosodic realisation of enjambment in recitations of German poetry, SpeechProsody
Nicola West, Tamara Rathcke, Rachel Smith (2022), Timing in speech and music of contemporary English and Scottish composers, SpeechProsody
Usha Goswami (2022), Acoustic Structure in the Amplitude Envelope and Speech Prosody: A Psycholinguistic and Developmental Perspective, SpeechProsody
Sónia Frota, Marina Vigário, Marisa Cruz, Friederike Hohl, Bettina Braun (2022), Amplitude envelope modulations across languages reflect prosody, SpeechProsody
Sabrina Stehwien, Lars Meyer (2022), Short-Term Periodicity of Prosodic Phrasing: Corpus-based Evidence, SpeechProsody
Jianjing Kuang, May Pik Yu Chan, Nari Rhee (2022), The effects of syntactic and acoustic cues on the perception of prosodic boundaries, SpeechProsody
Lieke van Maastricht, Marieke Hoetjes, Lisette van der Heijden (2022), Learning L2 Prosody using Gestures: The Role of Individual Differences related to Musicality, SpeechProsody
Yuan Zhang, Florence Baills, Pilar Prieto (2022), Training with embodied musical activities has positive effects on unfamiliar language imitation skills, SpeechProsody
Wílmar López-Barrios (2022), Language-specific intonation in the Palenquero/Spanish bilinguals, SpeechProsody
Simona Sbranna, Eduardo Möking, Simon Wehrle, Martine Grice (2022), Backchannelling across Languages: Rate, Lexical Choice and Intonation in L1 Italian, L1 German and L2 German, SpeechProsody
Michelina Savino, Simona Sbranna, Caterina Ventura, Aviad Albert, Martine Grice (2022), Imitating intonation in a non-native variety: the influence of the native repertoire, SpeechProsody
Jamie Adams, Sam Hellmuth (2022), Taiwanese and Beijing Mandarin listeners’ perception of English focus prosody, SpeechProsody
Bistra Andreeva, Snezhina Dimitrova (2022), The influence of L1 prosody on Bulgarian-accented German and English, SpeechProsody
Xiaoqing Wang, Wentao Gu (2022), Effects of Gender and Language Proficiency on Phonetic Accommodation in Chinese EFL Learners, SpeechProsody
Adam Bramlett, Seth Wiener (2022), jTRACE modeling of L2 Mandarin learners’ spoken word recognition at two time points in learning, SpeechProsody
Heini Kallio, Rosa Suviranta, Mikko Kuronen, Anna von Zansen (2022), Creaky voice and utterance fluency measures in predicting fluency and oral proficiency of spontaneous L2 Finnish, SpeechProsody
Yanping Li, Catherine Best, Michael Tyler, Denis Burnham (2022), Native Beijing listeners’ perceptual assimilation of Mandarin lexical tones produced by L2-Mandarin speakers from Yantai, Shanghai, and Guangzhou, SpeechProsody
Štefan Beňuš (2022), Prosodic imitation of audiovisual and audio-only prompts in L2 English, SpeechProsody
Lucie Judkins, Charlotte Alazard-Guiu, Corine Astésano (2022), How do we chunk and pause in non-native vs native speech? Methodological implications for SLA, SpeechProsody
Jung-Yueh Tu, Jih-Ho Cha (2022), Mandarin third tone sandhi application in trisyllabic words by L2 learners, SpeechProsody
Florence Baills, Fabián Santiago, Paolo Mairano, Pilar Prieto (2022), The effects of prosodic training with logatomes and prosodic gestures on L2 spontaneous speech, SpeechProsody
Marlene Böttcher (2022), A Comparison of Pitch Accent Patterns in Contrastive Adjective+Noun Structures in Bilingual Englishes, SpeechProsody
Sabine Zerbian, Marlene Böttcher, Yulia Zuban (2022), Prosody of contrastive adjectives in mono- and bilingual speakers of English and Russian: a corpus study, SpeechProsody
Sichang Gao, Mingwei Pan (2022), Developing and validating a rating scale of speaking prosody ability for learners of Chinese as a second language, SpeechProsody
Sandra Schwab, Michael Mouthon, Justine Salvadori, Eugenia Ferreira da Silva, Ilona Yakoub, Nathalie Giroud, Jean-Marie Annoni (2022), Neural correlates and L2 lexical stress learning: an fMRI study, SpeechProsody
Thales Buzan, Cristina Name, Juan Sosa (2022), Intonational interference in English-L2 Brazilian speakers: production and perception, SpeechProsody
Catalina Torres (2022), Pitch range modulations in an edge-marking language, SpeechProsody
Amalia Arvaniti, Stella Gryllia, Cong Zhang, Katherine Marcoux (2022), Disentangling emphasis from pragmatic contrastivity in the English H* ~ L+H* contrast, SpeechProsody
Liang Zhao, Shayne Sloggett, Eleanor Chodroff (2022), Top-Down and Bottom-up Processing of Familiar and Unfamiliar Mandarin Dialect Tone Systems, SpeechProsody
Francesco Rodriquez, Paolo Roseano, Teresa Cabré Monné (2022), Text-tune accommodation processes in the intonation of European Portuguese yes-no questions: an OT analysis, SpeechProsody
Jan Volín, Michaela Svatošová, Pavel Šturm (2022), Fundamental Frequency Variation in Polarity Questions of Czech, SpeechProsody
Jeremy Steffman, Stefanie Shattuck-Hufnagel, Jennifer Cole (2022), The rise and fall of American English pitch accents: Evidence from an imitation study of rising nuclear tunes, SpeechProsody
Tatiana Kachkovskaia, Svetlana Zimina, Alena Portnova, Daniil Kocharov (2022), Social variability of peak alignment in Russian rise-fall tunes, SpeechProsody
Xin Li, Wentao Gu (2022), Phonological Representation of Tone Sandhi in Nanjing Mandarin, SpeechProsody
Lena Borise, David Erschler (2022), Mora count and the alignment of rising pitch accents in Iron Ossetic, SpeechProsody
Peng Li, Yuan Zhang, Xianqiang Fu, Florence Baills, Pilar Prieto (2022), Melodic perception skills predict Catalan speakers’ imitation abilities of unfamiliar languages, SpeechProsody
Yu-Siang Hong, Sin-Horng Chen (2022), A Data-driven Approach to Constructing a Prosodic Grammar for Mandarin Read Speech, SpeechProsody
Changyun Moon, Chuyu Huang, Daiki Hashimoto (2022), The Effect of Japanese Pitch Accent System on Musical Cognitive Ability, SpeechProsody
Helen Türk, Pärtel Lippus, Merit Niinemägi, Karl Pajusalu, Pire Teras (2022), The Durational Structure of Tetrasyllabic Words in Inari Saami, SpeechProsody
Mary Baltazani, Katerina Nicolaidis (2022), Phrasing and speech rate effects on segmental and prosodic variability in Greek, SpeechProsody
Mengzhu Yan, Sasha Calhoun (2022), Prosodic prominence and clefting in L2 focus interpretation, SpeechProsody
Jill Thorson, Rachel Steindel Burdin (2022), The interpretation and phonetic implementation of !H* in American English, SpeechProsody
Stella Gryllia, Amalia Arvaniti, Cong Zhang, Katherine Marcoux (2022), The many shapes of H*, SpeechProsody
Christine T. Röhr, Michelina Savino, Martine Grice (2022), The effect of intonational rises on serial recall in German, SpeechProsody
Nari Rhee, Jianjing Kuang, Aoju Chen (2022), The effect of musicality on the development of Mandarin prosody, SpeechProsody
Tim Laméris (2022), The Effect of L1 Pitch Status and Extralinguistic Factors on L2 Tone Learning, SpeechProsody
Nelleke Jansen, Eleanor Harding, Hanneke Loerts, Deniz Başkent, Wander Lowie (2022), The relation between musical ability and sentence-level intonation perception: A meta-analysis comparing L1 and non-native listening, SpeechProsody
Ellen Gurman Bard, C. Sotillo, M. P. Aylett (2000), Taking the hit: why lexical and phonological processing should not make lexical access too easy, SWAP
Gareth Gaskell (2000), A quick rum picks you up, but is it good for you? Sentence context effects in the identification of spoken words, SWAP
Pienie Zwitserlood, Else Coenen (2000), Consequences of assimilation for word recognition and lexical representation, SWAP
Sieb Nooteboom, Esther Janse, Hugo Quené, Saskia te Riele (2000), Multiple activation and early context effects, SWAP
William D. Marslen-Wilson (2000), Organising principles in lexical access and representation? A view across languages, SWAP
Sami Boudelaa, William D. Marslen-Wilson (2000), Non-concatenative morphemes in language processing: Evidence from Modern Standard Arabic, SWAP
Kerstin Mauth (2000), Does morphological information influence phonetic categorization?, SWAP
Fanny Meunier, William D. Marslen-Wilson, Mike Ford (2000), Suffixed Word Lexical Representations in French, SWAP
Agnieszka A. Reid, William D. Marslen-Wilson (2000), Complexity and alternation in the Polish mental lexicon, SWAP
Alain Content, Nicolas Dumay, Uli Frauenfelder (2000), The role of syllable structure in lexical segmentation: Helping listeners avoid mondegreens, SWAP
Dennis Norris, Anne Cutler, James M. McQueen, Sally Butterfield, Ruth Kearns (2000), Language-universal constraints on the segmentation of English, SWAP
James M. McQueen, Anne Cutler, Dennis Norris (2000), Why merge really is autonomous and parsimonious, SWAP
Arthur G. Samuel (2000), Some empirical tests of Merge's architecture, SWAP
Petra van Alphen (2000), Does subcategorical variation influence lexical access?, SWAP
Jens Bölte, Else Coenen (2000), Domato primes paprika: Mismatching pseudowords activate semantic and phonological representations, SWAP
Anne Cutler, Dennis Norris, James M. McQueen (2000), Tracking TRACE's troubles, SWAP
Delphine Dahan, James S. Magnuson, Michael K. Tanenhaus, Ellen M. Hogan (2000), Tracking the time course of subcategorical mismatches on lexical access: Evidence for lexical competition, SWAP
Mark A. Pitt, Lisa Shoaf (2000), Beyond traditional measures of lexical influences on perception, SWAP
Uli H. Frauenfelder, Alain Content (2000), Activation flow in models of spoken word recognition, SWAP
Jean Vroomen, Beatrice de Gelder (2000), Lipreading and the compensation for coarticulation mechanism, SWAP
Carol A. Fowler, Lawrence Brancazio (2000), Feedback in audiovisual speech perception, SWAP
Shigeaki Amano, Tadahisa Kondo (2000), Neighborhood and cohort in lexical processings of Japanese spoken words, SWAP
Dannie van den Brink, Colin Brown, Peter Hagoort (2000), The N200 as an electrophysiological manifestation of early contextual influences on spoken-word recognition, SWAP
Usha Goswami, Bruno de Cara (2000), Lexical Representations and Development: The Emergence of Rime Processing, SWAP
Jennifer M. Rodd, M. Gareth Gaskell, William D. Marslen-Wilson (2000), Semantic ambiguity in spoken word recognition, SWAP
Michael K. Tanenhaus, James S. Magnuson, Bob M. McMurray, Richard N. Aslin (2000), Evidence from research with an artificial lexicon, SWAP
Paul A. Luce, Nathan R. Large (2000), Do spoken words have attractors?, SWAP
Cynthia M. Connine (2000), The time course of lexical activation: Sequential constraint, co-articulatory preview and additional processing time, SWAP
Richard Shillcock (2000), Spoken word access: evidence from statistical analyses of the lexicon, SWAP
Janet Pierrehumbert (2000), Why phonological constraints are so granular, SWAP
Nicolas Dumay, Uli H. Frauenfelder, Alain Content (2000), Acoustic-phonetic cues and lexical competition in segmentation of continuous speech, SWAP
Cecilia Kirk (2000), Syllabic cues to word segmentation, SWAP
Arie van der Lugt (2000), The time-course of competition, SWAP
Rachel Smith, Sarah Hawkins (2000), Allophonic influences on word-spotting experiments, SWAP
Andrea Weber (2000), Native language phonotactics and nonnative language segmentation, SWAP
Doug H. Whalen (2000), Occam's Razor is a double-edged sword: Reduced interaction is not necessarily reduced power, SWAP
John Kingston (2000), Context effects on sensitivity and response bias, SWAP
Stephen D. Goldinger (2000), The Role of Perceptual Episodes in Lexical Processing, SWAP
Christophe Pallier (2000), Word recognition: Do we need phonological representations?, SWAP
Nicole Cooper (2000), Native and non-native preprocessing of lexical stress in English word recognition, SWAP
Sarah Hawkins, Noel Nguyen (2000), Predicting syllable-coda voicing from the acoustic properties of syllable onsets, SWAP
Joan Sereno, Hugo Quené (2000), Facilitatory and inhibitory effects using a segmental phonetic priming paradigm, SWAP
Liang Tao (2000), Prosody and Word Recognition: A case study, SWAP
Terrance M. Nearey (2000), Phoneme-like units and speech perception, SWAP
Joanne L. Miller (2000), Mapping from Acoustic Signal to Phonetic Category: Nature and Role of Internal Category Structure, SWAP
Timothée Premat, Philippe Boula De Mareüil (2018), Le /R/ « roulé » en français et dans quelques langues régionales de France, JEP
Cédric Gendrot, Gabriele Chignoli, Nicolas Audibert, Cécile Fougeron (2018), Variabilité inter et intra locuteurs de mesures spectrales et prosodiques en parole lue, JEP
Monika Pukli (2018), L'effet de la fréquence lexicale sur les réalisations des rhotiques en Ecosse, JEP
Xavier St-Gelais, Christophe Coupé, François Pellegrino, Vincent Arnaud (2018), Entre Québec et France, qu'en est-il de l'antériorisation de /ɔ/ en français contemporain ?, JEP
Christine Meunier, Alain Ghio (2018), Caractériser la distinctivité du système vocalique des locuteurs, JEP
Wang Ning (2018), Analyse acoustique des occlusives produites par des jeunes locuteurs en dialecte wu de Suzhou, JEP
Erwan Pépiot, Aron Arnold (2018), Étude des variations de fréquence fondamentale relatives au genre chez des bilingues Anglais/Français, JEP
Anna Marczyk, Yohann Meynadier, Maria-Josep Solé (2018), Prénasalisation des plosives initiales comme une stratégie de voisement dans un cas d'apraxie de la parole : une étude aérodynamique, JEP
Jean-Sylvain Liénard (2018), Représentation et Estimation de la Force de Voix à partir du Spectre Moyen à Long Terme, JEP
Adrien Gresse, Richard Dufour, Vincent Labatut, Mickael Rouvier, Jean-François Bonastre (2018), Mesure de similarité fondée sur des réseaux de neurones siamois pour le doublage de voix, JEP
Jane Wottawa, Martine Adda-Decker (2018), Quand les voyelles longues et brèves ne tiennent pas en place : la qualité vocalique en allemand L2, JEP
Aron Arnold (2018), L'abaissement de la fréquence fondamentale comme pratique de séduction, JEP
Mélanie Canault, Naomi Yamaguchi, Nikola Paillereau, Johanna-Pascale Roy, Christophe Dos Santos, Sophie Kern (2018), Evolution des habiletés articulatoires au stade du babillage : le timing des syllabes CV, JEP
Anisia Popescu, Ioana Chitoran (2018), Jugements sur le nombre de syllabes et coordination temporelle des gestes articulatoires, JEP
Ivana Didirkovà, Camille Fauth, Sébastien Le Maguer (2018), Étude exploratoire des événements articulatoires pendant la réalisation de pauses en parole spontanée, JEP
Camille Fauth, Angéline Duchemin, Béatrice Vaxelaire, Rudolph Sock (2018), Perturbation de l'organisation temporelle de la parole suite à un effort physique, JEP
Imed Laaridh, Julien Tardieu, Cynthia Magnen, Pascal Gaillard, Jérôme Farinas, Julien Pinquier (2018), Évaluations perceptive et automatique de l'intelligibilité de la parole dégradée par simulation de la surdité professionnelle, JEP
Jean Schoentgen, Dhouha Rezgui, Francis Grenez (2018), Simulation numérique des apériodicités vocales dues aux fluctuations de la tension musculaire, JEP
David Alejandro Bustamante, Pierre Hallé, Claire Pillot-Loiseau (2018), Perception des voyelles nasales du français par des apprenants hispanophones, JEP
Olivier Crouzet (2018), Perception des consonnes et voyelles nasales en parole vocodée : Analyse de la contribution des niveaux de résolution spectrale et temporelle, JEP
Edwin Simonnet, Sahar Ghannay, Nathalie Camelin, Yannick Estève (2018), Simulation d'erreurs de reconnaissance automatique dans un cadre de compréhension de la parole, JEP
Ekaterina Biteeva Lecocq, Nathalie Vallée, Silvain Gerber, Christophe Savariaux (2018), Variabilité du geste linguo-palatal. Le cas du russe, JEP
Paolo Mairano, Fabiàn Santiago, Elisabeth Delais-Roussarie (2018), Gémination non-native en français d'apprenants italophones, JEP
Yizhi Huang, Véronique Delvaux, Kathy Huet, Myriam Piccaluga, Guoxian Zhang, Bernard Harmegnies (2018), Étude exploratoire des stratégies de production du ton 3 en chinois mandarin, JEP
Claire Pillot-Loiseau, Claudia Schweitzer, Christelle Dodane, Alice Romeo, Giuseppina Turco (2018), Doubler les consonnes en chant baroque français : un cas de gémination expressive ?, JEP
Alexandre Suire, Michel Raymond, Melissa Barkat-Defradas (2018), Voix et sélection sexuelle : une approche interdisciplinaire, JEP
Mathieu Avanzi, Philippe Boula De Mareüil (2018), Peut-on distinguer perceptivement huit accents régionaux en français parlé en Europe ? Une réponse à base de crowdsourcing, JEP
Charlotte Alazard-Guiu, Fabiàn Santiago, Paolo Mairano (2018), L'incidence de la correction phonétique sur l'acquisition des voyelles en langue étrangère : étude de cas d'anglophones apprenant le français, JEP
Maëva Garnier, Anaïs Da Fonseca, Christophe Savariaux, Thibault Cattelain (2018), Efforts de production de parole chez les personnes qui bégaient, JEP
Nicha Yamlamai, Thi Thuy Hien Tran (2018), Effet de la position de la syllabe sur la réalisation acoustique des consonnes finales du thaï, JEP
Florent Desnous, Anthony Larcher, Sylvain Meignier (2018), Impact de la détection de la parole pour différentes tâches de traitement automatique de la parole, JEP
Hannah King, Emmanuel Ferragne (2018), La parole sans les lèvres : une étude acoustique et articulatoire, JEP
Moez Ajili, Jean-François Bonastre, Waad Ben Kheder, Solange Rossato, Juliette Kahn (2018), Comparaison des voix dans le cadre judiciaire : influence du contenu phonétique, JEP
Thibault Cattelain, Maëva Garnier, Christophe Savariaux, Silvain Gerber, Pascal Perrier (2018), Analyse électromyographique de la production des plosives labiales: Enjeux méthodologiques, JEP
Nicolas Obin, Pascal Pham, Axel Roebel (2018), Conversion d'Identité de la Voix Chantée par Sélection et Concaténation d'Unités Spectrales, JEP
Brigitte Bigi, Christine Meunier (2018), euh, rire et bruits en parole spontanée : application à l'alignement forcé, JEP
Maëva Garnier, Marion Dohen, Louis Buttiaux, Silvain Gerber (2018), Clarification et correction d'indices segmentaux : une étude pilote sur les consonnes occlusives du français, JEP
Cheraitia Salah-Eddine, Merouane Bouzid, Meziane Nacéra (2018), Codage efficace à débit variable basé sur la quantification vectorielle à divisions commutées : Application aux paramètres ISF en large bande, JEP
Matthieu Riou, Bassam Jabaian, Stéphane Huet, Fabrice Lefèvre (2018), Évaluation de l'adaptation par renforcement d'un générateur en langage naturel neuronal pour le dialogue homme-machine, JEP
Gaëlle Ferré (2018), Gestes et prosodie dans la parole aphasique non fluente, JEP
Alexis Dehais Underdown, Didier Demolin (2018), Étude acoustique du cluster /t̪͡ɾ/ et de ses allophones à Santiago du Chili, JEP
Laura Abou Haidar (2018), L'opposition de voisement chez les apprenants syriens de FLE, JEP
David Doukhan, Jean Carrive (2018), Description automatique du taux d'expression des femmes dans les flux télévisuels français, JEP
Natalia Tomashenko, Yannick Estève (2018), Impact des techniques d'adaptation au locuteur dans l'espace des paramètres pour des modèles acoustiques purement neuronaux, JEP
Hélène Guiraud, Ana-Sofia Hincapié, Karim Jerbi, Véronique Boulenger (2018), Perception de la parole et oscillations cérébrales chez les enfants neurotypiques et dysphasiques, JEP
Fabián Santiago (2018), Effets de l'orthographe dans la prononciation du français L2, JEP
Nicolas Audibert, Cécile Fougeron, Fany Barbier, Léa Croze, Camille Lavoine, Hélène Rance (2018), Quel est mon âge d'après ma voix ? Effets de la variété régionale et de la génération, JEP
Takeki Kamiyama, Nadine Herry-Bénit, Ioana Trifu-Dejeu, Audrey Gros-Bonfiglioli (2018), L'opposition fortis / lenis des occlusives en fin de mot en anglais : liste de mots isolée lue par les apprenants francophones, JEP
Solange Rossato, Dan Zhang, Moez Ajili, Jean-François Bonastre (2018), Suivre le rythme de tes paroles, JEP
Bowei Shao, Rachid Ridouane (2018), La « voyelle apicale » en chinois de Jixi : caractéristiques acoustiques et comportement phonologique, JEP
Olivier Nocaudie, Corine Astésano, Alain Ghio, Muriel Lalain, Virginie Woisard (2018), Evaluation de la compréhensibilité et conservation des fonctions prosodiques en perception de la parole de patients post traitement de cancers de la cavité buccale et du pharynx, JEP
Clémence Verhaegen, Véronique Delvaux, Kathy Huet, Sophie Fagniart, Myriam Piccaluga, Bernard Harmegnies (2018), La distinction entre les paraphasies phonologiques et phonétiques dans l'aphasie : Étude acoustique des productions de 6 patients aphasiques, JEP
Typhanie Prince (2018), Déficit phonético-phonologique dans l'aphasie, JEP
Anne Tortel, Sophie Herment (2018), La voyelle inaccentuée e en position initiale : analyses acoustiques et enjeux pédagogiques pour l'anglais L2, JEP
Delfine Michaud, Nicolas Ballier (2018), Perception et production de /y/ et /u/ en français L2 chez l'apprenant anglophone débutant : étude de cas de leur catégorisation chez quatre locuteurs, JEP
Claudia Schweitzer, Christelle Dodane, Jan Lazar (2018), L'histoire des alphabets phonétiques du XVIIIe jusqu'à l'API, JEP
Pierre-Alexandre Broux, David Doukhan, Simon Petitrenaud, Sylvain Meignier, Jean Carrive (2018), Segmentation et Regroupement en Locuteurs: comment évaluer les corrections humaines, JEP
Marie-Charlotte Cuartero, Roxane Bertrand, Marie Vidailhet, David Grabli, Serge Pinto (2018), Organisation temporelle de la parole dans la dystonie généralisée primaire, JEP
Leslie Lemarchand, Andrea MacLeod, Mélanie Canault, Sophie Kern (2018), Développement de la parole et de la mastication : Evolution de la durée des cycles oscillatoires mandibulaires observés entre 8 et 14 mois chez 4 enfants québécois, JEP
Daria D'Alessandro, Cécile Fougeron (2018), Réduction de la coarticulation et vieillissement, JEP
Hasna Zaouali, Béatrice Vaxelaire, Christian Debry, Guy Bronner, Rudolph Sock (2018), Étude acoustique de voyelles tenues produites par des patients glossectomisés suite à un cancer endo-buccal, JEP
Marie Philippart De Foy, Véronique Delvaux, Kathy Huet, Morgane Monnier, Myriam Piccaluga, Bernard Harmegnies (2018), Un protocole de recueil de productions orales chez l'enfant préscolaire : une étude préliminaire auprès d'enfants bilingues, JEP
Fanny Guitard-Ivent (2018), Effets de la durée vocalique et du locuteur sur le degré de coarticulation C-à-V en français : étude sur grands corpus, JEP
Anaïs Delhoume, Emmanuel Ferragne (2018), Influence de la posture corporelle sur les paramètres acoustiques de la parole, JEP
Sébastien Ferreira, Jérôme Farinas, Julien Pinquier, Stéphane Rabant (2018), Prédiction a priori de la qualité de la transcription automatique de la parole bruitée, JEP
Ambre Davat, Véronique Aubergé, Gang Feng (2018), Vers un modèle du « toucher vocal » pour la communication ubiquïte, JEP
Alain Ghio, Muriel Lalain, Laurence Giusti, Gilles Pouchoulin, Danièle Robert, Marie Rebourg, Corinne Fredouille, Imed Laaridh, Virginie Woisard (2018), Une mesure d'intelligibilité par décodage acoustico-phonétique de pseudo-mots dans le cas de parole atypique, JEP
Mélanie Lancien, Nicolas Audibert, Cécile Fougeron (2018), Effet de la situation de parole sur la variabilité des voyelles en français, JEP
Imed Laaridh, Corinne Fredouille, Alain Ghio, Muriel Lalain, Virginie Woisard (2018), Evaluation automatique de l'intelligibilité de la parole dans le contexte de cancers de la tête et du cou, JEP
Philippe Martin (2018), Un algorithme de segmentation en phrasé, JEP
Yohann Meynadier, Sophie Dufour (2018), Ambiguïté temporaire des obstruantes voisées en parole chuchotée, JEP
Sahar Ghannay, Nathalie Camelin, Yannick Estève (2018), Représentations de phrases dans un espace continu spécifiques à la tâche de détection d'erreurs, JEP
Jennifer Krzonowski, François Pellegrino, Emmanuel Ferragne (2018), Etude acoustique de la production de voyelles de l'anglais par des apprenants francophones, JEP
Véronique Delvaux, Giancarlo Luxardo, Fabrice Hirsch (2018), Une histoire des JEP: 50 ans d'études sur la parole, JEP
Amandine Michelas, Sophie Dufour (2018), L'information accentuelle est-elle représentée dans le lexique mental des locuteurs du français ?, JEP
Kévin Vythelingum, Yannick Estève, Olivier Rosec (2018), Transcription phonétique automatique pour la synthèse de la parole, JEP
Salima Mdhaffar, Antoine Laurent, Yannick Estève (2018), Etude de performance des réseaux neuronaux récurrents dans le cadre de la campagne d'évaluation Multi-Genre Broadcast challenge 3 (MGB3), JEP
Marion Tellier, Gale Stam, Alain Ghio (2018), « Tout ça c'est abstrait » : Comment le degré d'abstraction d'un mot expliqué affecte-t-il la parole multimodale ?, JEP
Peter Ladefoged (1992), Knowing enough to analyze spoken languages, ICSLP
Renato De Mori, R. Kuhn (1992), Speech understanding strategies based on string classification trees, ICSLP
Patricia K. Kuhl (1992), Infants' perception and representation of speech: development of a new theory, ICSLP
Hajime Hirose (1992), The behavior of the larynx in spoken language production, ICSLP
Eduardo Lleida, José B. Marino, J. Salavedra, Antonio Bonafonte (1992), Syllabic fillers for Spanish HMM keyword spotting, ICSLP
Yasuhiro Komori, David Rainton (1992), Minimum error classification training for HMM-based keyword spotting, ICSLP
Gregory J. Clary, John H. L. Hansen (1992), A novel speech recognizer for keyword spotting, ICSLP
Herbert Gish, Kenney Ng, J. Robin Rohlicek (1992), Secondary processing using speech segments for an HMM word spotting system, ICSLP
Ming-Whei Feng, Baruch Mazor (1992), Continuous word spotting for applications in telecommunications, ICSLP
Maurizio Copperi (1992), A low bit-rate CELP coder based on multi-path search methods, ICSLP
Katsushi Seza, Hirohisa Tasaki, Shinya Takahashi (1992), Fully vector quantized ARMA analysis combined with glottal model for low bit rate coding, ICSLP
Erdal Paksoy, Wai-Yip Chan, Allen Gersho (1992), Vector quantization of speech LSF parameters with generalized product codes, ICSLP
Yair Shoham (1992), Low-rate speech coding based on time-frequency interpolation, ICSLP
Tomohiko Taniguchi, Yoshinori Tanaka, Yasuji Ohta, Fumio Amano (1992), Improved CELP speech coding at 4 kbit/s and below, ICSLP
Antonio Bonafonte, José B. Marino, Montse Pardas (1992), Efficient integration of coarticulation and lexical information in a finite state grammar, ICSLP
H. A. Leeper, A. P. Rochet, I. R. A. MacKay (1992), Characteristics of nasalance in Canadian speakers of English and French, ICSLP
Christine H. Shadle, Andre Moulinier, Christian U. Dobelke, Celia Scully (1992), Ensemble averaging applied to the analysis of fricative consonants, ICSLP
Andrew Slater, Sarah Hawkins (1992), Effects of stress and vowel context on velar stops in British English, ICSLP
E. Magno Caldognetto, K. Vagges, G. Ferrigno, Maria Grazia Busa (1992), Lip rounding coarticulation in Italian, ICSLP
P. M. T. Smeele, A. C. Sittig, Vincent J. van Heuven (1992), Intelligibility of audio-visually desynchronised speech: asymmetrical effect of phoneme position, ICSLP
Unto K. Laine (1992), Speech analysis using complex orthogonal auditory transform (COAT), ICSLP
Yuqing Gao, Taiyi Huang, Shaoyan Chen, Jean-Paul Haton (1992), Auditory model based speech processing, ICSLP
Gary N. Tajchman, Nathan Intrator (1992), Phonetic classification of TIMIT segments preprocessed with Lyon's cochlear model using a supervised/unsupervised hybrid neural network, ICSLP
Thomas Holton, Steven D. Love, Stephen P. Gill (1992), Formant and pitch-pulse detection using models of auditory signal processing, ICSLP
Hynek Hermansky, Nelson Morgan (1992), Towards handling the acoustic environment in spoken language processing, ICSLP
Alberto Ciaramella, Davide Clementino, Roberto Pacifici (1992), Real-time speaker-independent large-vocabulary CDHMM-based continuous telephonic speech recognizer, ICSLP
Matthew Lennig, Douglas Sharp, Patrick Kenny, Vishwa Gupta, Kristin Precoda (1992), Flexible vocabulary recognition of speech, ICSLP
Benjamin Chigier, Hong C. Leung (1992), The effects of signal representations, phonetic classification techniques, and the telephone network, ICSLP
Leon Gulikers, Rijk Willemse (1992), A lexicon for a text-to-speech system, ICSLP
Rijk Willemse, Leon Gulikers (1992), Word class assignment in a text-to-speech system, ICSLP
Gösta Bruce, Björn Granström, Kjell Gustafson, David House (1992), Aspects of prosodic phrasing in Swedish, ICSLP
K. P. H. Sullivan, Robert I. Damper (1992), Synthesis-by-analogy: a bilingual investigation using German and English, ICSLP
Leonard C. Manzara, David R. Hill (1992), Degas: a system for rule-based diphone speech synthesis, ICSLP
Shyam S. Agrawal, Kenneth N. Stevens (1992), Towards synthesis of Hindi consonants using KLSYN88, ICSLP
Louis C. W. Pols, SAM Partners (1992), Multi-lingual synthesis evaluation methods, ICSLP
Björn Granström, Petur Helgason, Hoskuldur Thrainsson (1992), The interaction of phonetics, phonology and morphology in an Icelandic text-to-speech system, ICSLP
Helmer Strik, Joop Jansen, Louis Boves (1992), Comparing methods for automatic extraction of voice source parameters from continuous speech, ICSLP
Jacques Koreman, Louis Boves, Bert Cranen (1992), The influence of linguistic variations on the voice source characteristics, ICSLP
Sarah K. Palmer, Jill House (1992), Dynamic voice source changes in natural and synthetic speech, ICSLP
Satoshi Imaizumi, Jan Gauffin (1992), Acoustic and perceptual modelling of the voice quality caused by fundamental frequency perturbation, ICSLP
Shigeru Kiritani, H. Imagawa, Hajime Hirose (1992), Vocal cord vibration during consonants - high-speed digital imaging using a fiberscope, ICSLP
David R. Traum, James F. Allen (1992), A "speech acts" approach to grounding in conversation, ICSLP
Sheila Meltzer (1992), Antecedent activation by empty pronominals in Spanish, ICSLP
Ron Smyth (1992), Multiple feature matching in pronoun resolution: a new look at parallel function, ICSLP
Keh-Yih Su, Jing-Shin Chang, Yi-Chung Lin (1992), A discriminative approach for ambiguity resolution based on a semantic score function, ICSLP
Nobuaki Minematsu, Sumio Ohno, Keikichi Hirose, Hiroya Fujisaki (1992), The influence of semantic and syntactic information on spoken sentence recognition, ICSLP
Lynne C. Nygaard, Mitchell S. Sommers, David B. Pisoni (1992), Effects of speaking rate and talker variability on the representation of spoken words in memory, ICSLP
Hugo Quene, Yvette Smits (1992), On the absence of word segmentation at "weak" syllables, ICSLP
Mitchell S. Sommers, Lynne C. Nygaard, David B. Pisoni (1992), Stimulus variability and the perception of spoken words: effects of variations in speaking rate and overall amplitude, ICSLP
James M. McQueen, Anne Cutler (1992), Words within words: lexical statistics and lexical access, ICSLP
Stephan Euler, Joachim Zinke (1992), Experiments on the use of the generalized probabilistic descent method in speech recognition, ICSLP
Ricardo de Cordoba, José M. Pardo, Jose Colás (1992), Improving and optimizing speaker independent, 1000 words speech recognition in Spanish, ICSLP
John F. Pitrelli, David Lubensky, Benjamin Chigier, Hong C. Leung (1992), Multiple-level evaluation of speech recognition systems, ICSLP
Tatsuya Kimura, Mitsuru Endo, Shoji Hiraoka, Katsuyuki Niyada (1992), Speaker independent word recognition using continuous matching of parameters in time-spectral form based on statistical measure, ICSLP
R. Roddeman, H. Drexler, Louis Boves (1992), Automatic derivation of lexical models for a very large vocabulary speech recognition system, ICSLP
Anne Cutler, Tony Robinson (1992), Response time as a metric for comparison of speech recognition by humans and machines, ICSLP
S. M. (Raj) Ulagaraj (1992), Characterization of directory assistance operator-customer dialogues in AGT Limited, ICSLP
Sheri Hunnicutt, Lynette Hirschman, Joseph Polifroni, Stephanie Seneff (1992), Analysis of the effectiveness of system error messages in a human-machine travel planning task, ICSLP
David Goodine, Lynette Hirschman, Joseph Polifroni, Stephanie Seneff, Victor Zue (1992), Evaluating interactive spoken language systems, ICSLP
Ute Jekosch (1992), The cluster-identification test, ICSLP
Patrick Kenny, R. Hollan, G. Boulianne, H. Garudadri, Yan-Ming Cheng, Matthew Lennig, Douglas O'Shaughnessy (1992), Experiments in continuous speech recognition with a 60,000 word vocabulary, ICSLP
G. Boulianne, Patrick Kenny, Matthew Lennig, Douglas O'Shaughnessy, Paul Mermelstein (1992), HMM training on unconstrained speech for large vocabulary, continuous speech recognition, ICSLP
David Rainton, Shigeki Sagayama (1992), Appropriate error criterion selection for continuous speech HMM minimum error training, ICSLP
Akito Nagai, Kenji Kita, Toshiyuki Hanazawa, Tadashi Suzuki, Tomohiro Iwasaki, Tsuyoshi Kawabata, Kunio Nakajima, Kiyohiro Shikano, Tsuyoshi Morimoto, Shigeki Sagayama, Akira Kurematsu (1992), Hardware implementation of realtime 1000-word HMM-LR continuous speech recognition, ICSLP
Madeleine Bates, Robert Bobrow, Pascale Fung, Robert Ingria, Francis Kubala, John Makhoul, Long Nguyen, Richard Schwartz, David Stallard (1992), Design and performance of HARC, the BBN spoken language understanding system, ICSLP
O. Shirotsuka, G. Kawai, Michael Cohen, J. Bernstein (1992), Performance of speaker-independent Japanese recognizer as a function of training set size and diversity, ICSLP
Kouichi Yamaguchi, Shigeki Sagayama, Kenji Kita, Frank K. Soong (1992), Continuous mixture HMM-LR using the A* algorithm for continuous speech recognition, ICSLP
Kenji Kita, Tsuyoshi Morimoto, Kazumi Ohkura, Shigeki Sagayama (1992), Continuously spoken sentence recognition by HMM-LR, ICSLP
Akinori Ito, Shozo Makino (1992), Word pre-selection using a redundant hash addressing method for continuous speech recognition, ICSLP
Andrej Ljolje, Michael D. Riley (1992), Optimal speech recognition using phone recognition and lexical access, ICSLP
Nick Waegner, Steve J. Young (1992), A trellis-based language model for speech recognition, ICSLP
Carla B. Zoltowski, Mary P. Harper, Leah H. Jamieson, Randall A. Helzerman (1992), PARSEC: a constraint-based framework for spoken language understanding, ICSLP
G. J. F. Jones, J. H. Wright, E. N. Wrigley (1992), The HMM interface with hybrid grammar-bigram language models for speech recognition, ICSLP
Atsuhiko Kai, Seiichi Nakagawa (1992), A frame-synchronous continuous speech recognition algorithm using a top-down parsing of context-free grammar, ICSLP
Fernando Pereira, David Roe (1992), Empirical properties of finite state approximations for phrase structure grammars, ICSLP
Stephanie Seneff, Helen Meng, Victor Zue (1992), Language modelling for recognition and understanding using layered bigrams, ICSLP
David Goddeau (1992), Using probabilistic shift-reduce parsing in speech recognition systems, ICSLP
Tim Howells, David Friedman, Mark Fanty (1992), Broca, an integrated parser for spoken language, ICSLP
P. V. S. Rao, Nandini Bondale (1992), Blank slate language processor for speech recognition, ICSLP
Eric Jackson (1992), Integrating two complementary approaches to spoken language understanding, ICSLP
Marcello Pelillo, Mario Refice (1992), Learning compatibility coefficients for word-class disambiguation relaxation processes, ICSLP
Kaichiro Hatazaki, Jun Noguchi, Akitoshi Okumura, Kazunaga Yoshida, Takao Watanabe (1992), INTERTALKER: an experimental automatic interpretation system using conceptual representation, ICSLP
Tsuyoshi Morimoto, Toshiyuki Takezawa, Kazumi Ohkura, Masaaki Nagata, Fumihiro Yato, Shigeki Sagayama, Akira Kurematsu (1992), Enhancement of ATR's spoken language translation system: SL-TRANS2, ICSLP
Tsuyoshi Morimoto (1992), Continuous speech recognition using a combination of syntactic constraints and dependency relationship, ICSLP
Roberto Pieraccini, Zakhar Gorelov, Esther Levin, Evelyne Tzoukermann (1992), Automatic learning in spoken language understanding, ICSLP
Michael P. Robb, Harold R. Bauer (1992), Prespeech and early speech coarticulation: American English and Japanese characteristics, ICSLP
Hiroaki Kojima, Kazuyo Tanaka, Satoru Hayamizu (1992), Formation of phonological concept structures from spoken word samples, ICSLP
Bernard L. Rochet, Fangxin Chen (1992), Acquisition of the French VOT contrasts by adult speakers of Mandarin Chinese, ICSLP
Michael Gasser (1992), Phonology as a byproduct of learning to recognize and produce words: a connectionist model, ICSLP
Michael S. Hurlburt, Judith C. Goodman (1992), The development of lexical effects on children's phoneme identifications, ICSLP
P. A. Halle, B. de Boysson-Bardies (1992), Word recognition before production of first words?, ICSLP
Toshisada Deguchi, Shigeru Kiritani, Akiko Hayashi, Fumi Katoh (1992), The effect of fundamental frequency for vowel perception in infants, ICSLP
William C. Treurniet (1992), Objective measurement of phoneme similarity, ICSLP
Michael D. Riley, Andrej Ljolje (1992), Recognizing phonemes vs. recognizing phones: a comparison, ICSLP
B. L. Derwing, Terrance M. Nearey, R. A. Beinert, T. A. Bendrien (1992), On the role of the segment in speech processing by human listeners: evidence from speech perception and from global sound similarity judgments, ICSLP
Grace E. Wiebe, Bruce L. Derwing (1992), The syllabic status of postvocalic resonants in an unwritten Low German dialect, ICSLP
Agaath Sluijter, Vincent J. van Heuven, A. H. Neijt (1992), The influence of focus distribution and lexical stress on the temporal organisation of the syllable, ICSLP
Agaath Sluijter, Jacques Terken (1992), The development and perceptive evaluation of a model for paragraph intonation in Dutch, ICSLP
Nobuyoshi Kaiki, Yoshinori Sagisaka (1992), Pause characteristics and local phrase-dependency structure in Japanese, ICSLP
Bernd Möbius, Matthias Pätzold (1992), F0 synthesis based on a quantitative model of German intonation, ICSLP
K. Ross, Mari Ostendorf, Stefanie Shattuck-Hufnagel (1992), Factors affecting pitch accent placement, ICSLP
Marc Swerts, Ronald Geluykens, Jacques Terken (1992), Prosodic correlates of discourse units in spontaneous speech, ICSLP
Shin'ya Nakajima, James F. Allen (1992), Prosody as a cue for discourse structure, ICSLP
Barbara Grosz, Julia Hirschberg (1992), Some intonational characteristics of discourse structure, ICSLP
Hiroya Fujisaki, Keikichi Hirose, Haitao Lei (1992), Prosody and syntax in spoken sentences of standard Chinese, ICSLP
Kathleen Bishop (1992), Modeling sentential stress in the context of a large vocabulary continuous speech recognizer, ICSLP
Kazumi Ohkura, Masahide Sugiyama, Shigeki Sagayama (1992), Speaker adaptation based on transfer vector field smoothing with continuous mixture density HMMs, ICSLP
Tatsuo Matsuoka, Kiyohiro Shikano (1992), Speaker adaptation by modifying mixture coefficients of speaker-independent mixture Gaussian HMMs, ICSLP
Yifan Gong, Olivier Siohan, Jean-Paul Haton (1992), Minimization of speech alignment error by iterative transformation for speaker adaptation, ICSLP
Hiroaki Hattori, Shigeki Sagayama (1992), Vector field smoothing principle for speaker adaptation, ICSLP
Tetsunori Kobayashi, Katsuhiko Shirai (1992), Spectral mapping onto probabilistic domain using neural networks and its application to speaker adaptive phoneme recognition, ICSLP
Jean-Paul Lefevre, Mervyn A. Jack, Claudio Maggio, Mario Refice, Fabio Gabrieli, Michelina Savino, Luigi Santangelo (1992), An interactive system for automated pronunciation improvement, ICSLP
Edmund Rooney, Steven M. Hiller, John Laver, Mervyn A. Jack (1992), Prosodic features for automated pronunciation improvement in the SPELL system, ICSLP
Maria-Gabriella Di Benedetto, Fabrizio Carraro, Steven M. Hiller, Edmund Rooney (1992), Vowels pronunciation assessment in the SPELL system, ICSLP
Franck Poirier (1992), Self-organizing map with supervision for speech recognition, ICSLP
Gregory R. De Haan, Ömer Egecioglu (1992), Topology preservation for speech recognition, ICSLP
Gary Bradshaw, Alan Bell (1992), Towards the performance limits of connectionist feature detectors, ICSLP
Helge B. D. Sorensen (1992), Context-dependent and -independent self-structuring hidden control models for speech recognition, ICSLP
Marie-José Caraty, Claude Montacié, Claude Barras (1992), Integration of frequential and temporal structurations in a symbolic learning system, ICSLP
E. Monte, José B. Marino, Eduardo Lleida (1992), Smoothing hidden Markov models by means of a self organizing feature map, ICSLP
Jyri Mantysalo, Kari Torkkola, Teuvo Kohonen (1992), LVQ-based speech recognition with high-dimensional context vectors, ICSLP
Mikko Kurimo, Kari Torkkola (1992), Application of self-organizing maps and LVQ in training continuous density hidden Markov models for phonemes, ICSLP
Paul Dalsgaard, Ove Andersen (1992), Identification of mono- and poly-phonemes using acoustic-phonetic features derived by a self-organising neural network, ICSLP
Pekka Utela, Samuel Kaski, Kari Torkkola (1992), Using phoneme group specific LVQ-codebooks with HMMs, ICSLP
Naoto Iwahashi, Yoshinori Sagisaka (1992), Speech segment network approach for an optimal synthesis unit set, ICSLP
Yoshinori Sagisaka, Nobuyoshi Kaiki, Naoto Iwahashi, Katsuhiko Mimura (1992), ATR μ-talk speech synthesis system, ICSLP
Bert Van Coile, Steven Leys, Luc Mortier (1992), On the development of a name pronunciation system, ICSLP
Inger Karlsson (1992), Consonants for female speech synthesis, ICSLP
Jan P. H. van Santen (1992), Diagnostic perceptual experiments for text-to-speech system evaluation, ICSLP
Marcello Balestri, Enzo Foti, Luciano Nebbia, Mario Oreglia, Pier Luigi Salza, Stefano Sandri (1992), Comparison of natural and synthetic speech intelligibility for a reverse telephone directory service, ICSLP
Richard Sproat, Julia Hirschberg, David Yarowsky (1992), A corpus-based synthesizer, ICSLP
Tomohisa Hirokawa, Kenzo Itoh, Hirokazu Sato (1992), High quality speech synthesis based on wavelet compilation of phoneme segments, ICSLP
David R. Williams, Corine A. Bickley, Kenneth N. Stevens (1992), Inventory of phonetic contrasts generated by high-level control of a formant synthesizer, ICSLP
Mikael Goldstein, Ove Till (1992), Is % overall error rate a valid measure of speech synthesiser and natural speech performance at the segmental level?, ICSLP
Willy Jongenburger, Renee van Bezooijen (1992), Text-to-speech conversion for Dutch: comprehensibility and acceptability, ICSLP
Masayo Katoh, Shin'ichiro Hashimoto (1992), The rhythm rules in Japanese based on the centers of energy gravity of vowels, ICSLP
Kenzo Itoh, Tomohisa Hirokawa, Hirokazu Sato (1992), Segmental power control for Japanese speech synthesis, ICSLP
Jean Schoentgen (1992), Glottal waveform synthesis with Volterra shapers, ICSLP
Ken Ceder, Bertil Lyberg (1992), Yet another rule compiler for text-to-speech conversion?, ICSLP
Kazuhiko Iwata, Yukio Mitome (1992), Prosody generation models constructed by considering speech tempo influence on prosody, ICSLP
Alex I. C. Monaghan (1992), Extracting microprosodic information from diphones - a simple way to model segmental effects on prosody for synthetic speech, ICSLP
Arjan van Hessen (1992), Generation of natural sounding speech stimuli by means of linear cepstral interpolation, ICSLP
W. Nick Campbell, Colin Wightman (1992), Prosodic encoding of syntactic structure for speech synthesis, ICSLP
Susan R. Hertz, Marie K. Huffman (1992), A nucleus-based timing model applied to multi-dialect speech synthesis by rule, ICSLP
Jill House, Nick Youd (1992), Evaluating the prosody of synthesized utterances within a dialogue system, ICSLP
Marcel Tatham, Eric Lewis (1992), Prosodics in a syllable-based text-to-speech synthesis system, ICSLP
R. Belrhali, Véronique Aubergé, Louis-Jean Boe (1992), From lexicon to rules: toward a descriptive method of French text-to-phonetics transcription, ICSLP
Marianne Elmlund, Ida Frehr, Niels Reinholt Petersen (1992), Formant transformation from male to female synthetic voices, ICSLP
P. A. Rentzepopoulos, George K. Kokkinakis (1992), Multilingual phoneme to grapheme conversion system based on HMM, ICSLP
Noriyo Hara, Hisayoshi Tsubaki, Hisashi Wakita (1992), Fundamental frequency control using linguistic information, ICSLP
Andrew P. Breen (1992), A comparison of statistical and rule based methods of determining segmental durations, ICSLP
J. R. Andrews, K. M. Curtis, Volker Kraft (1992), Generation and extraction of high quality synthesis units, ICSLP
T. Boogaart, Kim Silverman (1992), Evaluating the overall comprehensibility of speech synthesizers, ICSLP
Olivier Boeffard, Laurent Miclet, S. White (1992), Automatic generation of optimized unit dictionaries for text to speech synthesis, ICSLP
Hideki Kasuya, Seiki Kasuya (1992), Relationships between syllable, word and sentence intelligibilities of synthetic speech, ICSLP
David R. Hill, Craig-Richard Schock, Leonard C. Manzara (1992), Unrestricted text-to-speech revisited: rhythm and intonation, ICSLP
Anton J. Rozsypal (1992), Wavelet speech synthesizer in the classroom and speech laboratory, ICSLP
Thomas Portele, Birgit Steffan, Rainer Preuß, Walter F. Sendlmeier, Wolfgang Hess (1992), HADIFIX - a speech synthesis system for German, ICSLP
Cristina Delogu, S. Conte, A. Paoloni, C. Sementina (1992), Two different methodologies for evaluating the comprehension of synthetic passages, ICSLP
Carlos Gussenhoven, Toni Rietveld (1992), A target-interpolation model for the intonation of Dutch, ICSLP
Katharine Davis, Patricia K. Kuhl (1992), Best exemplars of English velar stops: a first report, ICSLP
Kenneth N. Stevens, Sharon Y. Manuel, Stefanie Shattuck-Hufnagel, Sharlene Liu (1992), Implementation of a model for lexical access based on features, ICSLP
Dieter Huber (1992), Perception of aperiodic speech signals, ICSLP
Hiroaki Kato, Minoru Tsuzaki, Yoshinori Sagisaka (1992), Acceptability and discrimination threshold for distortion of segmental duration in Japanese words, ICSLP
Anne Bonneau, Sylvie Coste, Linda Djezzar, Yves Laprie (1992), Two level acoustic cues for consistent stop identification, ICSLP
Rolf Carlson, James Glass (1992), Vowel classification based on analysis-by-synthesis, ICSLP
Maria-Gabriella Di Benedetto, Jean-Sylvain Lienard (1992), Extrinsic normalization of vowel formant values based on cardinal vowels mapping, ICSLP
Terrance M. Nearey (1992), Applications of generalized linear modeling to vowel data, ICSLP
David B. Pisoni (1992), Some comments on invariance, variability and perceptual normalization in speech perception, ICSLP
Stephen D. Goldinger, Thomas J. Palmeri, David B. Pisoni (1992), Words and voices: perceptual details are preserved in lexical representations, ICSLP
Yan Ming Cheng, Douglas O'Shaughnessy, Peter Kabal (1992), Speech enhancement using a statistically derived filter mapping, ICSLP
V. L. Beattie, Steve J. Young (1992), Hidden Markov model state-based cepstral noise compensation, ICSLP
Guy J. Brown, Martin P. Cooke (1992), A computational model of auditory scene analysis, ICSLP
S. Nandkumar, John H. L. Hansen, Robert J. Stets (1992), A new dual-channel speech enhancement technique with application to CELP coding in noise, ICSLP
Asunción Moreno, José A. R. Fonollosa (1992), Cumulant-based voicing decision in noise corrupted speech, ICSLP
Yolande Anglade, Dominique Fohr, Jean-Claude Junqua (1992), Selectively trained neural networks for the discrimination of normal and Lombard speech, ICSLP
Aaron E. Rosenberg, Joel DeLong, Chin-Hui Lee, Biing-Hwang Juang, Frank K. Soong (1992), The use of cohort normalized scores for speaker verification, ICSLP
Tomoko Matsui, Sadaoki Furui (1992), Speaker recognition using concatenated phoneme models, ICSLP
Younes Bennani (1992), Speaker identification through a modular connectionist architecture: evaluation on the TIMIT database, ICSLP
Claude Montacié, Jean-Luc Le Floch (1992), AR-vector models for free-text speaker recognition, ICSLP
Florian Schiel (1992), Rapid non-supervised speaker adaptation of semicontinuous hidden Markov models, ICSLP
D. Ederveen, Louis Boves (1992), Rule-based recognition of phoneme classes, ICSLP
Jie Yi, Kei Miki (1992), A new method of speaker-independent speech recognition using multiphone HMM, ICSLP
Myoung-Wan Koo, Chong-Kwan Un (1992), A speaker adaptation based on corrective training and learning vector quantization, ICSLP
Katsuhiko Shirai, Shigeki Okawa, Tetsunori Kobayashi (1992), Phoneme recognition in continuous speech based on mutual information considering phonemic duration and connectivity, ICSLP
Shinji Koga, Ryosuke Isotani, Satoshi Tsukada, Kazunaga Yoshida, Kaichiro Hatazaki, Takao Watanabe (1992), A real-time speaker-independent continuous speech recognition system based on demi-syllable units, ICSLP
Saeed V. Vaseghi, Ben P. Milner (1992), Speech recognition in noisy environments, ICSLP
Fergus R. McInnes (1992), An enhanced interpolation technique for context-specific probability estimation in speech and language modelling, ICSLP
Lorenzo Fissore, Pietro Laface, G. Micca, G. Sperto (1992), Channel adaptation for a continuous speech recognizer, ICSLP
S. Cifuentes, J. Colas, M. Savoji, José M. Pardo (1992), A new algorithm for connected digit recognition, ICSLP
Günther Ruske, Bernd Plannerer, Tanja Schultz (1992), Stochastic modeling of syllable-based units for continuous speech recognition, ICSLP
David M. Goblirsch, Toffee A. Albina (1992), HARK: an experimental speech recognition system, ICSLP
Akito Nagai, Jun-Ichi Takami, Shigeki Sagayama (1992), The SSS-LR continuous speech recognition system: integrating SSS-derived allophone models and a phoneme-context-dependent LR parser, ICSLP
Shinsuke Sakai, Michael Phillips (1992), J-SUMMIT: a Japanese segment-based speech recognition system, ICSLP
Shinobu Mizuta, Kunio Nakajima (1992), Optimal discriminative training for HMMs to recognize noisy speech, ICSLP
Shingo Kuroiwa, Kazuya Takeda, Fumihiro Yato, Seiichi Yamamoto, Kunihiko Owa, Makoto Shozakai, Ryuji Matsumoto (1992), Architecture and algorithms of a real-time word recognizer for telephone input, ICSLP
Hiroyasu Kuwano, Kazuya Nomura, Atsushi Ookumo, Shoji Hiraoka, Taisuke Watanabe, Katsuyuki Niyada (1992), Speaker independent speech recognition method using word spotting technique and its application to VCR programming, ICSLP
S. Lennon, E. Ambikairajah (1992), Transputer implementation of front-end processors for speech recognition systems, ICSLP
Yasuhiro Minami, Tatsuo Matsuoka, Kiyohiro Shikano (1992), Phoneme HMM evaluation algorithm without phoneme labeling, ICSLP
A. Noll, H. Bergmann, H. H. Hamer, Annedore Paeseler, H. Tomaschewski (1992), Architecture of a configurable application interface for speech recognition systems, ICSLP
Mark Fanty, John Pochmara, Ron Cole (1992), An interactive environment for speech recognition research, ICSLP
Y. Abe, K. Nakajima (1992), An approach to unlimited vocabulary continuous speech recognition based on context-dependent phoneme modeling, ICSLP
Chuck Wooters, Nelson Morgan (1992), Acoustic subword models in the Berkeley Restaurant Project, ICSLP
Claus Nedergaard Jacobsen (1992), SIRtrain, an open standard environment for CHMM recognizer development, ICSLP
Yutaka Kobayashi, Yasuhisa Niimi (1992), Segmented trellis algorithms for the continuous speech recognition, ICSLP
Bo Xu, Z. W. Lin, Taiyi Huang, D. X. Xu, Y. Q. Gao (1992), A 46,500 word Chinese speech recognition system, ICSLP
Dao Wen Chen (1992), Study of the time extension flat net for speech recognition, ICSLP
Frank Fallside (1992), A hidden Markov model structure for the acquisition of speech by machine, ASM, ICSLP
Yasuyuki Masai, Shin'ichi Tanaka, Tsuneo Nitta (1992), Speaker-independent keyword recognition based on SMQ/HMM, ICSLP
Regis Cardin, Diane Goupil, Roxane Lacouture, Evelyne Millien, Charles Snow, Yves Normandin (1992), CRIM's spontaneous speech recognition system for the ATIS task, ICSLP
F. Brugnara, Renato De Mori, D. Giuliani, Maurizio Omologo (1992), Improved connected digit recognition using spectral variation functions, ICSLP
Andrew Tridgell, Bruce Millar, Kim-Anh Do (1992), Alternative preprocessing techniques for discrete hidden Markov model phoneme recognition, ICSLP
Gerhard Th. Niedermair (1992), Linguistic modelling in the context of oral dialogue, ICSLP
François Andry (1992), Static and dynamic predictions: a method to improve speech understanding in cooperative dialogues, ICSLP
Paul Heisterkamp, Scott McGlashan, Nick Youd (1992), Dialogue semantics for an oral dialogue system, ICSLP
Masaaki Nagata (1992), Using pragmatics to rule out recognition errors in cooperative task-oriented dialogues, ICSLP
Yoichi Takebayashi, Hiroyuki Tsubo, Yoichi Sadamoto, Hideki Hashimoto, Hideaki Shinchi (1992), A real-time speech dialogue system using spontaneous speech understanding, ICSLP
Li-chiung Yang (1992), A semantic and pragmatic analysis of tone and intonation in Mandarin Chinese, ICSLP
Yoshimasa Tsukuma (1992), On prosodic features in speech - comparative studies between Japanese and standard Chinese, ICSLP
W. Nick Campbell (1992), Prosodic encoding of English speech, ICSLP
Gunnar Fant, Anita Kruckenberg, Lennart Nord (1992), Prediction of syllable duration, speech rate and tempo, ICSLP
Rolf Carlson, Björn Granström, Lennart Nord (1992), Experiments with emotive speech - acted utterances and synthesized replicas, ICSLP
J. Caspers, Vincent J. van Heuven (1992), Phonetic properties of Dutch accent lending pitch movements under time pressure, ICSLP
Jacques Terken, Karin van den Hombergh (1992), Judgments of relative prominence for adjacent and non-adjacent accents, ICSLP
F. Beaugendre, Christophe d'Alessandro, Anne Lacheret-Dujour, Jacques Terken (1992), A perceptual study of French intonation, ICSLP
Mark Liberman, J. Michael Schultz, Soonhyun Hong, Vincent Okeke (1992), The phonetics of Igbo tone, ICSLP
Stefanie Shattuck-Hufnagel (1992), Stress shift as pitch accent placement: within-word early accent placement in American English, ICSLP
Katherine Morton (1992), Adding emotion to synthetic speech dialogue systems, ICSLP
Cari Spring, Donna Erickson, Thomas Call (1992), Emotional modalities and intonation in spoken language, ICSLP
Tatiana Slama-Cazacu (1992), Are any "press-conferences", "interviews" or "dialogues" true dialogues?, ICSLP
Arthur J. Bronstein (1992), The categorization of the dialects and speech styles of North American English, ICSLP
Eleonora Blaauw (1992), Phonetic differences between read and spontaneous speech, ICSLP
Maxine Eskenazi (1992), Changing speech styles: strategies in read speech and casual and careful spontaneous speech, ICSLP
Noriko Umeda, Karen Wallace, Josephine Horna (1992), Usage of words and sentence structures in spontaneous versus text material, ICSLP
Nancy A. Daly, Victor Zue (1992), Statistical and linguistic analyses of F0 in read and spontaneous speech, ICSLP
Linda Shockey, Edda Farnetani (1992), Spontaneous speech in English and Italian, ICSLP
Claude Lefebvre, Dariusz A. Zwierzyński, David R. Starks, Gary Birch (1992), Further optimisation of a robust IMELDA speech recogniser for applications with severely degraded speech, ICSLP
Richard M. Stern, Fu-Hua Liu, Yoshiaki Ohshima, Thomas M. Sullivan, Alejandro Acero (1992), Multiple approaches to robust speech recognition, ICSLP
Tadashi Kitamura, Satoshi Ando, Etsuro Hayahara (1992), Speaker-independent spoken digit recognition in noisy environments using dynamic spectral features and neural networks, ICSLP
Douglas A. Cairns, John H. L. Hansen (1992), ICARUS: an Mwave-based real-time speech recognition system in noise and Lombard effect, ICSLP
C. Mokbel, L. Barbier, Y. Kerlou, Gérard Chollet (1992), Word recognition in the car: adapting recognizers to new environments, ICSLP
P. Meyer, Hans-Wilhelm Rühl, L. L. M. Vogten (1992), German announcements using synthetic speech: the Gauss system, ICSLP
Mervyn A. Jack, J. C. Foster, F. W. Stentiford (1992), Intelligent dialogues in automated telephone services, ICSLP
Palle Bach Nielsen, Anders Baekgaard (1992), Experience with a dialogue description formalism for realistic applications, ICSLP
Solomon Lerner, Baruch Mazor (1992), Compensating for additive-noise in automatic speech recognition, ICSLP
Sho-ichi Matsunaga, Toshiaki Tsuboi, Tomokazu Yamada, Kiyohiro Shikano (1992), Continuous speech recognition for medical diagnoses using a character trigram model, ICSLP
Chengxiang Lu, Takayoshi Nakai, Hisayoshi Suzuki (1992), A three-dimensional FEM simulation of the effects of the vocal tract shape on the transfer function, ICSLP
Kiyoshi Oshima, Vincent L. Gracco (1992), Mandibular contributions to speech production, ICSLP
Masafumi Matsumura (1992), Measurement of three-dimensional shapes of vocal tract and nasal cavity using magnetic resonance imaging technique, ICSLP
Shigeru Kiritani, Hajime Hirose, Kikuo Maekawa, Tsutomu Sato (1992), Electromyographic studies on the production of pitch contour in accentless dialects in Japanese, ICSLP
Yorinobu Sonoda, Kohichi Ogata (1992), Improvements of magnetometer sensing system for monitoring tongue point movements during speech, ICSLP
Paavo Alku (1992), Inverse filtering of the glottal waveform using the Itakura-Saito distortion measure, ICSLP
Kunitoshi Motoki, Nobuhiro Miki (1992), Measurement of intraoral sound pressure distributions of Japanese vowels, ICSLP
Alain Marchal, William J. Hardcastle, K. Nicolaidis, N. Nguyen, F. Gibbon (1992), Non-linear annotation of multi-channel speech data, ICSLP
Shingo Fujiwara, Yasuhiro Komori, Masahide Sugiyama (1992), A phoneme labelling workbench using HMM and spectrogram reading knowledge, ICSLP
Michael Phillips, Victor Zue (1992), Automatic discovery of acoustic measurements for phonetic classification, ICSLP
Katunobu Itou, Satoru Hayamizu, Hozumi Tanaka (1992), Detection of unknown words and automatic estimation of their transcriptions in continuous speech recognition, ICSLP
F. Brugnara, D. Falavigna, Maurizio Omologo (1992), A HMM-based system for automatic segmentation and labeling of speech, ICSLP
Robert W. P. Luk, Robert I. Damper (1992), A modification of the Viterbi algorithm for stochastic phonographic transduction, ICSLP
Paul C. Bagshaw, Briony J. Williams (1992), Criteria for labelling prosodic aspects of English speech, ICSLP
Yifan Gong, Jean-Paul Haton (1992), DTW-based phonetic labeling using explicit phoneme duration constraints, ICSLP
Kim Silverman, Mary Beckman, John Pitrelli, Mari Ostendorf, Colin Wightman, Patti Price, Janet Pierrehumbert, Julia Hirschberg (1992), ToBI: a standard for labeling English prosody, ICSLP
Barbara Eisen, Hans-Günther Tillmann, Christoph Draxler (1992), Consistency of judgements in manual labelling of phonetic segments: the distinction between clear and unclear cases, ICSLP
Gunnar Fant (1992), Vocal tract area functions of Swedish vowels and a new three-parameter model, ICSLP
Jean-Claude Junqua (1992), Acoustic and production pilot studies of speech vowels produced in noise, ICSLP
Yves Laprie, Marie-Odile Berger (1992), Active models for regularizing formant trajectories, ICSLP
Rene Carré, Samir Chennoukh, Mohamad Mrayati (1992), Vowel-consonant-vowel transitions: analysis, modeling, and synthesis, ICSLP
Maureen Stone, Subhash Lele (1992), Representing the tongue surface with curve fits, ICSLP
Katherine S. Harris, Eric Vatikiotis-Bateson, Peter J. Alfonso (1992), Muscle forces in vowel vocal tract formation, ICSLP
Makoto Hirayama, Eric Vatikiotis-Bateson, Mitsuo Kawato, Kiyoshi Honda (1992), Neural network modeling of speech motor control, ICSLP
Eric Vatikiotis-Bateson, Makoto Hirayama, Kiyoshi Honda, Mitsuo Kawato (1992), The articulatory dynamics of running speech: gestures from phonemes?, ICSLP
Patricia Keating, B. Blankenship, D. Byrd, E. Flemming, Y. Todaka (1992), Phonetic analyses of the TIMIT corpus of American English, ICSLP
Dani Byrd (1992), Sex, dialects, and reduction, ICSLP
Manjari Ohala, John J. Ohala (1992), Phonetic universals and Hindi segment duration, ICSLP
Donna Erickson, Osamu Fujimura (1992), Acoustic and articulatory correlates of contrastive emphasis in repeated corrections, ICSLP
Gary N. Tajchman, Marcia A. Bush (1992), Effects of context and redundancy in the perception of naturally produced English vowels, ICSLP
Ronald Cole, Krist Roginski, Mark Fanty (1992), A telephone speech database of spelled and spoken names, ICSLP
Yeshwant K. Muthusamy, Ronald A. Cole, Beatrice T. Oshika (1992), The OGI multi-language telephone speech corpus, ICSLP
Douglas B. Paul, Janet M. Baker (1992), The design for the Wall Street Journal-based CSR corpus, ICSLP
Lynette Hirschman (1992), Multi-site data collection for a spoken language corpus - MAD COW, ICSLP
Michael Phillips, James Glass, Joseph Polifroni, Victor Zue (1992), Collection and analyses of WSJ-CSR corpus at MIT, ICSLP
Victor Abrash, Horacio Franco, Michael Cohen, Nelson Morgan, Yochai Konig (1992), Connectionist gender adaptation in a hybrid neural network / hidden Markov model speech recognition system, ICSLP
Michael Cohen, Horacio Franco, Nelson Morgan, David Rumelhart, Victor Abrash (1992), Hybrid neural network/hidden Markov model continuous-speech recognition, ICSLP
Gernot A. Fink, Franz Kummert, Gerhard Sagerer, Ernst-Günter Schukat-Talamazzini, Heinrich Niemann (1992), Semantic hidden Markov networks, ICSLP
Ilse Lehiste, Donna Erickson (1992), Hesitation sounds: is there coarticulation across pause?, ICSLP
Corine A. Bickley, Sheri Hunnicutt (1992), Acoustic analysis of laughter, ICSLP
Douglas O'Shaughnessy (1992), Analysis of false starts in spontaneous speech, ICSLP
Robin J. Lickley, Ellen G. Bard (1992), Processing disfluent speech: recognising disfluency before lexical access, ICSLP
Philippe Morin, Jean-Claude Junqua, Jean-Marie Pierrel (1992), A flexible multimodal dialogue architecture independent of the application, ICSLP
Catia Cucchiarini, Renee van Bezooijen (1992), Familiarity with the language transcribed and context as determinants of intratranscriber agreement, ICSLP
Elizabeth E. Shriberg, Robin J. Lickley (1992), Intonation of clause-internal filled pauses, ICSLP
Elizabeth Wade, Elizabeth Shriberg, Patti Price (1992), User behaviors affecting speech recognition, ICSLP
Sharon Y. Manuel, Stefanie Shattuck-Hufnagel, Marie K. Huffman, Kenneth N. Stevens, Rolf Carlson, Sheri Hunnicutt (1992), Studies of vowel and consonant reduction, ICSLP
Noriko Umeda (1992), Formant frequencies of vowels in English function words, ICSLP
Christian Benoît, Tayeb Mohamadi (1992), The lip benefit: auditory and visual intelligibility of French speech in noise, ICSLP
Anne-Marie Öster (1992), Phonological assessment of deaf children's productive knowledge as a basis for speech-training, ICSLP
Hideaki Seki, Akiko Hayashi, Satoshi Imaizumi, Takehiko Harada, Hiroshi Hosoi (1992), Factors affecting voicing distinction of stops for the hearing impaired, ICSLP
Arthur Boothroyd, Robin S. Waldstein, Eddy Yeung (1992), Investigations into the auditory F0 speechreading enhancement effect using a sinusoidal replica of the F0 contour, ICSLP
Francesco Cutugno (1992), Some considerations on pitch and timing control in deaf children, ICSLP
Shari R. Baum (1992), Rate of speech effects in aphasia: an acoustic analysis of voice onset time, ICSLP
Parth M. Bhatt (1992), Fundamental frequency attributes following unilateral left or right temporal lobe lesion, ICSLP
Hiroshi Hosoi, Satoshi Imaizumi, Akiko Hayashi, Takehiko Harada, Hideaki Seki (1992), Cue extraction and integration in speech perception for the hearing impaired, ICSLP
Anna K. Nabelek (1992), The relationship between spectral details in naturally produced vowels and identification errors in noise and reverberation, ICSLP
Donald G. Jamieson, Leonard Cornelisse (1992), Speech processing effects on intelligibility for hearing-impaired listeners, ICSLP
Mo Fuyuan, Li Changli, Chen Tao (1992), Chinese recognition and synthesis system based on Chinese syllables, ICSLP
Hirofumi Yogo, Naoki Inagaki (1992), Accelerated stochastic approximation method based parameter estimation of monosyllables and their recognition using a neural network, ICSLP
Toomas Altosaar, Matti Karjalainen (1992), Diphone-based speech recognition using time-event neural networks, ICSLP
Giovanni Flammia, Paul Dalsgaard, Ove Andersen, Borge Lindberg (1992), Segment based variable frame rate speech analysis and recognition using a spectral variation function, ICSLP
Christian Benoît (1992), Intelligibility of the French spoken in France compared across listeners from France and from the Ivory Coast, ICSLP
Julie Brousseau, Sally Anne Fox (1992), Dialect-dependent speech recognizers for Canadian and European French, ICSLP
Yeshwant K. Muthusamy, Ronald A. Cole (1992), Automatic segmentation and identification of ten languages using telephone speech, ICSLP
Seiichi Nakagawa, Yoshio Ueda, Takashi Seino (1992), Speaker-independent, text-independent language identification by HMM, ICSLP
Shuichi Itahashi, Tsutomu Yamashita (1992), A discrimination method between Japanese dialects, ICSLP
B. Boyanov, Gérard Chollet (1992), Pathological voice analysis using cepstra, bispectra and group delay functions, ICSLP
Qianje Fu, Peyu Xia, Ren Hua Wang (1992), Lateralization of speech sounds by binaural distributing processing, ICSLP
H. H. Rump (1992), Timing of pitch movements and perceived vowel duration, ICSLP
J. P. Liu, G. Baudoin, Gérard Chollet (1992), Studies of glottal excitation and vocal tract parameters using inverse filtering and a parameterized input model, ICSLP
Dennis Norris, Brit van Ooyen, Anne Cutler (1992), Speeded detection of vowels and steady-state consonants, ICSLP
Elzbieta B. Slawinski (1992), Temporal factors in the perception of consonants for different age and hearing impairment groups, ICSLP
Abeer Alwan (1992), The role of F3 and F4 in identifying place of articulation for stop consonants, ICSLP
Thomas R. Sawallis (1992), A new measure for perceptual weight of acoustic cues: an experiment on voicing in French intervocalic [t,d], ICSLP
Alan A. Wrench, Mervyn A. Jack, John Laver, M. S. Jackson, D. S. Soutar, A. G. Robertson, J. MacKenzie (1992), Objective speech quality assessment in patients with intra-oral cancers: voiceless fricatives, ICSLP
Bruce Connell (1992), Tongue contact, active articulators, and coarticulation, ICSLP
Makio Kashino, Astrid van Wieringen, Louis C. W. Pols (1992), Cross-languages differences in the identification of intervocalic stop consonants by Japanese and Dutch listeners, ICSLP
Minoru Tsuzaki (1992), Effects of typicality and interstimulus interval on the discrimination of speech stimuli: within-subject comparison, ICSLP
Ronald A. Cole, Yeshwant K. Muthusamy (1992), Perceptual studies on vowels excised from continuous speech, ICSLP
Raymond S. Weitzman (1992), The relative perceptual salience of spectral and durational differences, ICSLP
Florien J. Koopmans-van Beinum (1992), Can 'level words' from one speaking style become 'peaks' when spliced into another speaking style?, ICSLP
Beverley Gable, Helen Nemeth, Martin Haran (1992), Speech errors and task demand, ICSLP
John H. Esling, B. Craig Dickson, Roy C. Snell (1992), Analysis of phonation type using laryngographic techniques, ICSLP
Sumi Shigeno (1992), Effect of prototypes of vowels on speech perception in Japanese and English, ICSLP
Tomo-o Morohashi, Tetsuya Shimamura, Hiroyuki Yashima, Jouji Suzuki (1992), Characteristics of voice picked up from outer skin of larynx, ICSLP
Igor V. Nabelek (1992), Coding of voicing in whispered plosives, ICSLP
Margaret F. Cheesman, Shelly Lawrence, Allison Appleyard (1992), Performance on a nonsense syllable test using the articulation index, ICSLP
Donald G. Jamieson, Ketan Ramji, Issam Kheirallah, Terrance M. Nearey (1992), CSRE: a speech research environment, ICSLP
Kazue Hata, Yoko Hasegawa (1992), A study of F0 reset in naturally-read utterances in Japanese, ICSLP
H. Samuel Wang, Fu-Dong Chiu (1992), On the nature of tone sandhi rules in Taiwanese, ICSLP
Geoffrey S. Nathan (1992), How shallow is phonology: declarative phonologies meet fast speech, ICSLP
Junko Hosaka, Toshiyuki Takezawa, Noriyoshi Uratani (1992), Analyzing postposition drops in spoken Japanese, ICSLP
Jialu Zhang, Xinghui Hu (1992), Fundamental frequency patterns of Chinese in different speech modes, ICSLP
Knut Kvale, Arne Kjell Foldvik (1992), The multifarious r-sound, ICSLP
Zita McRobbie-Utasi (1992), The role of preaspiration duration in the voicing contrast in Skolt Sami, ICSLP
Eiji Yamada (1992), Parameter setting for abstract stress in Tokyo Japanese, ICSLP
Georg E. Ottesen (1992), A method for studying prosody in texts read aloud, ICSLP
Vincent J. van Heuven (1992), Linguistic versus phonetic explanation of consonant lengthening after short vowels: a contrastive study of Dutch and English, ICSLP
Kjell Elenius, Mats Blomberg (1992), Comparing phoneme and feature based speech recognition using artificial neural networks, ICSLP
Eva Strangert (1992), Prosodic cues to the perception of syntactic boundaries, ICSLP
Paul Taylor, Stephen Isard (1992), A new model of intonation for use with speech synthesis and recognition, ICSLP
Rudolf Weiss (1992), Computerized error detection/correction in teaching German sounds: some problems and solutions, ICSLP
Ahmed M. Elgendy (1992), Velum and epiglottis behavior during the production of Arabic pharyngeals and laryngeals: a fiberscopic study, ICSLP
Kim Silverman, Eleonora Blaauw, Judith Spitz, John F. Pitrelli (1992), A prosodic comparison of spontaneous speech and read speech, ICSLP
John J. Ohala, Maria Grazia Busa, Karen Harrison (1992), Phonological and psychological evidence that listeners normalize the speech signal, ICSLP
Elizabeth A. Hinkelman (1992), Intonation and the request/question distinction, ICSLP
Robert F. Port, Fred Cummins (1992), The English voicing contrast as velocity perturbation, ICSLP
Michael S. Ziolkowski, Mayumi Usami, Karen L. Landahl, Brenda K. Tunnock (1992), How many phonologies are there in one speaker? some experimental evidence, ICSLP
Hirokazu Sato (1992), Decomposition into syllable complexes and the accenting of Japanese loanwords, ICSLP
Jianfen Cao (1992), Temporal structure in bisyllabic word frame: an evidence for relational invariance and variability from standard Chinese, ICSLP
Shih-ping Wang (1992), The integration of phonetics and phonology: a case study of Taiwanese "gemination" and syllable structure, ICSLP
James H. Bradford (1992), Towards a robust speech interface for teleoperation systems, ICSLP
Piero Cosi, P. Frasconi, M. Gori, N. Griggio (1992), Phonetic recognition experiments with recurrent neural networks, ICSLP
Mikael Goldstein, Björn Lindström, Ove Till (1992), Some aspects on context and response range effects when assessing naturalness of Swedish sentences generated by 4 synthesiser systems, ICSLP
Marcello Pelillo, Franca Moro, Mario Refice (1992), Probabilistic prediction of parts-of-speech from word spelling using decision trees, ICSLP
D. Barschdorff, U. Gartner (1992), Single word detection system with a neural classifier for recognizing speech at variable levels of background noise, ICSLP
Sharon Oviatt, Philip Cohen, Martin Fong, Michael Frank (1992), A rapid semi-automatic simulation technique for investigating interactive speech and handwriting, ICSLP
Sang-Hwa Chung, Dan Moldovan (1992), Speech understanding on a massively parallel computer, ICSLP
Chan-Do Lee (1992), Rationale for "performance phonology", ICSLP
Takuya Koizumi, Jyoji Urata, Shuji Taniguchi (1992), The effect of information feedback on the performance of a phoneme recognizer using Kohonen map, ICSLP
Yasuharu Asano, Keikichi Hirose, Hiroya Fujisaki (1992), A method of dialogue management for the speech response system, ICSLP
Yumi Takizawa, Eiichi Tsuboka (1992), Syllable duration prediction for speech recognition, ICSLP
F. Canavesio, G. Castagneri, G. Di Fabbrizio, F. Senia (1992), Comparison between two methodologies of testing isolated word speech recognizers, ICSLP
He Jun, Henri Leich (1992), Extracting fuzzy features from MLP for recognition of speech, ICSLP
Keiji Fukuzawa, Yoshinaga Kato, Masahide Sugiyama (1992), A fuzzy partition model (FPM) neural network architecture for speaker-independent continuous speech recognition, ICSLP
A. Ennaji, Jean Rouat (1992), Conception of speech filters based on a neural network, ICSLP
Jeff Kuo, Chin-Hui Lee, Aaron E. Rosenberg (1992), Speaker set identification through speaker group modeling, ICSLP
Stephen Springer, Sara Basson, Judith Spitz (1992), Identification of principal ergonomic requirements for interactive spoken language systems, ICSLP
Thomas E. Jacobs, Eric R. Buhrke (1992), Performance of the United Kingdom intelligent network automatic speech recognition system, ICSLP
Guy Deville, Pierre Mousel (1992), Evaluation of parsing strategies in natural language spoken man-machine dialogue, ICSLP
Yasuhisa Niimi, Yutaka Kobayashi (1992), An information retrieval system with a speech interface, ICSLP
J. P. Eatock, J. S. D. Mason (1992), Phoneme performance in speaker recognition, ICSLP
Evelyne Tzoukermann, Roberto Pieraccini, Zakhar Gorelov (1992), Natural language processing in the chronus system, ICSLP
Dominique Francois, Dominique Fohr (1992), Contribution of neural networks for phoneme identification in the APHODEX expert system, ICSLP
Douglas B. Paul (1992), A CSR-NL interface architecture, ICSLP
R. Lefebvre, F. Poirier, G. Duncan (1992), Speech interface for a man-machine dialog with the UNIX operating system, ICSLP
P. Bardaud, F. Capman, C. Mokbel, C. Tadj, Gérard Chollet (1992), Transformation of databases for the evaluation of speech recognizers, ICSLP
Yoichi Yamashita, Riichiro Mizoguchi (1992), Dialog management for speech output from concept representation, ICSLP
Seiichiro Hangai, Shigetoshi Sugiyama, Kazuhiro Miyauchi (1992), Speaker verification using locations and sizes of multipulses on neural networks, ICSLP
Carlos J. Teixeira, Isabel M. Trancoso (1992), Word rejection using multiple sink models, ICSLP
Boerge Lindberg (1992), Verification of language specific performance factors from recogniser testing on EUROM.1 CVC material, ICSLP
Alain Cozannet (1992), Modeling task driven oral dialogue, ICSLP
Wei-ying Li, Kechu Yi, Zheng Hu (1992), Introducing neural predictor to hidden Markov model for speech recognition, ICSLP
Feng Liu, Jianxin Jiang, Jun Cheng, Kechu Yi (1992), A neural network based on subnets - SNN, ICSLP
Ute Ziegenhain (1992), Syntactic anaphora resolution in a speech understanding system, ICSLP
Marion Mast, Ralf Kompe, Franz Kummert, Heinrich Niemann, Elmar Nöth (1992), The dialog module of the speech recognition and dialog system EVAR, ICSLP
Yan Ming Cheng, Douglas O'Shaughnessy, Paul Mermelstein (1992), Statistical recovery of wideband speech from narrowband speech, ICSLP
Henk van den Heuvel, Toni Rietveld (1992), Speaker related variability in cepstral representations of Dutch speech segments, ICSLP
Per Rosenbeck, Bo Baungaard (1992), Experiences from a real-world telephone application: teledialogue, ICSLP
K. Y. Lee, P. Ha, J. Rheem, S. Ann, I. Song (1992), Robust estimation of time-varying LP parameters on speech, ICSLP
Javier Hernando, Climent Nadeu, Eduardo Lleida (1992), On the AR modelling of the one-sided autocorrelation sequence for noisy speech recognition, ICSLP
Hiroshi Shimodaira, Mitsuru Nakai (1992), Robust pitch detection by narrow band spectrum analysis, ICSLP
S. Eady, B. Craig Dickson, Roy C. Snell, J. Woolsey, P. Ollek, A. Wynrib, J. Clayards (1992), A microcomputer-based system for real-time analysis and display of laryngograph signals, ICSLP
N. M. Veilleux, Mari Ostendorf, Colin Wightman (1992), Parse scoring with prosodic information, ICSLP
Ying Cheng, Paul Fortier, Yves Normandin (1992), Topic identification using a neural network with a keyword-spotting preprocessor, ICSLP
Shane Switzer, Tim Anderson, Matthew Kabrisky, Steven K. Rogers, Bruce Suter (1992), Frequency domain speech coding, ICSLP
Raymond Descout, Robert Bergeron, Bernard Merialdo (1992), MEDIATEX-TASF: a closed captioning real-time service in French, ICSLP
S. A. Wilde, K. M. Curtis (1992), The wavelet transform for speech analysis, ICSLP
Pablo Aibar, Andres Marzal, Enrique Vidal, Francisco Casacuberta (1992), Problems and algorithms in optimal linguistic decoding: a unified formulation, ICSLP
Jean Rouat, Sylvain Lemieux, Alain Migneault (1992), A spectro-temporal analysis of speech based on nonlinear operators, ICSLP
Miguel A. Berrojo, Javier Corrales, Jesus Macias, Santiago Aguilera (1992), A PC graphic tool for speech research based on a DSP board, ICSLP
Satoru Hayamizu, Katunobu Itou, Masafumi Tamoto, Kazuyo Tanaka (1992), A spoken language dialogue system for automatic collection of spontaneous speech, ICSLP
Shingo Nishioka, Yoichi Yamashita, Riichiro Mizoguchi (1992), A powerful disambiguating mechanism for speech understanding systems based on ATMS, ICSLP
Najib Naja, Jean Marc Boucher, Samir Saoudi (1992), A mixed Gaussian-stochastic code book for CELP coder in LSP speech coding, ICSLP
Hiroyuki Kamata, Yoshihisa Ishida (1992), A method to estimate the transfer function of ARMA model of speech wave using Prony method and homomorphic analysis, ICSLP
Boerge Lindberg, Bjarne Andersen, Anders Baekgaard, Tom Broendsted, Paul Dalsgaard, Jan Kristiansen (1992), An integrated dialogue design and continuous speech recognition system environment, ICSLP
Alain Marchal, C. Meunier, P. Gavarry (1992), The PSH/DISPE helium speech CD-ROM, ICSLP
John J. Ohala (2015), A brief history of experimental phonetics in the 18th and 19th centuries, HSCR
Walter Schmitz (2015), The power of communication. Apps as human substitutes in science-fiction films, HSCR
Rüdiger Hoffmann, Dieter Mehnert (2015), Recent development of the historic acoustic-phonetic collection of the TU Dresden, HSCR
Massimo Pettorino (2015), The history of talking heads: the trick and the research, HSCR
Fabian Brackhane (2015), Kempelen vs. Kratzenstein – researchers on speech synthesis in times of change, HSCR
Silke Berdux (2015), „eine kempelensche sprechmaschine“. new insights in speaking machines in the late 18th and early 19th centuries, HSCR
Christian Korpiun (2015), Kratzenstein’s vowel resonators – reflections on a revival, HSCR
Rüdiger Hoffmann (2015), Voices for toys – first commercial spin-offs in speech synthesis, HSCR
Didier Demolin (2015), The contribution of the kymograph to the description of African languages, HSCR
Mária Gósy (2015), A 75-year-old Hungarian spontaneous speech database, HSCR
Dieter Studer-Joho (2015), The early Swiss dialect recording collection “LA” (1924–1927): a description and a work plan for its comprehensive edition, HSCR
Pavel Šturm (2015), The Prague historical collection of tuning forks: a surviving replica of the Koenig tonometre, HSCR
Angelika Braun (2015), William Holder – a pioneer of phonetics, HSCR
Michael Ashby (2015), Experimental phonetics at University College London before World War I, HSCR
Hans G. Tillmann, Jessica Siddins (2015), The "Bonn connection" and its consequences: Paul Menzerath and Werner Meyer-Eppler's reunification of phonetics and phonology and the emergence of a new phonetic speech science based on Shannon’s mathematical theory of communication, HSCR
Reijo Aulanko (2015), Hugo Pipping – a pioneer phonetician at the University of Helsinki, HSCR
Coriandre Vilain, Frédéric Berthommier, Louis-Jean Boë (2015), A brief history of articulatory-acoustic vowel representation, HSCR
Jürgen Trouvain (2015), Notes on the development of speaking styles over decades – the case of live football commentaries, HSCR
Herbert H. Clark (1999), Speaking in time, DiaPro
Julia Hirschberg (1999), Communication and prosody: Functional aspects of prosody, DiaPro
Stephen Pulman (1999), Relating dialogue games to information state, DiaPro
Elmar Nöth, Anton Batliner, Volker Warnke, J. Haas, M. Boros, J. Buckow, R. Huber, F. Gallwitz, Matthias Nutt, Heinrich Niemann (1999), On the use of prosody in automatic dialogue understanding, DiaPro
Vincent J. van Heuven, Judith Haan, Robert S. Kirsner (1999), Phonetic correlates of sentence type in Dutch: Statement, question and command, DiaPro
Roddy Cowie, Ellen Douglas-Cowie, A. Romano (1999), Changing emotional tone in dialogue and its prosodic correlates, DiaPro
Petra Wagner, Thomas Portele (1999), Two dimensions of prominence, DiaPro
Pierre Larrey (1999), A framework to allow dialogue systems to generate context-sensitive, DiaPro
Hiromichi Kawanami, Keikichi Hirose (1999), Speech rate control for dialogue speech synthesis based on the prosodic structures, DiaPro
Jaakko Hakulinen, Markku Turunen, Kari-Jouko Räihä (1999), The use of prosodic features to help users extract information from structured elements in spoken dialogue systems, DiaPro
Merle Horne, Petra Hansson, Gösta Bruce, Johan Frid, Arne Jönsson (1999), Accentuation of domain-related information in Swedish dialogues, DiaPro
Johanneke Caspers (1999), The meaning of melodic elements in Dutch, DiaPro
Nigel Ward (1999), Low-pitch regions as dialogue signals? Evidence from dialog-act and lexical correlates in natural conversation, DiaPro
Helen Wright, Massimo Poesio, Stephen Isard (1999), Using high level dialogue information for dialogue act recognition using prosodic features, DiaPro
Kurt Dusterhoff (1999), Automatic intonation analysis using acoustic data, DiaPro
Matthias Nutt, Anton Batliner, Volker Warnke, Elmar Nöth (1999), Using phrase accent information for dialog act recognition in spontaneous German speech, DiaPro
Jan Buckow, Richard Huber, Volker Warnke, Anton Batliner, Elmar Nöth, Heinrich Niemann (1999), Multi-lingual prosodic processing, DiaPro
F. Gallwitz, Heinrich Niemann, Elmar Nöth, Volker Warnke (1999), Prosodic information for integrated word-and-boundary recognition, DiaPro
Emiel Krahmer, Marc Swerts, Mariet Theune, Mieke Weegels (1999), Prosodic correlates of disconfirmations, DiaPro
Greg Aist, Jack Mostow (1999), Measuring the effects of backchanneling in computerized oral reading tutoring, DiaPro
Hannes Pirker, Georg Loderer (1999), I said "TWO TI-CKETS": How to talk to a deaf wizard, DiaPro
Atsushi Shimojima, Yasuhiro Katagiri, Hanae Koiso, Marc Swerts (1999), An experimental study on the informational and grounding functions of prosodic features of Japanese echoic responses, DiaPro
Gina-Anne Levow (1999), Understanding recognition failures in spoken corrections in human-computer dialogue, DiaPro
Ivan Kopecek (1999), Syllable-based approach to automatic prosody detection: Applications for dialogue systems, DiaPro
Gayle Ayers Elam, Sarah C. Wayland (1999), Prosody and prompt design in a computer dialog system, DiaPro
Petra Hansson (1999), Prosodic correlates of discourse markers in dialogue, DiaPro
Anne Kuosmanen (1999), On the relationship between the melodical structure and discourse functions of the particles NU and VOT in spontaneous Russian, DiaPro
Toni Rietveld, Carlos Gussenhoven, Anne Wichmann, Esther Grabe (1999), The communicative effects of rising and falling pitch accents in British English and Dutch, DiaPro
Christian Lachaud, Geneviève Caelen-Haumont, Joël Pynte, Robert Espesser (1999), The role of prosodic cues in ASR, expert knowledge and human perception: A comparison of performance for French word recognition, DiaPro
Kerstin Fischer (1999), Discourse effects on the prosodic properties of repetitions in human-computer interaction, DiaPro
Rupal Patel (1999), Prosody conveys information in severely impaired speech, DiaPro
Susanne Jekat (1999), Prosodic cues as basis for restructuring, DiaPro
Sadaoki Furui (2011), Data-intensive approaches for ASR, IWSLT
Daniel Marcu (2011), Meaning-equivalent semantics for understanding, generation, translation, and evaluation, IWSLT
Junichi Tsujii (2011), Resource-rich research on natural language processing and understanding, IWSLT
Marcello Federico, Luisa Bentivogli, Michael Paul, Sebastian Stüker (2011), Overview of the IWSLT 2011 evaluation campaign, IWSLT
Kazuhiko Abe, Youzheng Wu, Chien-lin Huang, Paul R. Dixon, Shigeki Matsuda, Chiori Hori, Hideki Kashioka (2011), The NICT ASR system for IWSLT2011, IWSLT
A. Ryan Aminzadeh, Tim Anderson, Ray Slyh, Brian Ore, Eric Hansen, Wade Shen, Jennifer Drexler, Terry Gleason (2011), The MIT-LL/AFRL IWSLT-2011 MT system, IWSLT
Pratyush Banerjee, Hala Almaghout, Sudip Naskar, Johann Roturier, Jie Jiang, Andy Way, Josef van Genabith (2011), The DCU machine translation systems for IWSLT 2011, IWSLT
Andrew Finch, Chooi-Ling Goh, Graham Neubig, Eiichiro Sumita (2011), The NICT translation system for IWSLT 2011, IWSLT
Xiaodong He, Amittai Axelrod, Li Deng, Alex Acero, Mei-Yuh Hwang, Alisa Nguyen, Andrew Wang, Xiahui Huang (2011), The MSR system for IWSLT 2011 evaluation, IWSLT
Thomas Lavergne, Alexandre Allauzen, Hai-Son Le, François Yvon (2011), LIMSI's experiments in domain adaptation for IWSLT11, IWSLT
Benjamin Lecouteux, Laurent Besacier, Hervé Blanchon (2011), LIG English-French spoken language translation system for IWSLT 2011, IWSLT
Mohammed Mediani, Eunah Cho, Jan Niehues, Teresa Herrmann, Alex Waibel (2011), The KIT English-French translation systems for IWSLT 2011, IWSLT
Anthony Rousseau, Fethi Bougares, Paul Deléglise, Holger Schwenk, Yannick Estève (2011), LIUM's systems for the IWSLT 2011 speech translation tasks, IWSLT
Nick Ruiz, Arianna Bisazza, F. Brugnara, D. Falavigna, D. Giuliani, S. Jaber, R. Gretter, Marcello Federico (2011), FBK @ IWSLT 2011, IWSLT
Sebastian Stüker, Kevin Kilgour, Christian Saam, Alex Waibel (2011), The 2011 KIT English ASR system for the IWSLT evaluation, IWSLT
David Vilar, Eleftherios Avramidis, Maja Popović, Sabine Hunsicker (2011), DFKI's SC and MT submissions to IWSLT 2011, IWSLT
Joern Wuebker, Matthias Huck, Saab Mansour, Markus Freitag, Minwei Feng, Stephan Peitz, Christoph Schmidt, Hermann Ney (2011), The RWTH Aachen machine translation system for IWSLT 2011, IWSLT
Karim Boudahmane, Bianka Buschbeck, Eunah Cho, Josep Maria Crego, Markus Freitag, Thomas Lavergne, Hermann Ney, Jan Niehues, Stephan Peitz, Jean Senellart, Artem Sokolov, Alex Waibel, Tonio Wandmacher, Joern Wuebker, François Yvon (2011), Advances on spoken language translation in the Quaero program, IWSLT
Lori Lamel, Sandrine Courcinous, Julien Despres, Jean-Luc Gauvain, Yvan Josse, Kevin Kilgour, Florian Kraft, Viet Bac Le, Hermann Ney, Markus Nußbaum-Thom, Ilya Oparin, Tim Schlippe, Ralf Schlüter, Tanja Schultz, Thiago Fraga da Silva, Sebastian Stüker, Martin Sundermeyer, Bianca Vieru, Ngoc Thang Vu, Alex Waibel, Cécile Woehrling (2011), Speech recognition for machine translation in Quaero, IWSLT
Victoria Arranz, Olivier Hamon, Karim Boudahmane, Martine Garnier-Rizet (2011), Protocol and lessons learnt from the production of parallel corpora for the evaluation of speech translation systems, IWSLT
Arianna Bisazza, Nick Ruiz, Marcello Federico (2011), Fill-up versus interpolation methods for phrase-based SMT adaptation, IWSLT
Boxing Chen, Roland Kuhn, George Foster (2011), Semantic smoothing and fabrication of phrase pairs for SMT, IWSLT
Tagyoung Chung, Licheng Fang, Daniel Gildea (2011), SCFG latent annotation for machine translation, IWSLT
Chenchen Ding, Takashi Inui, Mikio Yamamoto (2011), Long-distance hierarchical structure transformation rules utilizing function words, IWSLT
Paul R. Dixon, Andrew Finch, Chiori Hori, Hideki Kashioka (2011), Investigation on the effects of ASR tuning on speech translation performance, IWSLT
Mridul Gupta, Sanjika Hewavitharana, Stephan Vogel (2011), Extending a probabilistic phrase alignment approach for SMT, IWSLT
Kenneth Heafield, Hieu Hoang, Philipp Koehn, Tetsuo Kiso, Marcello Federico (2011), Left language model state for syntactic machine translation, IWSLT
Matthias Huck, Saab Mansour, Simon Wiesler, Hermann Ney (2011), Lexicon models for hierarchical phrase-based machine translation, IWSLT
Kevin Kilgour, Christian Saam, Christian Mohr, Sebastian Stüker, Alex Waibel (2011), The 2011 KIT QUAERO speech-to-text system for Spanish, IWSLT
Wang Ling, Pável Calado, Bruno Martins, Isabel Trancoso, Alan Black, Luísa Coheur (2011), Named entity translation using anchor texts, IWSLT
Paul Maergner, Kevin Kilgour, Ian Lane, Alex Waibel (2011), Unsupervised vocabulary selection for simultaneous lecture translation, IWSLT
Saab Mansour, Joern Wuebker, Hermann Ney (2011), Combining translation and language model scoring for domain-specific data filtering, IWSLT
Jan Niehues, Alex Waibel (2011), Using Wikipedia to translate domain-specific terms in SMT, IWSLT
Stephan Peitz, Markus Freitag, Arne Mauser, Hermann Ney (2011), Modeling punctuation prediction as machine translation, IWSLT
Jan-Thorsten Peter, Matthias Huck, Hermann Ney, Daniel Stein (2011), Soft string-to-dependency hierarchical machine translation, IWSLT
Anne H. Schneider, Saturnino Luz (2011), Speaker alignment in synthesised, machine translated communication, IWSLT
Nadi Tomeh, Marco Turchi, Guillaume Wisniewski, Alexandre Allauzen, François Yvon (2011), How good are your phrases? Assessing phrase quality with single class classification, IWSLT
Keiji Yasuda, Hideo Okuma, Masao Utiyama, Eiichiro Sumita (2011), Annotating data selection for improving machine translation, IWSLT
Sanjoy Dasgupta (2011), Recent advances in active learning, MLSLP
Lawrence Saul, Chih-Chieh Cheng, Fei Sha (2011), Online learning of large margin hidden Markov models for automatic speech recognition, MLSLP
Eduard Hovy (2011), On the role of machine learning in NLP, MLSLP
George Saon, Jen-Tzung Chien (2011), Bayesian sensing hidden Markov models for speech recognition, MLSLP
Jason Eisner, Markus Dreyer (2011), A non-parametric Bayesian approach to inflectional morphology, MLSLP
Mark Hasegawa-Johnson, Jui-Ting Huang, Xiaodan Zhuang (2011), Unlabeled data and other marginals, MLSLP
Ming-Wei Chang, James Clarke, Dan Goldwasser, Lev Ratinov, Vivek Srikumar, Dan Roth (2011), Structured prediction with indirect supervision, MLSLP
Jeff Bilmes, Hui Lin, Andrew Guillory (2011), Applications of submodular functions in speech and NLP, MLSLP
David McAllester (2011), Generalization bounds and consistency for latent-structural probit and ramp loss, MLSLP
Mark Steedman (2011), Some open problems in machine learning for NLP, MLSLP
Stanley Chen, Stephen Chu, Ahmad Emami, Lidia Mangu, Bhuvana Ramabhadran, Ruhi Sarikaya, Abhinav Sethy (2011), Performance prediction and shrinking language models, MLSLP
Yoshua Bengio (2011), On learning distributed representations of semantics, MLSLP
Robert Moore, John DeNero (2011), L1 and L2 regularization for multiclass hinge loss models, MLSLP
Shirin Badiezadegan, Richard Rose (2011), A comparison of performance monitoring approaches to fusing spectrogram channels in speech recognition, MLSLP
Meng Sun, Hugo Van hamme (2011), A two-layer non-negative matrix factorization model for vocabulary discovery, MLSLP
Deryle Lonsdale, Carl Christensen (2011), Automating the scoring of elicited imitation tests, MLSLP
Rushin Shah, Bo Lin, Kevin Dela Rosa, Anatole Gershman, Robert Frederking (2011), Improving cross-document co-reference with semi-supervised information extraction models, MLSLP
David Chen, Raymond Mooney (2011), Panning for gold: finding relevant semantic content for grounded language learning, MLSLP
Hynek Hermansky, Nima Mesgarani, Samuel Thomas (2011), Performance monitoring for robustness in automatic recognition of speech, MLSLP
Michael Pucher, Thomas Woltron (2021), Conversion of Airborne to Bone-Conducted Speech with Deep Neural Networks, Interspeech
Markéta Řezáčková, Jan Švec, Daniel Tihelka (2021), T5G2P: Using Text-to-Text Transfer Transformer for Grapheme-to-Phoneme Conversion, Interspeech
Olivier Perrotin, Hussein El Amouri, Gérard Bailly, Thomas Hueber (2021), Evaluating the Extrapolation Capabilities of Neural Vocoders to Extreme Pitch Values, Interspeech
Phat Do, Matt Coler, Jelske Dijkstra, Esther Klabbers (2021), A Systematic Review and Analysis of Multilingual Data Strategies in Text-to-Speech for Low-Resource Languages, Interspeech
Tanya Talkar, Nancy Pearl Solomon, Douglas S. Brungart, Stefanie E. Kuchinsky, Megan M. Eitel, Sara M. Lippa, Tracey A. Brickell, Louis M. French, Rael T. Lange, Thomas F. Quatieri (2021), Acoustic Indicators of Speech Motor Coordination in Adults With and Without Traumatic Brain Injury, Interspeech
J.C. Vásquez-Correa, Julian Fritsch, J.R. Orozco-Arroyave, Elmar Nöth, Mathew Magimai-Doss (2021), On Modeling Glottal Source Information for Phonation Assessment in Parkinson’s Disease, Interspeech
Khalid Daoudi, Biswajit Das, Solange Milhé de Saint Victor, Alexandra Foubert-Samier, Anne Pavy-Le Traon, Olivier Rascol, Wassilios G. Meissner, Virginie Woisard (2021), Distortion of Voiced Obstruents for Differential Diagnosis Between Parkinson’s Disease and Multiple System Atrophy, Interspeech
Pu Wang, Bagher BabaAli, Hugo Van hamme (2021), A Study into Pre-Training Strategies for Spoken Language Understanding on Dysarthric Speech, Interspeech
Rosanna Turrisi, Arianna Braccia, Marco Emanuele, Simone Giulietti, Maura Pugliatti, Mariachiara Sensi, Luciano Fadiga, Leonardo Badino (2021), EasyCall Corpus: A Dysarthric Speech Dataset, Interspeech
Xiaoyu Bie, Laurent Girin, Simon Leglaive, Thomas Hueber, Xavier Alameda-Pineda (2021), A Benchmark of Dynamical Variational Autoencoders Applied to Speech Spectrogram Modeling, Interspeech
Metehan Yurt, Pavan Kantharaju, Sascha Disch, Andreas Niedermeier, Alberto N. Escalante-B, Veniamin I. Morgenshtern (2021), Fricative Phoneme Detection Using Deep Neural Networks and its Comparison to Traditional Methods, Interspeech
RaviShankar Prasad, Mathew Magimai-Doss (2021), Identification of F1 and F2 in Speech Using Modified Zero Frequency Filtering, Interspeech
Yann Teytaut, Axel Roebel (2021), Phoneme-to-Audio Alignment with Recurrent Neural Networks for Speaking and Singing Voice, Interspeech
Seong-Hu Kim, Yong-Hwa Park (2021), Adaptive Convolutional Neural Network for Text-Independent Speaker Recognition, Interspeech
Jiajun Qi, Wu Guo, Bin Gu (2021), Bidirectional Multiscale Feature Aggregation for Speaker Verification, Interspeech
Yu-Jia Zhang, Yih-Wen Wang, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan (2021), Improving Time Delay Neural Network Based Speaker Recognition with Convolutional Block and Feature Aggregation Methods, Interspeech
Yanfeng Wu, Junan Zhao, Chenkai Guo, Jing Xu (2021), Improving Deep CNN Architectures with Variable-Length Training Samples for Text-Independent Speaker Verification, Interspeech
Tinglong Zhu, Xiaoyi Qin, Ming Li (2021), Binary Neural Network for Speaker Verification, Interspeech
Youzhi Tu, Man-Wai Mak (2021), Mutual Information Enhanced Training for Speaker Embedding, Interspeech
Ge Zhu, Fei Jiang, Zhiyao Duan (2021), Y-Vector: Multiscale Waveform Encoder for Speaker Embedding, Interspeech
Yan Liu, Zheng Li, Lin Li, Qingyang Hong (2021), Phoneme-Aware and Channel-Wise Attentive Learning for Text Dependent Speaker Verification, Interspeech
Hongning Zhu, Kong Aik Lee, Haizhou Li (2021), Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding, Interspeech
Cheng Gong, Longbiao Wang, Ju Zhang, Shaotong Guo, Yuguang Wang, Jianwu Dang (2021), TacoLPCNet: Fast and Stable TTS by Conditioning LPCNet on Mel Spectrogram Predictions, Interspeech
Taejun Bak, Jae-Sung Bae, Hanbin Bae, Young-Ik Kim, Hoon-Young Cho (2021), FastPitchFormant: Source-Filter Based Decomposed Modeling for Speech Synthesis, Interspeech
Taiki Nakamura, Tomoki Koriyama, Hiroshi Saruwatari (2021), Sequence-to-Sequence Learning for Deep Gaussian Process Based Speech Synthesis Using Self-Attention GP Layer, Interspeech
Naoto Kakegawa, Sunao Hara, Masanobu Abe, Yusuke Ijima (2021), Phonetic and Prosodic Information Estimation from Texts for Genuine Japanese End-to-End Text-to-Speech, Interspeech
Xudong Dai, Cheng Gong, Longbiao Wang, Kaili Zhang (2021), Information Sieve: Content Leakage Reduction in End-to-End Prosody Transfer for Expressive Speech Synthesis, Interspeech
Qingyun Dou, Xixin Wu, Moquan Wan, Yiting Lu, Mark J.F. Gales (2021), Deliberation-Based Multi-Pass Speech Synthesis, Interspeech
Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, R.J. Skerry-Ryan, Yonghui Wu (2021), Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling, Interspeech
Chunyang Wu, Zhiping Xiu, Yangyang Shi, Ozlem Kalinli, Christian Fuegen, Thilo Koehler, Qing He (2021), Transformer-Based Acoustic Modeling for Streaming Speech Synthesis, Interspeech
Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu (2021), PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS, Interspeech
Zhenhao Ge, Lakshmish Kaushik, Masanori Omote, Saket Kumar (2021), Speed up Training with Variable Length Inputs by Efficient Batching Strategies, Interspeech
Yuhang Sun, Linju Yang, Huifeng Zhu, Jie Hao (2021), Funnel Deep Complex U-Net for Phase-Aware Speech Enhancement, Interspeech
Qiquan Zhang, Qi Song, Aaron Nicolson, Tian Lan, Haizhou Li (2021), Temporal Convolutional Network with Frequency Dimension Adaptive Attention for Speech Enhancement, Interspeech
Changjie Pan, Feng Yang, Fei Chen (2021), Perceptual Contributions of Vowels and Consonant-Vowel Transitions in Understanding Time-Compressed Mandarin Sentences, Interspeech
Ritujoy Biswas, Karan Nathwani, Vinayak Abrol (2021), Transfer Learning for Speech Intelligibility Improvement in Noisy Environments, Interspeech
Ayako Yamamoto, Toshio Irino, Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani (2021), Comparison of Remote Experiments Using Crowdsourcing and Laboratory Experiments on Speech Intelligibility, Interspeech
Wenzhe Liu, Andong Li, Yuxuan Ke, Chengshi Zheng, Xiaodong Li (2021), Know Your Enemy, Know Yourself: A Unified Two-Stage Framework for Speech Enhancement, Interspeech
Qiuqiang Kong, Haohe Liu, Xingjian Du, Li Chen, Rui Xia, Yuxuan Wang (2021), Speech Enhancement with Weakly Labelled Data from AudioSet, Interspeech
Tsun-An Hsieh, Cheng Yu, Szu-Wei Fu, Xugang Lu, Yu Tsao (2021), Improving Perceptual Quality by Phone-Fortified Perceptual Loss Using Wasserstein Distance for Speech Enhancement, Interspeech
Szu-Wei Fu, Cheng Yu, Tsun-An Hsieh, Peter Plantinga, Mirco Ravanelli, Xugang Lu, Yu Tsao (2021), MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement, Interspeech
Amin Edraki, Wai-Yip Chan, Jesper Jensen, Daniel Fogerty (2021), A Spectro-Temporal Glimpsing Index (STGI) for Speech Intelligibility Prediction, Interspeech
Yuanhang Qiu, Ruili Wang, Satwinder Singh, Zhizhong Ma, Feng Hou (2021), Self-Supervised Learning Based Phone-Fortified Speech Enhancement, Interspeech
Khandokar Md. Nayem, Donald S. Williamson (2021), Incorporating Embedding Vectors from a Human Mean-Opinion Score Prediction Model for Monaural Speech Enhancement, Interspeech
Jianwei Zhang, Suren Jayasuriya, Visar Berisha (2021), Restoring Degraded Speech via a Modified Diffusion Model, Interspeech
Hoang Long Nguyen, Vincent Renkens, Joris Pelemans, Srividya Pranavi Potharaju, Anil Kumar Nalamalapu, Murat Akbacak (2021), User-Initiated Repetition-Based Recovery in Multi-Utterance Dialogue Systems, Interspeech
Nuo Chen, Chenyu You, Yuexian Zou (2021), Self-Supervised Dialogue Learning for Spoken Conversational Question Answering, Interspeech
Ruolin Su, Ting-Wei Wu, Biing-Hwang Juang (2021), Act-Aware Slot-Value Predicting in Multi-Domain Dialogue State Tracking, Interspeech
Yuya Chiba, Ryuichiro Higashinaka (2021), Dialogue Situation Recognition for Everyday Conversation Using Multimodal Information, Interspeech
Yoshihiro Yamazaki, Yuya Chiba, Takashi Nose, Akinori Ito (2021), Neural Spoken-Response Generation Using Prosodic and Linguistic Context for Conversational Systems, Interspeech
Weiyuan Xu, Peilin Zhou, Chenyu You, Yuexian Zou (2021), Semantic Transportation Prototypical Network for Few-Shot Intent Detection, Interspeech
Li Tang, Yuke Si, Longbiao Wang, Jianwu Dang (2021), Domain-Specific Multi-Agent Dialog Policy Learning in Multi-Domain Task-Oriented Scenarios, Interspeech
Haoyu Wang, John Chen, Majid Laali, Kevin Durda, Jeff King, William Campbell, Yang Liu (2021), Leveraging ASR N-Best in Deep Entity Retrieval, Interspeech
Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Ye Bai, Jianhua Tao, Xuefei Liu, Zhengqi Wen (2021), End-to-End Spelling Correction Conditioned on Acoustic Feature for Code-Switching Speech Recognition, Interspeech
Kathleen Siminyu, Xinjian Li, Antonios Anastasopoulos, David R. Mortensen, Michael R. Marlo, Graham Neubig (2021), Phoneme Recognition Through Fine Tuning of Phonetic Representations: A Case Study on Luhya Language Varieties, Interspeech
Erfan Loweimi, Zoran Cvetkovic, Peter Bell, Steve Renals (2021), Speech Acoustic Modelling Using Raw Source and Filter Components, Interspeech
Masakiyo Fujimoto, Hisashi Kawai (2021), Noise Robust Acoustic Modeling for Single-Channel Speech Recognition Based on a Stream-Wise Transformer Architecture, Interspeech
Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha (2021), IR-GAN: Room Impulse Response Generator for Far-Field Speech Recognition, Interspeech
Junqi Chen, Xiao-Lei Zhang (2021), Scaling Sparsemax Based Channel Selection for Speech Recognition with ad-hoc Microphone Arrays, Interspeech
Feng-Ju Chang, Martin Radfar, Athanasios Mouchtaris, Maurizio Omologo (2021), Multi-Channel Transformer Transducer for Speech Recognition, Interspeech
Emiru Tsunoo, Kentaro Shibata, Chaitanya Narisetty, Yosuke Kashiwagi, Shinji Watanabe (2021), Data Augmentation Methods for End-to-End Speech Recognition on Distant-Talk Scenarios, Interspeech
Guodong Ma, Pengfei Hu, Jian Kang, Shen Huang, Hao Huang (2021), Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition, Interspeech
Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Paden Tomasello, Jacob Kahn, Gilad Avidov, Ronan Collobert, Gabriel Synnaeve (2021), Rethinking Evaluation in ASR: Are Our Models Robust Enough?, Interspeech
Max W.Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu (2021), Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition, Interspeech
Yuanbo Hou, Zhesong Yu, Xia Liang, Xingjian Du, Bilei Zhu, Zejun Ma, Dick Botteldooren (2021), Attention-Based Cross-Modal Fusion for Audio-Visual Voice Activity Detection in Musical Video Streams, Interspeech
Ui-Hyun Kim (2021), Noise-Tolerant Self-Supervised Learning for Audio-Visual Voice Activity Detection, Interspeech
Hyun-Jin Park, Pai Zhu, Ignacio Lopez Moreno, Niranjan Subrahmanya (2021), Noisy Student-Teacher Training for Robust Keyword Spotting, Interspeech
Osamu Ichikawa, Kaito Nakano, Takahiro Nakayama, Hajime Shirouzu (2021), Multi-Channel VAD for Transcription of Group Discussion, Interspeech
Hengshun Zhou, Jun Du, Hang Chen, Zijun Jing, Shifu Xiong, Chin-Hui Lee (2021), Audio-Visual Information Fusion Using Cross-Modal Teacher-Student Learning for Voice Activity Detection in Realistic Environments, Interspeech
Naoki Makishima, Mana Ihori, Tomohiro Tanaka, Akihiko Takashima, Shota Orihashi, Ryo Masumura (2021), Enrollment-Less Training for Personalized Voice Activity Detection, Interspeech
Yuto Nonaka, Chee Siang Leow, Akio Kobayashi, Takehito Utsuro, Hiromitsu Nishizaki (2021), Voice Activity Detection for Live Speech of Baseball Game Based on Tandem Connection with Speech/Noise Separation Model, Interspeech
Young D. Kwon, Jagmohan Chauhan, Cecilia Mascolo (2021), FastICARL: Fast Incremental Classifier and Representation Learning with Efficient Budget Allocation in Audio Sensing Applications, Interspeech
Bo Wei, Meirong Yang, Tao Zhang, Xiao Tang, Xing Huang, Kyuhong Kim, Jaeyun Lee, Kiho Cho, Sung-Un Park (2021), End-to-End Transformer-Based Open-Vocabulary Keyword Spotting with Location-Guided Local Attention, Interspeech
Saurabhchand Bhati, Jesús Villalba, Piotr Żelasko, Laureano Moro-Velázquez, Najim Dehak (2021), Segmental Contrastive Predictive Coding for Unsupervised Word Segmentation, Interspeech
Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu (2021), A Lightweight Framework for Online Voice Activity Detection in the Wild, Interspeech
Aurélie Chlébowski, Nicolas Ballier (2021), “See what I mean, huh?” Evaluating Visual Inspection of F0 Tracking in Nasal Grunts, Interspeech
Bruce Xiao Wang, Vincent Hughes (2021), System Performance as a Function of Calibration Methods, Sample Size and Sampling Variability in Likelihood Ratio-Based Forensic Voice Comparison, Interspeech
Anne Bonneau (2021), Voicing Assimilations by French Speakers of German in Stop-Fricative Sequences, Interspeech
Titas Chakraborty, Vaishali Patil, Preeti Rao (2021), The Four-Way Classification of Stops with Voicing and Aspiration for Non-Native Speech Evaluation, Interspeech
Saba Urooj, Benazir Mumtaz, Sarmad Hussain, Ehsan ul Haq (2021), Acoustic and Prosodic Correlates of Emotions in Urdu Speech, Interspeech
Nour Tamim, Silke Hamann (2021), Voicing Contrasts in the Singleton Stops of Palestinian Arabic: Production and Perception, Interspeech
Thomas Coy, Vincent Hughes, Philip Harrison, Amelia J. Gully (2021), A Comparison of the Accuracy of Dissen and Keshet’s (2016) DeepFormants and Traditional LPC Methods for Semi-Automatic Speaker Recognition, Interspeech
Michael Jessen (2021), MAP Adaptation Characteristics in Forensic Long-Term Formant Analysis, Interspeech
Justin J.H. Lo (2021), Cross-Linguistic Speaker Individuality of Long-Term Formant Distributions: Phonetic and Forensic Perspectives, Interspeech
Rachel Soo, Khia A. Johnson, Molly Babel (2021), Sound Change in Spontaneous Bilingual Speech: A Corpus Study on the Cantonese n-l Merger in Cantonese-English Bilinguals, Interspeech
Wendy Lalhminghlui, Priyankoo Sarmah (2021), Characterizing Voiced and Voiceless Nasals in Mizo, Interspeech
Björn W. Schuller, Anton Batliner, Christian Bergler, Cecilia Mascolo, Jing Han, Iulia Lefter, Heysem Kaya, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Maurice Gerczuk, Panagiotis Tzirakis, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Leon J.M. Rothkrantz, Joeri A. Zwerts, Jelle Treep, Casper S. Kaandorp (2021), The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates, Interspeech
Rubén Solera-Ureña, Catarina Botelho, Francisco Teixeira, Thomas Rolland, Alberto Abad, Isabel Trancoso (2021), Transfer Learning-Based Cough Representations for Automatic Detection of COVID-19, Interspeech
P. Klumpp, T. Bocklet, T. Arias-Vergara, J.C. Vásquez-Correa, P.A. Pérez-Toro, S.P. Bayerl, J.R. Orozco-Arroyave, Elmar Nöth (2021), The Phonetic Footprint of Covid-19?, Interspeech
Edresson Casanova, Arnaldo Candido Jr., Ricardo Corso Fernandes Jr., Marcelo Finger, Lucas Rafael Stefanel Gris, Moacir Antonelli Ponti, Daniel Peixoto Pinto da Silva (2021), Transfer Learning and Data Augmentation Techniques to the COVID-19 Identification Tasks in ComParE 2021, Interspeech
Steffen Illium, Robert Müller, Andreas Sedlmeier, Claudia Linnhoff-Popien (2021), Visual Transformers for Primates Classification and Covid Detection, Interspeech
Thomas Pellegrini (2021), Deep-Learning-Based Central African Primate Species Classification with MixUp and SpecAugment, Interspeech
Robert Müller, Steffen Illium, Claudia Linnhoff-Popien (2021), A Deep and Recurrent Architecture for Primate Vocalization Classification, Interspeech
Joeri A. Zwerts, Jelle Treep, Casper S. Kaandorp, Floor Meewis, Amparo C. Koot, Heysem Kaya (2021), Introducing a Central African Primate Vocalisation Dataset for Automated Species Classification, Interspeech
Georgios Rizos, Jenna Lawson, Zhuoda Han, Duncan Butler, James Rosindell, Krystian Mikolajczyk, Cristina Banks-Leite, Björn W. Schuller (2021), Multi-Attentive Detection of the Spider Monkey Whinny in the (Actual) Wild, Interspeech
José Vicente Egas-López, Mercedes Vetráb, László Tóth, Gábor Gosztolya (2021), Identifying Conflict Escalation and Primates by Using Ensemble X-Vectors and Fisher Vector Features, Interspeech
Oxana Verkholyak, Denis Dresvyanskiy, Anastasia Dvoynikova, Denis Kotov, Elena Ryumina, Alena Velichko, Danila Mamontov, Wolfgang Minker, Alexey Karpov (2021), Ensemble-Within-Ensemble Classification for Escalation Prediction from Speech, Interspeech
Dominik Schiller, Silvan Mertes, Pol van Rijn, Elisabeth André (2021), Analysis by Synthesis: Using an Expressive TTS Model as Feature Extractor for Paralinguistic Speech Classification, Interspeech
Heidi Christensen (2021), Towards Automatic Speech Recognition for People with Atypical Speech, Interspeech
Chau Luu, Peter Bell, Steve Renals (2021), Leveraging Speaker Attribute Information Using Multi Task Learning for Speaker Verification and Diarization, Interspeech
Magdalena Rybicka, Jesús Villalba, Piotr Żelasko, Najim Dehak, Konrad Kowalczyk (2021), Spine2Net: SpineNet with Res2Net and Time-Squeeze-and-Excitation Blocks for Speaker Recognition, Interspeech
Themos Stafylakis, Johan Rohdin, Lukáš Burget (2021), Speaker Embeddings by Modeling Channel-Wise Correlations, Interspeech
Weipeng He, Petr Motlicek, Jean-Marc Odobez (2021), Multi-Task Neural Network for Robust Multiple Speaker Embedding Extraction, Interspeech
Junyi Peng, Xiaoyang Qu, Jianzong Wang, Rongzhi Gu, Jing Xiao, Lukáš Burget, Jan Černocký (2021), ICSpk: Interpretable Complex Speaker Embedding Extractor from Raw Waveform, Interspeech
Xiao Xiao, Nicolas Audibert, Grégoire Locqueville, Christophe d'Alessandro, Barbara Kuhnert, Claire Pillot-Loiseau (2021), Prosodic Disambiguation Using Chironomic Stylization of Intonation with Native and Non-Native Speakers, Interspeech
Aleese Block, Michelle Cohn, Georgia Zellou (2021), Variation in Perceptual Sensitivity and Compensation for Coarticulation Across Adult and Child Naturally-Produced and TTS Voices, Interspeech
Mohammad Jalilpour Monesi, Bernd Accou, Tom Francart, Hugo Van hamme (2021), Extracting Different Levels of Speech Information from EEG Using an LSTM-Based Model, Interspeech
Louis ten Bosch, Lou Boves (2021), Word Competition: An Entropy-Based Approach in the DIANA Model of Human Word Comprehension, Interspeech
Louis ten Bosch, Lou Boves (2021), Time-to-Event Models for Analyzing Reaction Time Sequences, Interspeech
Sophie Brand, Kimberley Mulder, Louis ten Bosch, Lou Boves (2021), Models of Reaction Times in Auditory Lexical Decision: RTonset versus RToffset, Interspeech
Gwantae Kim, David K. Han, Hanseok Ko (2021), SpecMix: A Mixed Sample Data Augmentation Method for Training with Time-Frequency Domain Features, Interspeech
Helin Wang, Yuexian Zou, Wenwu Wang (2021), SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification, Interspeech
Xu Zheng, Yan Song, Li-Rong Dai, Ian McLoughlin, Lin Liu (2021), An Effective Mutual Mean Teaching Based Domain Adaptation Method for Sound Event Detection, Interspeech
Ritika Nandi, Shashank Shekhar, Manjunath Mulimani (2021), Acoustic Scene Classification Using Kervolution-Based SubSpectralNet, Interspeech
Harshavardhan Sundar, Ming Sun, Chao Wang (2021), Event Specific Attention for Polyphonic Sound Event Detection, Interspeech
Yuan Gong, Yu-An Chung, James Glass (2021), AST: Audio Spectrogram Transformer, Interspeech
Soonshin Seo, Donghyun Lee, Ji-Hwan Kim (2021), Shallow Convolution-Augmented Transformer with Differentiable Neural Computer for Low-Complexity Classification of Variable-Length Acoustic Scene, Interspeech
Helen L. Bear, Veronica Morfi, Emmanouil Benetos (2021), An Evaluation of Data Augmentation Methods for Sound Scene Geotagging, Interspeech
Chiori Hori, Takaaki Hori, Jonathan Le Roux (2021), Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers, Interspeech
Shijing Si, Jianzong Wang, Huiming Sun, Jianhan Wu, Chuanyao Zhang, Xiaoyang Qu, Ning Cheng, Lei Chen, Jing Xiao (2021), Variational Information Bottleneck for Effective Low-Resource Audio Classification, Interspeech
Soham Deshmukh, Bhiksha Raj, Rita Singh (2021), Improving Weakly Supervised Sound Event Detection with Self-Supervised Auxiliary Tasks, Interspeech
Tatsuya Komatsu, Shinji Watanabe, Koichi Miyazaki, Tomoki Hayashi (2021), Acoustic Event Detection with Classifier Chains, Interspeech
Shu-Chuan Tseng, Yi-Fen Liu (2021), Segment and Tone Production in Continuous Speech of Hearing and Hearing-Impaired Children, Interspeech
Feng Wang, Jing Chen, Fei Chen (2021), Effect of Carrier Bandwidth on Understanding Mandarin Sentences in Simulated Electric-Acoustic Hearing, Interspeech
Manthan Sharma, Navaneetha Gaddam, Tejas Umesh, Aditya Murthy, Prasanta Kumar Ghosh (2021), A Comparative Study of Different EMG Features for Acoustics-to-EMG Mapping, Interspeech
Ajish K. Abraham, V. Sivaramakrishnan, N. Swapna, N. Manohar (2021), Image-Based Assessment of Jaw Parameters and Jaw Kinematics for Articulatory Simulation: Preliminary Results, Interspeech
Jianrong Wang, Nan Gu, Mei Yu, Xuewei Li, Qiang Fang, Li Liu (2021), An Attention Self-Supervised Contrastive Learning Based Three-Stage Model for Hand Shape Feature Representation in Cued Speech, Interspeech
Judith Dineley, Grace Lavelle, Daniel Leightley, Faith Matcham, Sara Siddi, Maria Teresa Peñarrubia-María, Katie M. White, Alina Ivan, Carolin Oetzmann, Sara Simblett, Erin Dawe-Lane, Stuart Bruce, Daniel Stahl, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Amos A. Folarin, Josep Maria Haro, Til Wykes, Richard J.B. Dobson, Vaibhav A. Narayan, Matthew Hotopf, Björn W. Schuller, Nicholas Cummins, The RADAR-CNS Consortium (2021), Remote Smartphone-Based Speech Collection: Acceptance and Barriers in Individuals with Major Depressive Disorder, Interspeech
Sarah R. Li, Colin T. Annand, Sarah Dugan, Sarah M. Schwab, Kathryn J. Eary, Michael Swearengen, Sarah Stack, Suzanne Boyce, Michael A. Riley, T. Douglas Mast (2021), An Automatic, Simple Ultrasound Biofeedback Parameter for Distinguishing Accurate and Misarticulated Rhotic Syllables, Interspeech
Manuel Sam Ribeiro, Aciel Eshky, Korin Richmond, Steve Renals (2021), Silent versus Modal Multi-Speaker Speech Recognition from Ultrasound and Video, Interspeech
David Ferreira, Samuel Silva, Francisco Curado, António Teixeira (2021), RaSSpeR: Radar-Based Silent Speech Recognition, Interspeech
Beiming Cao, Nordine Sebkhi, Arpan Bhavsar, Omer T. Inan, Robin Samlan, Ted Mau, Jun Wang (2021), Investigating Speech Reconstruction for Laryngectomees for Silent Speech Interfaces, Interspeech
Hendrik Schröter, Tobias Rosenkranz, Alberto N. Escalante-B, Andreas Maier (2021), LACOPE: Latency-Constrained Pitch Estimation for Speech Enhancement, Interspeech
Mathieu Fontaine, Kouhei Sekiguchi, Aditya Arie Nugraha, Yoshiaki Bando, Kazuyoshi Yoshii (2021), Alpha-Stable Autoregressive Fast Multichannel Nonnegative Matrix Factorization for Joint Speech Enhancement and Dereverberation, Interspeech
Siyuan Zhang, Xiaofei Li (2021), Microphone Array Generalization for Multichannel Narrowband Deep Speech Enhancement, Interspeech
Hyungchan Song, Jong Won Shin (2021), Multiple Sound Source Localization Based on Interchannel Phase Differences in All Frequencies with Spectral Masks, Interspeech
Pablo Pérez Zarazaga, Mariem Bouafif Mansali, Tom Bäckström, Zied Lachiri (2021), Cancellation of Local Competing Speaker with Near-Field Localization for Distributed ad-hoc Sensor Network, Interspeech
Hao Zhang, DeLiang Wang (2021), A Deep Learning Method to Multi-Channel Active Noise Control, Interspeech
Simone Graetzer, Jon Barker, Trevor J. Cox, Michael Akeroyd, John F. Culling, Graham Naylor, Eszter Porter, Rhoddy Viveros Muñoz (2021), Clarity-2021 Challenges: Machine Learning Challenges for Advancing Hearing Aid Processing, Interspeech
Zehai Tu, Ning Ma, Jon Barker (2021), Optimising Hearing Aid Fittings for Speech in Noise with a Differentiable Hearing Loss Model, Interspeech
Sunit Sivasankaran, Emmanuel Vincent, Dominique Fohr (2021), Explaining Deep Learning Models for Speech Enhancement, Interspeech
Weilong Huang, Jinwei Feng (2021), Minimum-Norm Differential Beamforming for Linear Array with Directional Microphones, Interspeech
Songjun Cao, Yueteng Kang, Yanzhe Fu, Xiaoshuo Xu, Sining Sun, Yike Zhang, Long Ma (2021), Improving Streaming Transformer Based ASR Under a Framework of Self-Supervised Learning, Interspeech
Samik Sadhu, Di He, Che-Wei Huang, Sri Harish Mallidi, Minhua Wu, Ariya Rastrow, Andreas Stolcke, Jasha Droppo, Roland Maas (2021), wav2vec-C: A Self-Supervised Model for Speech Representation Learning, Interspeech
Electra Wallington, Benji Kershenbaum, Ondřej Klejch, Peter Bell (2021), On the Learning Dynamics of Semi-Supervised Training for ASR, Interspeech
Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli (2021), Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training, Interspeech
Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori (2021), Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition, Interspeech
Ananya Misra, Dongseong Hwang, Zhouyuan Huo, Shefali Garg, Nikhil Siddhartha, Arun Narayanan, Khe Chai Sim (2021), A Comparison of Supervised and Unsupervised Pre-Training of End-to-End Models, Interspeech
Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Heiga Zen, Mohammadreza Ghodsi, Yinghui Huang, Jesse Emond, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno (2021), Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation, Interspeech
Tatiana Likhomanenko, Qiantong Xu, Jacob Kahn, Gabriel Synnaeve, Ronan Collobert (2021), slimIPL: Language-Model-Free Iterative Pseudo-Labeling, Interspeech
Xianghu Yue, Haizhou Li (2021), Phonetically Motivated Self-Supervised Speech Representation Learning, Interspeech
Yan Deng, Rui Zhao, Zhong Meng, Xie Chen, Bing Liu, Jinyu Li, Yifan Gong, Lei He (2021), Improving RNN-T for Domain Scaling Using Semi-Supervised Training with Neural TTS, Interspeech
Scott Seyfarth, Sundararajan Srinivasan, Katrin Kirchhoff (2021), Speaker-Conversation Factorial Designs for Diarization Error Analysis, Interspeech
Ross McGowan, Jinru Su, Vince DiCocco, Thejaswi Muniyappa, Grant P. Strimel (2021), SmallER: Scaling Neural Entity Resolution for Edge Devices, Interspeech
Johann C. Rocholl, Vicky Zayats, Daniel D. Walker, Noah B. Murad, Aaron Schneider, Daniel J. Liebling (2021), Disfluency Detection with Unlabeled Data and Small BERT Models, Interspeech
Qian Chen, Wen Wang, Mengzhe Chen, Qinglin Zhang (2021), Discriminative Self-Training for Punctuation Prediction, Interspeech
Mana Ihori, Naoki Makishima, Tomohiro Tanaka, Akihiko Takashima, Shota Orihashi, Ryo Masumura (2021), Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks Using Switching Tokens, Interspeech
Binghuai Lin, Liyuan Wang (2021), A Noise Robust Method for Word-Level Pronunciation Assessment, Interspeech
Jonathan Wintrode (2021), Targeted Keyword Filtering for Accelerated Spoken Topic Identification, Interspeech
Shruti Palaskar, Ruslan Salakhutdinov, Alan W. Black, Florian Metze (2021), Multimodal Speech Summarization Through Semantic Concept Learning, Interspeech
Hyunjae Lee, Jaewoong Yun, Hyunjin Choi, Seongho Joe, Youngjune L. Gwon (2021), Enhancing Semantic Understanding with Self-Supervised Methods for Abstractive Dialogue Summarization, Interspeech
Marcin Włodarczak, Emer Gilmartin (2021), Speaker Transition Patterns in Three-Party Conversation: Evidence from English, Estonian and Swedish, Interspeech
Samuel J. Broughton, Md. Asif Jalal, Roger K. Moore (2021), Investigating Deep Neural Structures and their Interpretability in the Domain of Voice Conversion, Interspeech
Kun Zhou, Berrak Sisman, Haizhou Li (2021), Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-Stage Sequence-to-Sequence Training, Interspeech
Yi-Yang Ding, Li-Juan Liu, Yu Hu, Zhen-Hua Ling (2021), Adversarial Voice Conversion Against Neural Spoofing Detectors, Interspeech
Xiangheng He, Junjie Chen, Georgios Rizos, Björn W. Schuller (2021), An Improved StarGAN for Emotional Voice Conversion: Enhancing Voice Quality and Data Augmentation, Interspeech
Ziyi Chen, Pengyuan Zhang (2021), TVQVC: Transformer Based Vector Quantized Variational Autoencoder with CTC Loss for Voice Conversion, Interspeech
Zhichao Wang, Xinyong Zhou, Fengyu Yang, Tao Li, Hongqiang Du, Lei Xie, Wendong Gan, Haitao Chen, Hai Li (2021), Enriching Source Style Transfer in Recognition-Synthesis Based Non-Parallel Voice Conversion, Interspeech
Jheng-hao Lin, Yist Y. Lin, Chung-Ming Chien, Hung-yi Lee (2021), S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations, Interspeech
Christopher Liberatore, Ricardo Gutierrez-Osuna (2021), An Exemplar Selection Algorithm for Native-Nonnative Voice Conversion, Interspeech
Jie Wang, Jingbei Li, Xintao Zhao, Zhiyong Wu, Shiyin Kang, Helen Meng (2021), Adversarially Learning Disentangled Speech Representations for Robust Multi-Factor Voice Conversion, Interspeech
Manh Luong, Viet Anh Tran (2021), Many-to-Many Voice Conversion Based Feature Disentanglement Using Variational Autoencoder, Interspeech
Oubaïda Chouchane, Baptiste Brossier, Jorge Esteban Gamboa Gamboa, Thomas Lardy, Hemlata Tak, Orhan Ermis, Madhu R. Kamble, Jose Patino, Nicholas Evans, Melek Önen, Massimiliano Todisco (2021), Privacy-Preserving Voice Anti-Spoofing Using Secure Multi-Party Computation, Interspeech
Ranya Aloufi, Hamed Haddadi, David Boyle (2021), Configurable Privacy-Preserving Automatic Speech Recognition, Interspeech
Scott Novotney, Yile Gu, Ivan Bulyko (2021), Adjunct-Emeritus Distillation for Semi-Supervised Language Model Adaptation, Interspeech
Jae Ro, Mingqing Chen, Rajiv Mathews, Mehryar Mohri, Ananda Theertha Suresh (2021), Communication-Efficient Agnostic Federated Averaging, Interspeech
Timm Koppelmann, Alexandru Nelus, Lea Schönherr, Dorothea Kolossa, Rainer Martin (2021), Privacy-Preserving Feature Extraction for Cloud-Based Wake Word Verification, Interspeech
Chao-Han Huck Yang, Sabato Marco Siniscalchi, Chin-Hui Lee (2021), PATE-AAE: Incorporating Adversarial Autoencoder into Private Aggregation of Teacher Ensembles for Spoken Command Classification, Interspeech
Haoxin Ma, Jiangyan Yi, Jianhua Tao, Ye Bai, Zhengkun Tian, Chenglong Wang (2021), Continual Learning for Fake Audio Detection, Interspeech
Muhammad A. Shah, Joseph Szurley, Markus Mueller, Athanasios Mouchtaris, Jasha Droppo (2021), Evaluating the Vulnerability of End-to-End Automatic Speech Recognition Models to Membership Inference Attacks, Interspeech
Amin Fazel, Wei Yang, Yulan Liu, Roberto Barra-Chicote, Yixiong Meng, Roland Maas, Jasha Droppo (2021), SynthASR: Unlocking Synthetic Data for Speech Recognition, Interspeech
Ananya Muguli, Lancelot Pinto, Nirmala R, Neeraj Sharma, Prashant Krishnan, Prasanta Kumar Ghosh, Rohit Kumar, Shrirama Bhat, Srikanth Raj Chetupalli, Sriram Ganapathy, Shreyas Ramoji, Viral Nanda (2021), DiCOVA Challenge: Dataset, Task, and Baseline System for COVID-19 Diagnosis Using Acoustics, Interspeech
Madhu R. Kamble, Jose A. Gonzalez-Lopez, Teresa Grau, Juan M. Espin, Lorenzo Cascioli, Yiqing Huang, Alejandro Gomez-Alanis, Jose Patino, Roberto Font, Antonio M. Peinado, Angel M. Gomez, Nicholas Evans, Maria A. Zuluaga, Massimiliano Todisco (2021), PANACEA Cough Sound-Based Diagnosis of COVID-19 for the DiCOVA 2021 Challenge, Interspeech
Vincent Karas, Björn W. Schuller (2021), Recognising Covid-19 from Coughing Using Ensembles of SVMs and LSTMs with Handcrafted and Deep Audio Features, Interspeech
Isabella Södergren, Maryam Pahlavan Nodeh, Prakash Chandra Chhipa, Konstantina Nikolaidou, György Kovács (2021), Detecting COVID-19 from Audio Recording of Coughs Using Random Forests and Support Vector Machines, Interspeech
Rohan Kumar Das, Maulik Madhavi, Haizhou Li (2021), Diagnosis of COVID-19 Using Auditory Acoustic Cues, Interspeech
John Harvill, Yash R. Wani, Mark Hasegawa-Johnson, Narendra Ahuja, David Beiser, David Chestek (2021), Classification of COVID-19 from Cough Using Autoregressive Predictive Coding Pretraining and Spectral Data Augmentation, Interspeech
Gauri Deshpande, Björn W. Schuller (2021), The DiCOVA 2021 Challenge — An Encoder-Decoder Approach for COVID-19 Recognition from Coughing Audio, Interspeech
Kotra Venkata Sai Ritwik, Shareef Babu Kalluri, Deepu Vijayasenan (2021), COVID-19 Detection from Spectral Features on the DiCOVA Dataset, Interspeech
Adria Mallol-Ragolta, Helena Cuesta, Emilia Gómez, Björn W. Schuller (2021), Cough-Based COVID-19 Detection with Contextual Attention Convolutional Neural Networks and Gender Information, Interspeech
Swapnil Bhosale, Upasana Tiwari, Rupayan Chakraborty, Sunil Kumar Kopparapu (2021), Contrastive Learning of Cough Descriptors for Automatic COVID-19 Preliminary Diagnosis, Interspeech
Flavio Avila, Amir H. Poorjam, Deepak Mittal, Charles Dognin, Ananya Muguli, Rohit Kumar, Srikanth Raj Chetupalli, Sriram Ganapathy, Maneesh Singh (2021), Investigating Feature Selection and Explainability for COVID-19 Diagnostics from Cough Sounds, Interspeech
Gábor Kiss, Dávid Sztahó, Miklós Gábriel Tulics (2021), Application for Detecting Depression, Parkinson’s Disease and Dysphonic Speech, Interspeech
Lenka Weingartová, Veronika Volná, Ewa Balejová (2021), Beey: More Than a Speech-to-Text Editor, Interspeech
Takayuki Arai (2021), Downsizing of Vocal-Tract Models to Line up Variations and Reduce Manufacturing Costs, Interspeech
Maël Fabien, Shantipriya Parida, Petr Motlicek, Dawei Zhu, Aravind Krishnan, Hoang H. Nguyen (2021), ROXANNE Research Platform: Automate Criminal Investigations, Interspeech
Alexandre Flucha, Anthony Larcher, Ambuj Mehrish, Sylvain Meignier, Florian Plaut, Nicolas Poupon, Yevhenii Prokopalo, Adrien Puertolas, Meysam Shamsi, Marie Tahon (2021), The LIUM Human Active Correction Platform for Speaker Diarization, Interspeech
Yoo Rhee Oh, Kiyoung Park (2021), On-Device Streaming Transformer-Based End-to-End Speech Recognition, Interspeech
J. Čmejla, T. Kounovský, J. Janský, Jiri Malek, M. Rozkovec, Z. Koldovský (2021), Advanced Semi-Blind Speaker Extraction and Tracking Implemented in Experimental Device with Revolving Dense Microphone Array, Interspeech
Hermann Ney (2021), Forty Years of Speech and Language Processing: From Bayes Decision Rule to Deep Learning, Interspeech
Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, Adrian Łańcucki, Ricard Marxer, Mateusz Opala, Piotr Pusz, Paweł Rychlikowski, Michał Stypułkowski (2021), Information Retrieval for ZeroSpeech 2021: The Submission by University of Wroclaw, Interspeech
Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, Adrian Łańcucki, Ricard Marxer, Mateusz Opala, Piotr Pusz, Paweł Rychlikowski, Michał Stypułkowski (2021), Aligned Contrastive Predictive Coding, Interspeech
Benjamin Suter, Josef Novak (2021), Neural Text Denormalization for Speech Transcripts, Interspeech
Aditya Joglekar, Seyed Omid Sadjadi, Meena Chandra-Shekar, Christopher Cieri, John H.L. Hansen (2021), Fearless Steps Challenge Phase-3 (FSC P3): Advancing SLT for Unseen Channel and Mission Data Across NASA Apollo Audio, Interspeech
Hannah Leykum (2021), Voice Quality in Verbal Irony: Electroglottographic Analyses of Ironic Utterances in Standard Austrian German, Interspeech
Mathilde Hutin, Yaru Wu, Adèle Jatteau, Ioana Vasilescu, Lori Lamel, Martine Adda-Decker (2021), Synchronic Fortition in Five Romance Languages? A Large Corpus-Based Study of Word-Initial Devoicing, Interspeech
Ivan Kraljevski, Maria Paola Bissiri, Frank Duckhorn, Constanze Tschoepe, Matthias Wolff (2021), Glottal Stops in Upper Sorbian: A Data-Driven Approach, Interspeech
Bogdan Ludusan, Petra Wagner, Marcin Włodarczak (2021), Cue Interaction in the Perception of Prosodic Prominence: The Role of Voice Quality, Interspeech
Jenifer Vega Rodriguez, Nathalie Vallée (2021), Glottal Sounds in Korebaju, Interspeech
Anaïs Chanclu, Imen Ben Amor, Cédric Gendrot, Emmanuel Ferragne, Jean-François Bonastre (2021), Automatic Classification of Phonation Types in Spontaneous Speech: Towards a New Workflow for the Characterization of Speakers’ Voice Quality, Interspeech
Rob J.J.H. van Son (2021), Measuring Voice Quality Parameters After Speaker Pseudonymization, Interspeech
Lars Steinert, Felix Putze, Dennis Küster, Tanja Schultz (2021), Audio-Visual Recognition of Emotional Engagement of People with Dementia, Interspeech
Pascal Hecker, Florian B. Pokorny, Katrin D. Bartl-Pokorny, Uwe Reichel, Zhao Ren, Simone Hantke, Florian Eyben, Dagmar M. Schuller, Bert Arnrich, Björn W. Schuller (2021), Speaking Corona? Human and Machine Recognition of COVID-19 from Voice, Interspeech
Huyen Nguyen, Ralph Vente, David Lupea, Sarah Ita Levitan, Julia Hirschberg (2021), Acoustic-Prosodic, Lexical and Demographic Cues to Persuasiveness in Competitive Debate Speeches, Interspeech
Bengt J. Borgström (2021), Unsupervised Bayesian Adaptation of PLDA for Speaker Verification, Interspeech
Weiqing Wang, Danwei Cai, Jin Wang, Qingjian Lin, Xuyang Wang, Mi Hong, Ming Li (2021), The DKU-Duke-Lenovo System Description for the Fearless Steps Challenge Phase III, Interspeech
Yafeng Chen, Wu Guo, Bin Gu (2021), Improved Meta-Learning Training for Speaker Verification, Interspeech
Dan Wang, Yuanjie Dong, Yaxing Li, Yunfei Zi, Zhihui Zhang, Xiaoqi Li, Shengwu Xiong (2021), Variational Information Bottleneck Based Regularization for Speaker Recognition, Interspeech
Niko Brümmer, Luciana Ferrer, Albert Swart (2021), Out of a Hundred Trials, How Many Errors Does Your Speaker Verifier Make?, Interspeech
Roza Chojnacka, Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno (2021), SpeakerStew: Scaling to Many Languages with a Triaged Multilingual Text-Dependent and Text-Independent Speaker Verification System, Interspeech
Zhiming Wang, Furong Xu, Kaisheng Yao, Yuan Cheng, Tao Xiong, Huijia Zhu (2021), AntVoice Neural Speaker Embedding System for FFSVC 2020, Interspeech
Jianchen Li, Jiqing Han, Hongwei Song (2021), Gradient Regularization for Noise-Robust Speaker Verification, Interspeech
Saurabh Kataria, Jesús Villalba, Piotr Żelasko, Laureano Moro-Velázquez, Najim Dehak (2021), Deep Feature CycleGANs: Speaker Identity Preserving Non-Parallel Microphone-Telephone Domain Adaptation for Speaker Verification, Interspeech
Jie Pu, Yuguang Yang, Ruirui Li, Oguz Elibol, Jasha Droppo (2021), Scaling Effect of Self-Supervised Speech Models, Interspeech
Yibo Wu, Longbiao Wang, Kong Aik Lee, Meng Liu, Jianwu Dang (2021), Joint Feature Enhancement and Speaker Recognition with Multi-Objective Task-Oriented Network, Interspeech
Li Zhang, Qing Wang, Kong Aik Lee, Lei Xie, Haizhou Li (2021), Multi-Level Transfer Learning from Near-Field to Far-Field Speaker Verification, Interspeech
Jose Patino, Natalia Tomashenko, Massimiliano Todisco, Andreas Nautsch, Nicholas Evans (2021), Speaker Anonymisation Using the McAdams Coefficient, Interspeech
Yiyu Luo, Jing Wang, Liang Xu, Lidong Yang (2021), Multi-Stream Gated and Pyramidal Temporal Convolutional Neural Networks for Audio-Visual Speech Separation in Multi-Talker Environments, Interspeech
Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu (2021), TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation, Interspeech
Jianjun Gu, Longbiao Cheng, Xingwei Sun, Junfeng Li, Yonghong Yan (2021), Residual Echo and Noise Cancellation with Feature Attention Module and Multi-Domain Loss Function, Interspeech
Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu (2021), MIMO Self-Attentive RNN Beamformer for Multi-Speaker Speech Separation, Interspeech
Ritwik Giri, Shrikant Venkataramani, Jean-Marc Valin, Umut Isik, Arvindh Krishnaswamy (2021), Personalized PercepNet: Real-Time, Low-Complexity Target Voice Separation and Enhancement, Interspeech
Yochai Yemini, Ethan Fetaya, Haggai Maron, Sharon Gannot (2021), Scene-Agnostic Multi-Microphone Speech Dereverberation, Interspeech
Keitaro Tanaka, Ryosuke Sawata, Shusuke Takahashi (2021), Manifold-Aware Deep Clustering: Maximizing Angles Between Embedding Vectors Based on Regular Simplex, Interspeech
Hao Zhang, DeLiang Wang (2021), A Deep Learning Approach to Multi-Channel and Multi-Microphone Acoustic Echo Cancellation, Interspeech
Yueyue Na, Ziteng Wang, Zhang Liu, Biao Tian, Qiang Fu (2021), Joint Online Multichannel Acoustic Echo Cancellation, Speech Dereverberation and Source Separation, Interspeech
Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoyuki Kamo (2021), Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition, Interspeech
Sathvik Udupa, Anwesha Roy, Abhayjeet Singh, Aravind Illa, Prasanta Kumar Ghosh (2021), Estimating Articulatory Movements in Speech Production with Transformer Networks, Interspeech
Dongchao Yang, Helin Wang, Yuexian Zou (2021), Unsupervised Multi-Target Domain Adaptation for Acoustic Scene Classification, Interspeech
Alfredo Esquivel Jaramillo, Jesper Kjær Nielsen, Mads Græsbøll Christensen (2021), Speech Decomposition Based on a Hybrid Speech Model and Optimal Segmentation, Interspeech
Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao (2021), Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation, Interspeech
Chiranjeevi Yarra, Prasanta Kumar Ghosh (2021), Noise Robust Pitch Stylization Using Minimum Mean Absolute Error Criterion, Interspeech
Yu-Lin Huang, Bo-Hao Su, Y.-W. Peter Hong, Chi-Chun Lee (2021), An Attribute-Aligned Strategy for Learning Speech Representation, Interspeech
Abdolreza Sabzi Shahrebabaki, Sabato Marco Siniscalchi, Torbjørn Svendsen (2021), Raw Speech-to-Articulatory Inversion by Temporal Filtering and Decimation, Interspeech
Jason Lilley, H. Timothy Bunnell (2021), Unsupervised Training of a DNN-Based Formant Tracker, Interspeech
Shu-wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee (2021), SUPERB: Speech Processing Universal PERformance Benchmark, Interspeech
Cong Zhang, Jian Zhu (2021), Synchronising Speech Segments with Musical Beats in Mandarin and English Singing, Interspeech
Jacob Peplinski, Joel Shor, Sachin Joglekar, Jake Garrison, Shwetak Patel (2021), FRILL: A Non-Semantic Speech Embedding for Mobile Devices, Interspeech
Hiroki Mori (2021), Pitch Contour Separation from Overlapping Speech, Interspeech
Anurag Kumar, Yun Wang, Vamsi Krishna Ithapu, Christian Fuegen (2021), Do Sound Event Representations Generalize to Other Audio Tasks? A Case Study in Audio Transfer Learning, Interspeech
Baolin Peng, Chenguang Zhu, Michael Zeng, Jianfeng Gao (2021), Data Augmentation for Spoken Language Understanding via Pretrained Language Models, Interspeech
Martin Radfar, Athanasios Mouchtaris, Siegfried Kunzmann, Ariya Rastrow (2021), FANS: Fusing ASR and NLU for On-Device SLU, Interspeech
Yiran Cao, Nihal Potdar, Anderson R. Avila (2021), Sequential End-to-End Intent and Slot Label Classification and Localization, Interspeech
Deepak Muralidharan, Joel Ruben Antony Moniz, Weicheng Zhang, Stephen Pulman, Lin Li, Megan Barnes, Jingjing Pan, Jason Williams, Alex Acero (2021), DEXTER: Deep Encoding of External Knowledge for Named Entity Recognition in Virtual Assistants, Interspeech
Ting-Wei Wu, Ruolin Su, Biing-Hwang Juang (2021), A Context-Aware Hierarchical BERT Fusion Network for Multi-Turn Dialog Act Detection, Interspeech
Qian Chen, Wen Wang, Qinglin Zhang (2021), Pre-Training for Spoken Language Understanding with Joint Textual and Phonetic Representation Learning, Interspeech
Quynh Do, Judith Gaspers, Daniil Sorokin, Patrick Lehnen (2021), Predicting Temporal Performance Drop of Deployed Production Spoken Language Understanding Models, Interspeech
Jatin Ganhotra, Samuel Thomas, Hong-Kwang J. Kuo, Sachindra Joshi, George Saon, Zoltán Tüske, Brian Kingsbury (2021), Integrating Dialog History into End-to-End Spoken Language Understanding Systems, Interspeech
Ting Han, Chongxuan Huang, Wei Peng (2021), Coreference Augmentation for Multi-Domain Task-Oriented Dialogue State Tracking, Interspeech
Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, Shinji Watanabe, Alan W. Black (2021), Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding, Interspeech
Jianwei Sun, Zhiyuan Tang, Hengxin Yin, Wei Wang, Xi Zhao, Shuaijiang Zhao, Xiaoning Lei, Wei Zou, Xiangang Li (2021), Semantic Data Augmentation for End-to-End Mandarin Speech Recognition, Interspeech
Xun Gong, Yizhou Lu, Zhikai Zhou, Yanmin Qian (2021), Layer-Wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition, Interspeech
Jinhan Wang, Yunzheng Zhu, Ruchao Fan, Wei Chu, Abeer Alwan (2021), Low Resource German ASR with Untranscribed Data Spoken by Non-Native Children — INTERSPEECH 2021 Shared Task SPAPL System, Interspeech
Khe Chai Sim, Angad Chandorkar, Fan Gao, Mason Chua, Tsendsuren Munkhdalai, Françoise Beaufays (2021), Robust Continuous On-Device Personalization for Automatic Speech Recognition, Interspeech
Shashi Kumar, Shakti P. Rath, Abhishek Pandey (2021), Speaker Normalization Using Joint Variational Autoencoder, Interspeech
Gaopeng Xu, Song Yang, Lu Ma, Chengfei Li, Zhongqin Wu (2021), The TAL System for the INTERSPEECH2021 Shared Task on Automatic Speech Recognition for Non-Native Childrens Speech, Interspeech
Tsz Kin Lam, Mayumi Ohta, Shigehiko Schamoni, Stefan Riezler (2021), On-the-Fly Aligned Data Augmentation for Sequence-to-Sequence ASR, Interspeech
Heting Gao, Junrui Ni, Yang Zhang, Kaizhi Qian, Shiyu Chang, Mark Hasegawa-Johnson (2021), Zero-Shot Cross-Lingual Phonetic Recognition with External Language Embedding, Interspeech
Yan Huang, Guoli Ye, Jinyu Li, Yifan Gong (2021), Rapid Speaker Adaptation for Conformer Transducer: Attention and Bias Are All You Need, Interspeech
Nilaksh Das, Sravan Bodapati, Monica Sunkara, Sundararajan Srinivasan, Duen Horng Chau (2021), Best of Both Worlds: Robust Accented Speech Recognition with Adversarial Transfer Learning, Interspeech
Wei Chu, Peng Chang, Jing Xiao (2021), Extending Pronunciation Dictionary with Automatically Detected Word Mispronunciations to Improve PAII’s System for Interspeech 2021 Non-Native Child English Close Track ASR Challenge, Interspeech
Tingle Li, Yichen Liu, Chenxu Hu, Hang Zhao (2021), CVC: Contrastive Learning for Non-Parallel Voice Conversion, Interspeech
Wen-Chin Huang, Kazuhiro Kobayashi, Yu-Huai Peng, Ching-Feng Liu, Yu Tsao, Hsin-Min Wang, Tomoki Toda (2021), A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion, Interspeech
Sefik Emre Eskimez, Dimitrios Dimitriadis, Kenichi Kumatani, Robert Gmyr (2021), One-Shot Voice Conversion with Speaker-Agnostic StarGAN, Interspeech
Takeshi Koshizuka, Hidefumi Ohmura, Kouichi Katsurada (2021), Fine-Tuning Pre-Trained Voice Conversion Model for Adding New Target Speakers with Limited Data, Interspeech
Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng (2021), VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-Shot Voice Conversion, Interspeech
Yinghao Aaron Li, Ali Zare, Nima Mesgarani (2021), StarGANv2-VC: A Diverse, Unsupervised, Non-Parallel Framework for Natural-Sounding Voice Conversion, Interspeech
Neeraj Kumar, Srishti Goel, Ankur Narang, Brejesh Lall (2021), Normalization Driven Zero-Shot Multi-Speaker Speech Synthesis, Interspeech
Shoki Sakamoto, Akira Taniguchi, Tadahiro Taniguchi, Hirokazu Kameoka (2021), StarGAN-VC+ASR: StarGAN-Based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition, Interspeech
Xuexin Xu, Liang Shi, Jinhui Chen, Xunquan Chen, Jie Lian, Pingyuan Lin, Zhihong Zhang, Edwin R. Hancock (2021), Two-Pathway Style Embedding for Arbitrary Voice Conversion, Interspeech
Yufei Liu, Chengzhu Yu, Wang Shuai, Zhenchuan Yang, Yang Chao, Weibin Zhang (2021), Non-Parallel Any-to-Many Voice Conversion by Replacing Speaker Statistics, Interspeech
Yi Zhou, Xiaohai Tian, Zhizheng Wu, Haizhou Li (2021), Cross-Lingual Voice Conversion with a Cycle Consistency Loss on Linguistic Representation, Interspeech
Hongqiang Du, Lei Xie (2021), Improving Robustness of One-Shot Voice Conversion with Deep Discriminative Speaker Encoder, Interspeech
Hannah White, Joshua Penney, Andy Gibson, Anita Szakay, Felicity Cox (2021), Optimizing an Automatic Creaky Voice Detection Method for Australian English Speaking Females, Interspeech
Joshua Penney, Andy Gibson, Felicity Cox, Michael Proctor, Anita Szakay (2021), A Comparison of Acoustic Correlates of Voice Quality Across Different Recording Devices: A Cautionary Tale, Interspeech
Anna Sfakianaki, George P. Kafentzis (2021), Investigating Voice Function Characteristics of Greek Speakers with Hearing Loss Using Automatic Glottal Source Feature Extraction, Interspeech
Mark Huckvale, Catinca Buciuleac (2021), Automated Detection of Voice Disorder in the Saarbrücken Voice Database: Effects of Pathology Subset and Audio Materials, Interspeech
Steven M. Lulich, Rita R. Patel (2021), Accelerometer-Based Measurements of Voice Quality in Children During Semi-Occluded Vocal Tract Exercise with a Narrow Straw in Air, Interspeech
Matthew Perez, Amrit Romana, Angela Roberts, Noelle Carlozzi, Jennifer Ann Miner, Praveen Dayalu, Emily Mower Provost (2021), Articulatory Coordination for Speech Motor Tracking in Huntington Disease, Interspeech
Carlos A. Ferrer, Efren Aragón, María E. Hdez-Díaz, Marc S. de Bodt, Roman Cmejla, Marina Englert, Mara Behlau, Elmar Nöth (2021), Modeling Dysphonia Severity as a Function of Roughness and Breathiness Ratings in the GRBAS Scale, Interspeech
Nikolay Karpov, Alexander Denisenko, Fedor Minkin (2021), Golos: Russian Dataset for Speech Research, Interspeech
Samik Sadhu, Hynek Hermansky (2021), Radically Old Way of Computing Spectra: Applications in End-to-End ASR, Interspeech
Ragheb Al-Ghezi, Yaroslav Getman, Aku Rouhe, Raili Hildén, Mikko Kurimo (2021), Self-Supervised End-to-End ASR for Low Resource L2 Swedish, Interspeech
Patrick K. O’Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko (2021), SPGISpeech: 5,000 Hours of Transcribed Financial Audio for Fully Formatted End-to-End Speech Recognition, Interspeech
Solène Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Estève, Benjamin Lecouteux, François Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier (2021), LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech, Interspeech
Pavel Šturm, Radek Skarnitzl, Tomáš Nechanský (2021), Prosodic Accommodation in Face-to-Face and Telephone Dialogues, Interspeech
Josiane Riverin-Coutlée, Conceição Cunha, Enkeleida Kapia, Jonathan Harrington (2021), Dialect Features in Heterogeneous and Homogeneous Gheg Speaking Communities, Interspeech
Margaret Zellers, Alena Witzlack-Makarevich, Lilja Saeboe, Saudah Namyalo (2021), An Exploration of the Acoustic Space of Rhotics and Laterals in Ruruuli, Interspeech
Kubra Bodur, Sweeney Branje, Morgane Peirolo, Ingrid Tiscareno, James S. German (2021), Domain-Initial Strengthening in Turkish: Acoustic Cues to Prosodic Hierarchy in Stop Consonants, Interspeech
Katerina Zmolikova, Marc Delcroix, Desh Raj, Shinji Watanabe, Jan Černocký (2021), Auxiliary Loss Function for Target Speech Extraction and Recognition with Weak Supervision Based on Speaker Characteristics, Interspeech
Marvin Borsdorf, Chenglin Xu, Haizhou Li, Tanja Schultz (2021), Universal Speaker Extraction in the Presence and Absence of Target Speakers for Speech of One and Two Talkers, Interspeech
Lukas Mateju, Frantisek Kynych, Petr Cerva, Jindrich Zdansky, Jiri Malek (2021), Using X-Vectors for Speech Activity Detection in Broadcast Streams, Interspeech
Daniele Salvati, Carlo Drioli, Gian Luca Foresti (2021), Time Delay Estimation for Speaker Localization Using CNN-Based Parametrized GCC-PHAT Features, Interspeech
Midia Yousefi, John H.L. Hansen (2021), Real-Time Speaker Counting in a Cocktail Party Scenario Using Attention-Guided Convolutional Neural Network, Interspeech
Hexin Liu, Leibny Paola García Perera, Xinyi Zhang, Justin Dauwels, Andy W.H. Khong, Sanjeev Khudanpur, Suzy J. Styles (2021), End-to-End Language Diarization for Bilingual Code-Switching Speech, Interspeech
Raphaël Duroselle, Md. Sahidullah, Denis Jouvet, Irina Illina (2021), Modeling and Training Strategies for Language Recognition Systems, Interspeech
Hui Wang, Lin Liu, Yan Song, Lei Fang, Ian McLoughlin, Li-Rong Dai (2021), A Weight Moving Average Based Alternate Decoupled Learning Algorithm for Long-Tailed Language Identification, Interspeech
Keqi Deng, Songjun Cao, Long Ma (2021), Improving Accent Identification and Accented Speech Recognition Under a Framework of Self-Supervised Learning, Interspeech
Zhiyun Fan, Meng Li, Shiyu Zhou, Bo Xu (2021), Exploring wav2vec 2.0 on Speaker Verification and Language Identification, Interspeech
G. Ramesh, C. Shiva Kumar, K. Sri Rama Murty (2021), Self-Supervised Phonotactic Representations for Language Identification, Interspeech
Jicheng Zhang, Yizhou Peng, Van Tung Pham, Haihua Xu, Hao Huang, Eng Siong Chng (2021), E2E-Based Multi-Task Learning Approach to Joint Speech and Accent Recognition, Interspeech
Moakala Tzudir, Shikha Baghel, Priyankoo Sarmah, S.R. Mahadeva Prasanna (2021), Excitation Source Feature Based Dialect Identification in Ao — A Low Resource Language, Interspeech
Shreya Khare, Ashish Mittal, Anuj Diwan, Sunita Sarawagi, Preethi Jyothi, Samarth Bharadwaj (2021), Low Resource ASR: The Surprising Effectiveness of High Resource Transliteration, Interspeech
Siyuan Feng, Piotr Żelasko, Laureano Moro-Velázquez, Odette Scharenborg (2021), Unsupervised Acoustic Unit Discovery by Leveraging a Language-Independent Subword Discriminative Feature Representation, Interspeech
Herman Kamper, Benjamin van Niekerk (2021), Towards Unsupervised Phone and Word Segmentation Using Self-Supervised Vector-Quantized Neural Networks, Interspeech
Dongwei Jiang, Wubo Li, Miao Cao, Wei Zou, Xiangang Li (2021), Speech SimCLR: Combining Contrastive and Reconstruction Objective for Self-Supervised Speech Representation Learning, Interspeech
Christiaan Jacobs, Herman Kamper (2021), Multilingual Transfer of Acoustic Word Embeddings Improves When Training on Languages Related to the Target Zero-Resource Language, Interspeech
Benjamin van Niekerk, Leanne Nortje, Matthew Baas, Herman Kamper (2021), Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing, Interspeech
Shun Takahashi, Sakriani Sakti, Satoshi Nakamura (2021), Unsupervised Neural-Based Graph Clustering for Variable-Length Speech Representation Discovery of Zero-Resource Languages, Interspeech
Takashi Maekaku, Xuankai Chang, Yuya Fujita, Li-Wei Chen, Shinji Watanabe, Alexander Rudnicky (2021), Speech Representation Learning Combining Conformer CPC with Deep Cluster for the ZeroSpeech Challenge 2021, Interspeech
Xia Cui, Amila Gamage, Terry Hanley, Tingting Mu (2021), Identifying Indicators of Vulnerability from Short Speech Segments Using Acoustic and Textual Features, Interspeech
Ewan Dunbar, Mathieu Bernard, Nicolas Hamilakis, Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Rivière, Eugene Kharitonov, Emmanuel Dupoux (2021), The Zero Resource Speech Challenge 2021: Spoken Language Modelling, Interspeech
Gautham Krishna Gudur, Satheesh Kumar Perepu (2021), Zero-Shot Federated Learning with New Classes for Audio Classification, Interspeech
Andrew Rouditchenko, Angie Boggust, David Harwath, Brian Chen, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Hilde Kuehne, Rameswar Panda, Rogerio Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James Glass (2021), AVLnet: Learning Audio-Visual Language Representations from Instructional Videos, Interspeech
Gyeong-Hoon Lee, Tae-Woo Kim, Hanbin Bae, Min-Ji Lee, Young-Ik Kim, Hoon-Young Cho (2021), N-Singer: A Non-Autoregressive Korean Singing Voice Synthesis System for Pronunciation Enhancement, Interspeech
Georgia Maniati, Nikolaos Ellinas, Konstantinos Markopoulos, Georgios Vamvoukakis, June Sig Sung, Hyoungmin Park, Aimilios Chalamandaris, Pirros Tsiakoulis (2021), Cross-Lingual Low Resource Speaker Adaptation Using Phonological Features, Interspeech
Haoyue Zhan, Haitong Zhang, Wenjie Ou, Yue Lin (2021), Improve Cross-Lingual Text-To-Speech Synthesis on Monolingual Corpora with Pitch Contour Information, Interspeech
Zhenchuan Yang, Weibin Zhang, Yufei Liu, Xiaofen Xing (2021), Cross-Lingual Voice Conversion with Disentangled Universal Linguistic Representations, Interspeech
Zhengchen Liu, Chenfeng Miao, Qingying Zhu, Minchuan Chen, Jun Ma, Shaojun Wang, Jing Xiao (2021), EfficientSing: A Chinese Singing Voice Synthesis System Using Duration-Free Acoustic Model and HiFi-GAN Vocoder, Interspeech
Detai Xin, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, Hiroshi Saruwatari (2021), Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis, Interspeech
Zengqiang Shang, Zhihua Huang, Haozhe Zhang, Pengyuan Zhang, Yonghong Yan (2021), Incorporating Cross-Speaker Style Transfer for Multi-Language Text-to-Speech, Interspeech
Ege Kesim, Engin Erzin (2021), Investigating Contributions of Speech and Facial Landmarks for Talking Head Generation, Interspeech
Shijing Si, Jianzong Wang, Xiaoyang Qu, Ning Cheng, Wenqi Wei, Xinghua Zhu, Jing Xiao (2021), Speech2Video: Cross-Modal Distillation for Speech to Video Generation, Interspeech
Junhyeok Lee, Seungu Han (2021), NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling, Interspeech
Gang-Xuan Lin, Shih-Wei Hu, Yen-Ju Lu, Yu Tsao, Chun-Shien Lu (2021), QISTA-Net-Audio: Audio Super-Resolution via Non-Convex ℓ_q-Norm Minimization, Interspeech
Liang Wen, Lizhong Wang, Xue Wen, Yuxing Zheng, Youngo Park, Kwang Pyo Choi (2021), X-net: A Joint Scale Down and Scale Up Method for Voice Call, Interspeech
Kexun Zhang, Yi Ren, Changliang Xu, Zhou Zhao (2021), WSRGlow: A Glow-Based Waveform Generative Model for Audio Super-Resolution, Interspeech
Jiangyan Yi, Ye Bai, Jianhua Tao, Haoxin Ma, Zhengkun Tian, Chenglong Wang, Tao Wang, Ruibo Fu (2021), Half-Truth: A Partially Fake Audio Detection Dataset, Interspeech
Bhusan Chettri, Rosa González Hautamäki, Md. Sahidullah, Tomi Kinnunen (2021), Data Quality as Predictor of Voice Anti-Spoofing Generalization, Interspeech
Youngju Cheon, Soojoong Hwang, Sangwook Han, Inseon Jang, Jong Won Shin (2021), Coded Speech Enhancement Using Neural Network-Based Vector-Quantized Residual Features, Interspeech
Lukas Drude, Jahn Heymann, Andreas Schwarz, Jean-Marc Valin (2021), Multi-Channel Opus Compression for Far-Field Automatic Speech Recognition with a Fixed Bitrate Budget, Interspeech
Ingo Siegert (2021), Effects of Prosodic Variations on Accidental Triggers of a Commercial Voice Assistant, Interspeech
Adam Gabryś, Yunlong Jiao, Viacheslav Klimkov, Daniel Korzekwa, Roberto Barra-Chicote (2021), Improving the Expressiveness of Neural Vocoding with Non-Affine Normalizing Flows, Interspeech
Gauri P. Prajapati, Dipesh K. Singh, Preet P. Amin, Hemant A. Patil (2021), Voice Privacy Through x-Vector and CycleGAN-Based Anonymization, Interspeech
Ju Lin, Yun Wang, Kaustubh Kalgaonkar, Gil Keren, Didi Zhang, Christian Fuegen (2021), A Two-Stage Approach to Speech Bandwidth Extension, Interspeech
Joon Byun, Seungmin Shin, Youngcheol Park, Jongmo Sung, Seungkwon Beack (2021), Development of a Psychoacoustic Loss Function for the Deep Neural Network (DNN)-Based Speech Coder, Interspeech
Dimitrios Stoidis, Andrea Cavallaro (2021), Protecting Gender and Identity with Disentangled Speech Representations, Interspeech
Yahya Aldholmi, Rawan Aldhafyan, Asma Alqahtani (2021), Perception of Standard Arabic Synthetic Speech Rate, Interspeech
Takeshi Kishiyama (2021), The Influence of Parallel Processing on Illusory Vowels, Interspeech
Anupama Chingacham, Vera Demberg, Dietrich Klakow (2021), Exploring the Potential of Lexical Paraphrases for Mitigating Noise-Induced Comprehension Errors, Interspeech
Olympia Simantiraki, Martin Cooke (2021), SpeechAdjuster: A Tool for Investigating Listener Preferences and Speech Intelligibility, Interspeech
Susumu Saito, Yuta Ide, Teppei Nakano, Tetsuji Ogawa (2021), VocalTurk: Exploring Feasibility of Crowdsourced Speaker Identification, Interspeech
Min Xu, Jing Shao, Lan Wang (2021), Effects of Aging and Age-Related Hearing Loss on Talker Discrimination, Interspeech
Yuqing Zhang, Zhu Li, Bin Wu, Yanlu Xie, Binghuai Lin, Jinsong Zhang (2021), Relationships Between Perceptual Distinctiveness, Articulatory Complexity and Functional Load in Speech Communication, Interspeech
Camryn Terblanche, Philip Harrison, Amelia J. Gully (2021), Human Spoofing Detection Performance on Degraded Speech, Interspeech
Marieke Einfeldt, Rita Sevastjanova, Katharina Zahner-Ritter, Ekaterina Kazak, Bettina Braun (2021), Reliable Estimates of Interpretable Cue Effects with Active Learning in Psycholinguistic Research, Interspeech
Puneet Kumar, Vishesh Kaushik, Balasubramanian Raman (2021), Towards the Explainability of Multimodal Speech Emotion Recognition, Interspeech
Biao Zeng, Rui Wang, Guoxing Yu, Christian Dobel (2021), Primacy of Mouth over Eyes: Eye Movement Evidence from Audiovisual Mandarin Lexical Tones and Vowels, Interspeech
Takanori Ashihara, Takafumi Moriya, Makio Kashino (2021), Investigating the Impact of Spectral and Temporal Degradation on End-to-End Automatic Speech Recognition Performance, Interspeech
Thai-Son Nguyen, Sebastian Stüker, Alex Waibel (2021), Super-Human Performance in Online Low-Latency Recognition of Conversational Speech, Interspeech
Vikas Joshi, Amit Das, Eric Sun, Rupesh R. Mehta, Jinyu Li, Yifan Gong (2021), Multiple Softmax Architecture for Streaming Multilingual End-to-End ASR Systems, Interspeech
Duc Le, Mahaveer Jain, Gil Keren, Suyoun Kim, Yangyang Shi, Jay Mahadeokar, Julian Chan, Yuan Shangguan, Christian Fuegen, Ozlem Kalinli, Yatharth Saraf, Michael L. Seltzer (2021), Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion, Interspeech
Tara N. Sainath, Yanzhang He, Arun Narayanan, Rami Botros, Ruoming Pang, David Rybach, Cyril Allauzen, Ehsan Variani, James Qin, Quoc-Nam Le-The, Shuo-Yiin Chang, Bo Li, Anmol Gulati, Jiahui Yu, Chung-Cheng Chiu, Diamantino Caseiro, Wei Li, Qiao Liang, Pat Rondon (2021), An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling, Interspeech
Liang Lu, Naoyuki Kanda, Jinyu Li, Yifan Gong (2021), Streaming Multi-Talker Speech Recognition with Joint Speaker Identification, Interspeech
Takafumi Moriya, Tomohiro Tanaka, Takanori Ashihara, Tsubasa Ochiai, Hiroshi Sato, Atsushi Ando, Ryo Masumura, Marc Delcroix, Taichi Asami (2021), Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture, Interspeech
Andreas Schwarz, Ilya Sklyar, Simon Wiesler (2021), Improving RNN-T ASR Accuracy Using Context Audio, Interspeech
Lu Huang, Jingyu Sun, Yufeng Tang, Junfeng Hou, Jinkun Chen, Jun Zhang, Zejun Ma (2021), HMM-Free Encoder Pre-Training for Streaming RNN Transducer, Interspeech
Xiaodong Cui, Brian Kingsbury, George Saon, David Haws, Zoltán Tüske (2021), Reducing Exposure Bias in Training Recurrent Neural Network Transducers, Interspeech
Thibault Doutre, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Olivier Siohan, Liangliang Cao (2021), Bridging the Gap Between Streaming and Non-Streaming ASR Systems by Distilling Ensembles of CTC and RNN-T Models, Interspeech
Kartik Audhkhasi, Tongzhou Chen, Bhuvana Ramabhadran, Pedro J. Moreno (2021), Mixture Model Attention: Flexible Streaming and Non-Streaming Automatic Speech Recognition, Interspeech
Hirofumi Inaguma, Tatsuya Kawahara (2021), StableEmit: Selection Probability Discount for Reducing Emission Latency of Streaming Monotonic Attention ASR, Interspeech
Niko Moritz, Takaaki Hori, Jonathan Le Roux (2021), Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition, Interspeech
Kwangyoun Kim, Felix Wu, Prashant Sridhar, Kyu J. Han, Shinji Watanabe (2021), Multi-Mode Transformer Transducer with Stochastic Future Context, Interspeech
Xinlei Ren, Xu Zhang, Lianwu Chen, Xiguang Zheng, Chen Zhang, Liang Guo, Bing Yu (2021), A Causal U-Net Based Neural Beamforming Network for Real-Time Multi-Channel Speech Enhancement, Interspeech
Rui Zhu, Feiran Yang, Yuepeng Li, Shidong Shang (2021), A Partitioned-Block Frequency-Domain Adaptive Kalman Filter for Stereophonic Acoustic Echo Cancellation, Interspeech
Taihui Wang, Feiran Yang, Rui Zhu, Jun Yang (2021), Real-Time Independent Vector Analysis Using Semi-Supervised Nonnegative Matrix Factorization as a Source Model, Interspeech
Jiangyu Han, Wei Rao, Yannan Wang, Yanhua Long (2021), Improving Channel Decorrelation for Multi-Channel Target Speech Extraction, Interspeech
Jinjiang Liu, Xueliang Zhang (2021), Inplace Gated Convolutional Recurrent Neural Network for Dual-Channel Speech Enhancement, Interspeech
R.G. Prithvi Raj, Rohit Kumar, M.K. Jayesh, Anurenjan Purushothaman, Sriram Ganapathy, M.A. Basha Shaik (2021), SRIB-LEAP Submission to Far-Field Multi-Channel Speech Enhancement Challenge for Video Conferencing, Interspeech
Cheng Xue, Weilong Huang, Weiguang Chen, Jinwei Feng (2021), Real-Time Multi-Channel Speech Enhancement Based on Neural Network Masking with Attention Model, Interspeech
Sriram Ganapathy (2021), Uncovering the Acoustic Cues of COVID-19 Infection, Interspeech
Pascale Fung (2021), Ethical and Technological Challenges of Conversational AI, Interspeech
Dominique Fohr, Irina Illina (2021), BERT-Based Semantic Model for Rescoring N-Best Speech Recognition List, Interspeech
Karel Beneš, Lukáš Burget (2021), Text Augmentation for Language Models in High Error Recognition Scenario, Interspeech
Yingbo Gao, David Thulke, Alexander Gerstenberger, Khoa Viet Tran, Ralf Schlüter, Hermann Ney (2021), On Sampling-Based Training Criteria for Neural Language Modeling, Interspeech
Janne Pylkkönen, Antti Ukkonen, Juho Kilpikoski, Samu Tamminen, Hannes Heikinheimo (2021), Fast Text-Only Domain Adaptation of RNN-Transducer Prediction Network, Interspeech
Christopher Cieri, James Fiumara, Jonathan Wright (2021), Using Games to Augment Corpora for Language Recognition and Confusability, Interspeech
Gianni Fenu, Mirko Marras, Giacomo Medda, Giacomo Meloni (2021), Fair Voice Biometrics: Impact of Demographic Imbalance on Group Fairness in Speaker Recognition, Interspeech
Leying Zhang, Zhengyang Chen, Yanmin Qian (2021), Knowledge Distillation from Multi-Modality to Single-Modality for Person Verification, Interspeech
Paul-Gauthier Noé, Mohammad Mohammadamini, Driss Matrouf, Titouan Parcollet, Andreas Nautsch, Jean-François Bonastre (2021), Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation, Interspeech
Amrit Romana, John Bandon, Matthew Perez, Stephanie Gutierrez, Richard Richter, Angela Roberts, Emily Mower Provost (2021), Automatically Detecting Errors and Disfluencies in Read Speech to Predict Cognitive Impairment in People with Parkinson’s Disease, Interspeech
Robin Vaysse, Jérôme Farinas, Corine Astésano, Régine André-Obrecht (2021), Automatic Extraction of Speech Rhythm Descriptors for Speech Intelligibility Assessment in the Context of Head and Neck Cancers, Interspeech
Jinzi Qi, Hugo Van hamme (2021), Speech Disorder Classification Using Extended Factorized Hierarchical Variational Auto-Encoders, Interspeech
Vikram C. Mathad, Tristan J. Mahr, Nancy Scherer, Kathy Chapman, Katherine C. Hustad, Julie Liss, Visar Berisha (2021), The Impact of Forced-Alignment Errors on Automatic Pronunciation Evaluation, Interspeech
Esaú Villatoro-Tello, S. Pavankumar Dubagunta, Julian Fritsch, Gabriela Ramírez-de-la-Rosa, Petr Motlicek, Mathew Magimai-Doss (2021), Late Fusion of the Available Lexicon and Raw Waveform-Based Acoustic Modeling for Depression and Dementia Recognition, Interspeech
Amin Honarmandi Shandiz, László Tóth, Gábor Gosztolya, Alexandra Markó, Tamás Gábor Csapó (2021), Neural Speaker Embeddings for Ultrasound-Based Silent Speech Interfaces, Interspeech
Jatin Lamba, Abhishek, Jayaprakash Akula, Rishabh Dabral, Preethi Jyothi, Ganesh Ramakrishnan (2021), Cross-Modal Learning for Audio-Visual Video Parsing, Interspeech
Darren Cook, Miri Zilka, Simon Maskell, Laurence Alison (2021), A Psychology-Driven Computational Analysis of Political Interviews, Interspeech
Jennifer Santoso, Takeshi Yamada, Shoji Makino, Kenkichi Ishizuka, Takekatsu Hiramura (2021), Speech Emotion Recognition Based on Attention Weight Correction Using Word-Level Confidence Measure, Interspeech
Alif Silpachai, Ivana Rehman, Taylor Anne Barriuso, John Levis, Evgeny Chukharev-Hudilainen, Guanlong Zhao, Ricardo Gutierrez-Osuna (2021), Effects of Voice Type and Task on L2 Learners’ Awareness of Pronunciation Errors, Interspeech
Alla Menshikova, Daniil Kocharov, Tatiana Kachkovskaia (2021), Lexical Entrainment and Intra-Speaker Variability in Cooperative Dialogues, Interspeech
Shamila Nasreen, Julian Hough, Matthew Purver (2021), Detecting Alzheimer’s Disease Using Interactional and Acoustic Features from Spontaneous Speech, Interspeech
Hardik Kothare, Vikram Ramanarayanan, Oliver Roesler, Michael Neumann, Jackson Liscombe, William Burke, Andrew Cornish, Doug Habberstad, Alaa Sakallah, Sara Markuson, Seemran Kansara, Afik Faerman, Yasmine Bensidi-Slimane, Laura Fry, Saige Portera, David Suendermann-Oeft, David Pautler, Carly Demopoulos (2021), Investigating the Interplay Between Affective, Phonatory and Motoric Subsystems in Autism Spectrum Disorder Using a Multimodal Dialogue Agent, Interspeech
Carlos Toshinori Ishi, Taiken Shintani (2021), Analysis of Eye Gaze Reasons and Gaze Aversions During Three-Party Conversations, Interspeech
Suyoun Kim, Abhinav Arora, Duc Le, Ching-Feng Yeh, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer (2021), Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding, Interspeech
Xiaoqiang Wang, Yanqing Liu, Sheng Zhao, Jinyu Li (2021), A Light-Weight Contextual Spelling Correction Model for Customizing Transducer-Based Speech Recognition Systems, Interspeech
Ning Shi, Wei Wang, Boxin Wang, Jinfeng Li, Xiangyu Liu, Zhouhan Lin (2021), Incorporating External POS Tagger for Punctuation Restoration, Interspeech
Vasileios Papadourakis, Markus Müller, Jing Liu, Athanasios Mouchtaris, Maurizio Omologo (2021), Phonetically Induced Subwords for End-to-End Speech Recognition, Interspeech
Courtney Mansfield, Sara Ng, Gina-Anne Levow, Richard A. Wright, Mari Ostendorf (2021), Revisiting Parity of Human vs. Machine Conversational Speech Transcription, Interspeech
W. Ronny Huang, Tara N. Sainath, Cal Peyser, Shankar Kumar, David Rybach, Trevor Strohman (2021), Lookup-Table Recurrent Language Models for Long Tail Speech Recognition, Interspeech
Jesús Andrés-Ferrer, Dario Albesano, Puming Zhan, Paul Vozila (2021), Contextual Density Ratio for Language Model Biasing of Sequence to Sequence ASR Systems, Interspeech
Qiushi Huang, Tom Ko, H. Lilian Tang, Xubo Liu, Bo Wu (2021), Token-Level Supervised Contrastive Learning for Punctuation Restoration, Interspeech
Yun Zhao, Xuerui Yang, Jinchao Wang, Yongyu Gao, Chao Yan, Yuanfu Zhou (2021), BART Based Semantic Correction for Mandarin Automatic Speech Recognition System, Interspeech
Lingfeng Dai, Qi Liu, Kai Yu (2021), Class-Based Neural Network Language Model for Second-Pass Rescoring in ASR, Interspeech
Gakuto Kurata, George Saon, Brian Kingsbury, David Haws, Zoltán Tüske (2021), Improving Customization of Neural Transducers by Mitigating Acoustic Mismatch of Synthesized Audio, Interspeech
Mandana Saebi, Ernest Pusateri, Aaksha Meghawat, Christophe Van Gysel (2021), A Discriminative Entity-Aware Language Model for Virtual Assistants, Interspeech
Mahdi Namazifar, John Malik, Li Erran Li, Gokhan Tur, Dilek Hakkani Tür (2021), Correcting Automated and Manual Speech Transcription Errors Using Warped Language Models, Interspeech
Yangyang Shi, Varun Nagaraja, Chunyang Wu, Jay Mahadeokar, Duc Le, Rohit Prabhavalkar, Alex Xiao, Ching-Feng Yeh, Julian Chan, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer (2021), Dynamic Encoder Transducer: A Flexible Solution for Trading Off Accuracy for Latency, Interspeech
Shiqi Zhang, Yan Liu, Deyi Xiong, Pei Zhang, Boxing Chen (2021), Domain-Aware Self-Attention for Multi-Domain Neural Machine Translation, Interspeech
Albert Zeyer, André Merboldt, Wilfried Michel, Ralf Schlüter, Hermann Ney (2021), Librispeech Transducer Model with Internal Language Model Prior Correction, Interspeech
Sepand Mavandadi, Tara N. Sainath, Ke Hu, Zelin Wu (2021), A Deliberation-Based Joint Acoustic and Text Decoder, Interspeech
Zoltán Tüske, George Saon, Brian Kingsbury (2021), On the Limit of English Conversational Speech Recognition, Interspeech
Keyu An, Yi Zhang, Zhijian Ou (2021), Deformable TDNN with Adaptive Receptive Fields for Speech Recognition, Interspeech
Zhao You, Shulin Feng, Dan Su, Dong Yu (2021), SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts, Interspeech
Chi-Hang Leong, Yu-Han Huang, Jen-Tzung Chien (2021), Online Compressive Transformer for End-to-End Speech Recognition, Interspeech
Binghuai Lin, Liyuan Wang (2021), End to End Transformer-Based Contextual Speech Recognition Based on Pointer Network, Interspeech
Shigeki Karita, Yotaro Kubo, Michiel Adriaan Unico Bacchiani, Llion Jones (2021), A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition, Interspeech
Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux (2021), Advanced Long-Context End-to-End Speech Recognition Using Context-Expanded Transformers, Interspeech
Md. Akmal Haidar, Chao Xing, Mehdi Rezagholizadeh (2021), Transformer-Based ASR Incorporating Time-Reduction Layer and Fine-Tuning with Self-Knowledge Distillation, Interspeech
Jay Mahadeokar, Yangyang Shi, Yuan Shangguan, Chunyang Wu, Alex Xiao, Hang Su, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer (2021), Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios, Interspeech
Przemyslaw Falkowski-Gilski (2021), Difference in Perceived Speech Signal Quality Assessment Among Monolingual and Bilingual Teenage Students, Interspeech
Christopher Schymura, Benedikt Bönninghoff, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa (2021), PILOT: Introducing Transformers for Probabilistic Sound Event Localization, Interspeech
Masahito Togami, Robin Scheibler (2021), Sound Source Localization with Majorization Minimization, Interspeech
Gabriel Mittag, Babak Naderi, Assmaa Chehadi, Sebastian Möller (2021), NISQA: A Deep CNN-Self-Attention Model for Multidimensional Speech Quality Prediction with Crowdsourced Datasets, Interspeech
Babak Naderi, Ross Cutler (2021), Subjective Evaluation of Noise Suppression Algorithms in Crowdsourcing, Interspeech
Jianhua Geng, Sifan Wang, Juan Li, JingWei Li, Xin Lou (2021), Reliable Intensity Vector Selection for Multi-Source Direction-of-Arrival Estimation Using a Single Acoustic Vector Sensor, Interspeech
Meng Yu, Chunlei Zhang, Yong Xu, Shi-Xiong Zhang, Dong Yu (2021), MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment, Interspeech
Andrea Toma, Daniele Salvati, Carlo Drioli, Gian Luca Foresti (2021), CNN-Based Processing of Acoustic and Radio Frequency Signals for Speaker Localization from MAVs, Interspeech
Katsutoshi Itoyama, Yoshiya Morimoto, Shungo Masaki, Ryosuke Kojima, Kenji Nishida, Kazuhiro Nakadai (2021), Assessment of von Mises-Bernoulli Deep Neural Network in Sound Source Localization, Interspeech
Rongliang Liu, Nengheng Zheng, Xi Chen (2021), Feature Fusion by Attention Networks for Robust DOA Estimation, Interspeech
Shoufeng Lin, Zhaojie Luo (2021), Far-Field Speaker Localization and Adaptive GLMB Tracking, Interspeech
Vivek Sivaraman Narayanaswamy, Jayaraman J. Thiagarajan, Andreas Spanias (2021), On the Design of Deep Priors for Unsupervised Audio Restoration, Interspeech
Weiguang Chen, Cheng Xue, Xionghu Zhong (2021), Cramér-Rao Lower Bound for DOA Estimation with an Array of Directional Microphones in Reverberant Environments, Interspeech
Jaeseong You, Dalhyun Kim, Gyuhyeon Nam, Geumbyeol Hwang, Gyeongsu Chae (2021), GAN Vocoder: Multi-Resolution Discriminator Is All You Need, Interspeech
Jian Cong, Shan Yang, Lei Xie, Dan Su (2021), Glow-WaveGAN: Learning Speech Representations from GAN-Based Variational Auto-Encoder for High Fidelity Flow-Based Speech Synthesis, Interspeech
Reo Yoneyama, Yi-Chiao Wu, Tomoki Toda (2021), Unified Source-Filter GAN: Unified Source-Filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN, Interspeech
Kazuki Mizuta, Tomoki Koriyama, Hiroshi Saruwatari (2021), Harmonic WaveGAN: GAN-Based Speech Waveform Generation Model with Harmonic Structure Discriminator, Interspeech
Ji-Hoon Kim, Sang-Hoon Lee, Ji-Hyun Lee, Seong-Whan Lee (2021), Fre-GAN: Adversarial Frequency-Consistent Audio Synthesis, Interspeech
Jinhyeok Yang, Jae-Sung Bae, Taejun Bak, Young-Ik Kim, Hoon-Young Cho (2021), GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis, Interspeech
Won Jang, Dan Lim, Jaesam Yoon, Bongwan Kim, Juntae Kim (2021), UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation, Interspeech
Mohammed Salah Al-Radhi, Tamás Gábor Csapó, Csaba Zainkó, Géza Németh (2021), Continuous Wavelet Vocoder-Based Decomposition of Parametric Speech Waveform Synthesis, Interspeech
Patrick Lumban Tobing, Tomoki Toda (2021), High-Fidelity and Low-Latency Universal Neural Vocoder Based on Multiband WaveRNN with Data-Driven Linear Prediction for Discrete Waveform Modeling, Interspeech
Zhengxi Liu, Yanmin Qian (2021), Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition, Interspeech
Min-Jae Hwang, Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim (2021), High-Fidelity Parallel WaveGAN with Multi-Band Harmonic-Plus-Noise Model, Interspeech
Junkun Chen, Mingbo Ma, Renjie Zheng, Liang Huang (2021), SpecRec: An Alternative Solution for Improving End-to-End Speech-to-Text Translation via Spectrogram Reconstruction, Interspeech
Colin Cherry, Naveen Arivazhagan, Dirk Padfield, Maxim Krikun (2021), Subtitle Translation as Markup Translation, Interspeech
Changhan Wang, Anne Wu, Juan Pino, Alexei Baevski, Michael Auli, Alexis Conneau (2021), Large-Scale Self- and Semi-Supervised Learning for Speech Translation, Interspeech
Changhan Wang, Anne Wu, Jiatao Gu, Juan Pino (2021), CoVoST 2 and Massively Multilingual Speech Translation, Interspeech
Yao-Fei Cheng, Hung-Shin Lee, Hsin-Min Wang (2021), AlloST: Low-Resource Speech Translation Without Source Transcription, Interspeech
Johanes Effendi, Sakriani Sakti, Satoshi Nakamura (2021), Weakly-Supervised Speech-to-Text Mapping with Visually Connected Non-Parallel Speech-Text Data Using Cyclic Partially-Aligned Transformer, Interspeech
Hirotaka Tokuyama, Sakriani Sakti, Katsuhito Sudoh, Satoshi Nakamura (2021), Transcribing Paralinguistic Acoustic Cues to Target Language Text in Transformer-Based Speech-to-Text Translation, Interspeech
Rong Ye, Mingxuan Wang, Lei Li (2021), End-to-End Speech Translation via Cross-Modal Progressive Training, Interspeech
Yuka Ko, Katsuhito Sudoh, Sakriani Sakti, Satoshi Nakamura (2021), ASR Posterior-Based Loss for Multi-Task End-to-End Speech Translation, Interspeech
Alejandro Pérez-González-de-Martos, Javier Iranzo-Sánchez, Adrià Giménez Pastor, Javier Jorge, Joan-Albert Silvestre-Cerdà, Jorge Civera, Albert Sanchis, Alfons Juan (2021), Towards Simultaneous Machine Interpretation, Interspeech
Giuseppe Martucci, Mauro Cettolo, Matteo Negri, Marco Turchi (2021), Lexical Modeling of ASR Errors for Robust Speech Translation, Interspeech
Piyush Vyas, Anastasia Kuznetsova, Donald S. Williamson (2021), Optimally Encoding Inductive Biases into the Transformer Improves End-to-End Speech Translation, Interspeech
Tejaswini Ananthanarayana, Lipisha Chaudhary, Ifeoma Nwogu (2021), Effects of Feature Scaling and Fusion on Sign Language Translation, Interspeech
Alexander Alenin, Anton Okhotnikov, Rostislav Makarov, Nikita Torgashov, Ilya Shigabeev, Konstantin Simonchik (2021), The ID R&D System Description for Short-Duration Speaker Verification Challenge 2021, Interspeech
Jenthe Thienpondt, Brecht Desplanques, Kris Demuynck (2021), Integrating Frequency Translational Invariance in TDNNs and Frequency Positional Information in 2D ResNets to Enhance Speaker Verification, Interspeech
Aleksei Gusev, Alisa Vinogradova, Sergey Novoselov, Sergei Astapov (2021), SdSVC Challenge 2021: Tips and Tricks to Boost the Short-Duration Speaker Verification System Performance, Interspeech
Woo Hyun Kang, Nam Soo Kim (2021), Team02 Text-Independent Speaker Verification System for SdSV Challenge 2021, Interspeech
Xiaoyi Qin, Chao Wang, Yong Ma, Min Liu, Shilei Zhang, Ming Li (2021), Our Learned Lessons from Cross-Lingual Speaker Verification: The CRMI-DKU System Description for the Short-Duration Speaker Verification Challenge 2021, Interspeech
Peng Zhang, Peng Hu, Xueliang Zhang (2021), Investigation of IMU&Elevoc Submission for the Short-Duration Speaker Verification Challenge 2021, Interspeech
Jie Yan, Shengyu Yao, Yiqian Pan, Wei Chen (2021), The Sogou System for Short-Duration Speaker Verification Challenge 2021, Interspeech
Bing Han, Zhengyang Chen, Zhikai Zhou, Yanmin Qian (2021), The SJTU System for Short-Duration Speaker Verification Challenge 2021, Interspeech
Sungjae Cho, Soo-Young Lee (2021), Multi-Speaker Emotional Text-to-Speech Synthesizer, Interspeech
Aleš Pražák, Zdeněk Loose, Josef V. Psutka, Vlasta Radová, Josef Psutka, Jan Švec (2021), Live TV Subtitling Through Respeaking, Interspeech
Stefan Fragner, Tobias Topar, Maximilian Giller, Lukas Pfeifenberger, Franz Pernkopf (2021), Autonomous Robot for Measuring Room Impulse Responses, Interspeech
Jonas Beskow, Charlie Caper, Johan Ehrenfors, Nils Hagberg, Anne Jansen, Chris Wood (2021), Expressive Robot Performance Based on Facial Motion Capture, Interspeech
Mónica Domínguez, Juan Soler-Company, Leo Wanner (2021), ThemePro 2.0: Showcasing the Role of Thematic Progression in Engaging Human-Computer Interaction, Interspeech
Sai Guruju, Jithendra Vepa (2021), Addressing Compliance in Call Centers with Entity Extraction, Interspeech
Krishnachaitanya Gogineni, Tarun Reddy Yadama, Jithendra Vepa (2021), Audio Segmentation Based Conversational Silence Detection for Contact Center Calls, Interspeech
Desh Raj, Sanjeev Khudanpur (2021), Reformulating DOVER-Lap Label Mapping as a Graph Partitioning Problem, Interspeech
Hemlata Tak, Jee-weon Jung, Jose Patino, Massimiliano Todisco, Nicholas Evans (2021), Graph Attention Networks for Anti-Spoofing, Interspeech
Victoria Mingote, Antonio Miguel, Alfonso Ortega, Eduardo Lleida (2021), Log-Likelihood-Ratio Cost Function as Objective Loss for Speaker Verification Systems, Interspeech
Junyi Peng, Xiaoyang Qu, Rongzhi Gu, Jianzong Wang, Jing Xiao, Lukáš Burget, Jan Černocký (2021), Effective Phase Encoding for End-To-End Speaker Verification, Interspeech
Ha Nguyen, Yannick Estève, Laurent Besacier (2021), Impact of Encoding and Segmentation Strategies on End-to-End Simultaneous Speech Translation, Interspeech
Dominik Macháček, Matúš Žilinec, Ondřej Bojar (2021), Lost in Interpreting: Speech Translation from Source or Interpreter?, Interspeech
Baptiste Pouthier, Laurent Pilati, Leela K. Gudupudi, Charles Bouveyron, Frederic Precioso (2021), Active Speaker Detection as a Multi-Objective Optimization with Uncertainty-Based Multimodal Fusion, Interspeech
Sarenne Wallbridge, Peter Bell, Catherine Lai (2021), It’s Not What You Said, it’s How You Said it: Discriminative Perception of Speech as a Multichannel Communication System, Interspeech
Thilo Michael, Gabriel Mittag, Andreas Bütow, Sebastian Möller (2021), Extending the Fullband E-Model Towards Background Noise, Bursty Packet Loss, and Conversational Degradations, Interspeech
Christian Bergler, Manuel Schmitt, Andreas Maier, Helena Symonds, Paul Spong, Steven R. Ness, George Tzanetakis, Elmar Nöth (2021), ORCA-SLANG: An Automatic Multi-Stage Semi-Supervised Deep Learning Framework for Large-Scale Killer Whale Call Type Identification, Interspeech
Wim Boes, Hugo Van hamme (2021), Audiovisual Transfer Learning for Audio Tagging and Sound Event Detection, Interspeech
Natalia Nessler, Milos Cernak, Paolo Prandoni, Pablo Mainar (2021), Non-Intrusive Speech Quality Assessment with Transfer Learning and Subject-Specific Scaling, Interspeech
Andreea-Maria Oncescu, A. Sophia Koepke, João F. Henriques, Zeynep Akata, Samuel Albanie (2021), Audio Retrieval with Natural Language Queries, Interspeech
Manuel Giollo, Deniz Gunceler, Yulan Liu, Daniel Willett (2021), Bootstrap an End-to-End ASR System by Multilingual Training, Transfer Learning, Text-to-Text Mapping and Synthetic Audio, Interspeech
Ngoc-Quan Pham, Tuan-Nam Nguyen, Sebastian Stüker, Alex Waibel (2021), Efficient Weight Factorization for Multilingual Speech Recognition, Interspeech
Alexis Conneau, Alexei Baevski, Ronan Collobert, Abdelrahman Mohamed, Michael Auli (2021), Unsupervised Cross-Lingual Representation Learning for Speech Recognition, Interspeech
Tomoaki Hayakawa, Chee Siang Leow, Akio Kobayashi, Takehito Utsuro, Hiromitsu Nishizaki (2021), Language and Speaker-Independent Feature Transformation for End-to-End Multilingual Speech Recognition, Interspeech
Krishna D. N, Pinyi Wang, Bruno Bozza (2021), Using Large Self-Supervised Models for Low-Resource Speech Recognition, Interspeech
Mari Ganesh Kumar, Jom Kuriakose, Anand Thyagachandran, Arun Kumar A, Ashish Seth, Lodagala V.S.V. Durga Prasad, Saish Jaiswal, Anusha Prakash, Hema A. Murthy (2021), Dual Script E2E Framework for Multilingual and Code-Switching ASR, Interspeech
Anuj Diwan, Rakesh Vaideeswaran, Sanket Shah, Ankita Singh, Srinivasa Raghavan, Shreya Khare, Vinit Unni, Saurabh Vyas, Akash Rajpuria, Chiranjeevi Yarra, Ashish Mittal, Prasanta Kumar Ghosh, Preethi Jyothi, Kalika Bali, Vivek Seshadri, Sunayana Sitaram, Samarth Bharadwaj, Jai Nanavati, Raoul Nanavati, Karthik Sankaranarayanan (2021), MUCS 2021: Multilingual and Code-Switching ASR Challenges for Low Resource Indian Languages, Interspeech
Genta Indra Winata, Guangsen Wang, Caiming Xiong, Steven Hoi (2021), Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition, Interspeech
Hardik Sailor, Kiran Praveen T, Vikas Agrawal, Abhinav Jain, Abhishek Pandey (2021), SRI-B End-to-End System for Multilingual and Code-Switching ASR Challenges for Low Resource Indian Languages, Interspeech
Xinjian Li, Juncheng Li, Florian Metze, Alan W. Black (2021), Hierarchical Phone Recognition with Compositional Phonetics, Interspeech
Shammur Absar Chowdhury, Amir Hussein, Ahmed Abdelali, Ahmed Ali (2021), Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR, Interspeech
Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe (2021), Differentiable Allophone Graphs for Language-Universal Speech Recognition, Interspeech
Vincent P. Martin, Jean-Luc Rouas, Florian Boyer, Pierre Philip (2021), Automatic Speech Recognition Systems Errors for Objective Sleepiness Detection Through Voice, Interspeech
Jon Gillick, Wesley Deng, Kimiko Ryokai, David Bamman (2021), Robust Laughter Detection in Noisy Environments, Interspeech
Mizuki Nagano, Yusuke Ijima, Sadao Hiroya (2021), Impact of Emotional State on Estimation of Willingness to Buy from Advertising Speech, Interspeech
Huda Alsofyani, Alessandro Vinciarelli (2021), Stacked Recurrent Neural Networks for Speech-Based Inference of Attachment Condition in School Age Children, Interspeech
Nujud Aloshban, Anna Esposito, Alessandro Vinciarelli (2021), Language or Paralanguage, This is the Problem: Comparing Depressed and Non-Depressed Speakers Through the Analysis of Gated Multimodal Units, Interspeech
Aniruddha Tammewar, Alessandra Cervone, Giuseppe Riccardi (2021), Emotion Carrier Recognition from Personal Narratives, Interspeech
Scott Condron, Georgia Clarke, Anita Klementiev, Daniela Morse-Kopp, Jack Parry, Dimitri Palaz (2021), Non-Verbal Vocalisation and Laughter Detection Using Sequence-to-Sequence Models and Multi-Label Training, Interspeech
Cong Cai, Mingyue Niu, Bin Liu, Jianhua Tao, Xuefei Liu (2021), TDCA-Net: Time-Domain Channel Attention Network for Depression Detection, Interspeech
Catarina Botelho, Alberto Abad, Tanja Schultz, Isabel Trancoso (2021), Visual Speech for Obstructive Sleep Apnea Detection, Interspeech
Hector A. Cordourier Maruri, Sinem Aslan, Georg Stemmer, Nese Alyuz, Lama Nachman (2021), Analysis of Contextual Voice Changes in Remote Meetings, Interspeech
Nadee Seneviratne, Carol Espy-Wilson (2021), Speech Based Depression Severity Level Classification Using a Multi-Stage Dilated CNN-LSTM Model, Interspeech
Ho-Gyeong Kim, Min-Joong Lee, Hoshik Lee, Tae Gyoon Kang, Jihyun Lee, Eunho Yang, Sung Ju Hwang (2021), Multi-Domain Knowledge Distillation via Uncertainty-Matching for End-to-End ASR Models, Interspeech
Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow (2021), Learning a Neural Diff for Speech Models, Interspeech
Shucong Zhang, Erfan Loweimi, Peter Bell, Steve Renals (2021), Stochastic Attention Head Removal: A Simple and Effective Method for Improving Transformer Based ASR Models, Interspeech
Jiabin Xue, Tieran Zheng, Jiqing Han (2021), Model-Agnostic Fast Adaptive Multi-Objective Balancing Algorithm for Multilingual Automatic Speech Recognition Model Training, Interspeech
Heng-Jui Chang, Hung-yi Lee, Lin-shan Lee (2021), Towards Lifelong Learning of End-to-End ASR, Interspeech
Isabel Leal, Neeraj Gaur, Parisa Haghani, Brian Farris, Pedro J. Moreno, Manasa Prasad, Bhuvana Ramabhadran, Yun Zhu (2021), Self-Adaptive Distillation for Multilingual Speech Recognition: Leveraging Student Independence, Interspeech
Hainan Xu, Kartik Audhkhasi, Yinghui Huang, Jesse Emond, Bhuvana Ramabhadran (2021), Regularizing Word Segmentation by Creating Misspellings, Interspeech
Peidong Wang, Tara N. Sainath, Ron J. Weiss (2021), Multitask Training with Text Data for End-to-End Speech Recognition, Interspeech
Xianzhao Chen, Hao Ni, Yi He, Kang Wang, Zejun Ma, Zongxia Xie (2021), Emitting Word Timings with HMM-Free End-to-End System in Automatic Speech Recognition, Interspeech
Jasha Droppo, Oguz Elibol (2021), Scaling Laws for Acoustic Models, Interspeech
Jayadev Billa (2021), Leveraging Non-Target Language Resources to Improve ASR Performance in a Target Language, Interspeech
Andrea Fasoli, Chia-Yu Chen, Mauricio Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan (2021), 4-Bit Quantization of LSTM-Based Speech Recognition Models, Interspeech
Ryo Masumura, Daiki Okamura, Naoki Makishima, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Shota Orihashi (2021), Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation, Interspeech
Zhong Meng, Yu Wu, Naoyuki Kanda, Liang Lu, Xie Chen, Guoli Ye, Eric Sun, Jinyu Li, Yifan Gong (2021), Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition, Interspeech
Dongcheng Jiang, Chao Zhang, Philip C. Woodland (2021), Variable Frame Rate Acoustic Models Using Minimum Error Reinforcement Learning, Interspeech
Constantijn Kaland, Matthew Gordon (2021), How f0 and Phrase Position Affect Papuan Malay Word Identification, Interspeech
Anna Bothe Jespersen, Pavel Šturm, Míša Hejná (2021), On the Feasibility of the Danish Model of Intonational Transcription: Phonetic Evidence from Jutlandic Danish, Interspeech
Adrien Méli, Nicolas Ballier, Achille Falaise, Alice Henderson (2021), An Experiment in Paratone Detection in a Prosodically Annotated EAP Spoken Corpus, Interspeech
Branislav Gerazov, Michael Wagner (2021), ProsoBeast Prosody Annotation Tool, Interspeech
Trang Tran, Mari Ostendorf (2021), Assessing the Use of Prosody in Constituency Parsing of Imperfect Transcripts, Interspeech
Roger Cheng-yen Liu, Feng-fan Hsieh, Yueh-chin Chang (2021), Targeted and Targetless Neutral Tones in Taiwanese Southern Min, Interspeech
Mária Gósy, Kálmán Abari (2021), The Interaction of Word Complexity and Word Duration in an Agglutinative Language, Interspeech
Ho-hsien Pan, Shao-ren Lyu (2021), Taiwan Min Nan (Taiwanese) Checked Tones Sound Change, Interspeech
Moritz Jakob, Bettina Braun, Katharina Zahner-Ritter (2021), In-Group Advantage in the Perception of Emotions: Evidence from Three Varieties of German, Interspeech
Christer Gobl (2021), The LF Model in the Frequency Domain for Glottal Airflow Modelling Without Aliasing Distortion, Interspeech
Michael Wagner, Alvaro Iturralde Zurita, Sijia Zhang (2021), Parsing Speech for Grouping and Prominence, and the Typology of Rhythm, Interspeech
Benazir Mumtaz, Massimiliano Canzi, Miriam Butt (2021), Prosody of Case Markers in Urdu, Interspeech
Brynhildur Stefansdottir, Francesco Burroni, Sam Tilsen (2021), Articulatory Characteristics of Icelandic Voiced Fricative Lenition: Gradience, Categoricity, and Speaker/Gesture-Specific Effects, Interspeech
Khia A. Johnson (2021), Leveraging the Uniformity Framework to Examine Crosslinguistic Similarity for Long-Lag Stops in Spontaneous Cantonese-English Bilingual Speech, Interspeech
Aswin Sivaraman, Sunwoo Kim, Minje Kim (2021), Personalized Speech Enhancement Through Self-Supervised Data Augmentation and Purification, Interspeech
Mark R. Saddler, Andrew Francl, Jenelle Feather, Kaizhi Qian, Yang Zhang, Josh H. McDermott (2021), Speech Denoising with Auditory Models, Interspeech
Sefik Emre Eskimez, Xiaofei Wang, Min Tang, Hemin Yang, Zirun Zhu, Zhuo Chen, Huaming Wang, Takuya Yoshioka (2021), Human Listening and Live Captioning: Multi-Task Training for Speech Enhancement, Interspeech
Xinmeng Xu, Yang Wang, Dongxiang Xu, Yiyuan Peng, Cong Zhang, Jie Jia, Binbin Chen (2021), Multi-Stage Progressive Speech Enhancement Network, Interspeech
Oscar Chang, Dung N. Tran, Kazuhito Koishida (2021), Single-Channel Speech Enhancement Using Learnable Loss Mixup, Interspeech
Xiao-Qi Zhang, Jun Du, Li Chai, Chin-Hui Lee (2021), A Maximum Likelihood Approach to SNR-Progressive Learning Using Generalized Gaussian Distribution for LSTM-Based Speech Enhancement, Interspeech
Vikas Agrawal, Shashi Kumar, Shakti P. Rath (2021), Whisper Speech Enhancement Using Joint Variational Autoencoder for Improved Speech Recognition, Interspeech
Lukas Lee, Youna Ji, Minjae Lee, Min-Seok Choi (2021), DEMUCS-Mobile: On-Device Lightweight Speech Enhancement, Interspeech
Madhav Mahesh Kashyap, Anuj Tambwekar, Krishnamoorthy Manohara, S. Natarajan (2021), Speech Denoising Without Clean Training Data: A Noise2Noise Approach, Interspeech
Feng Dang, Pengyuan Zhang, Hangting Chen (2021), Improved Speech Enhancement Using a Complex-Domain GAN with Fused Time-Domain and Time-Frequency Domain Constraints, Interspeech
Xudong Zhang, Liang Zhao, Feng Gu (2021), Speech Enhancement with Topology-Enhanced Generative Adversarial Networks (GANs), Interspeech
Suliang Bu, Yunxin Zhao, Shaojun Wang, Mei Han (2021), Learning Speech Structure to Improve Time-Frequency Masks, Interspeech
Eesung Kim, Hyeji Seo (2021), SE-Conformer: Time-Domain Speech Enhancement Using Conformer, Interspeech
Thananchai Kongthaworn, Burin Naowarat, Ekapol Chuangsuwanich (2021), Spectral and Latent Speech Representation Distortion for TTS Evaluation, Interspeech
Cassia Valentini-Botinhao, Simon King (2021), Detection and Analysis of Attention Errors in Sequence-to-Sequence Text-to-Speech, Interspeech
Rohola Zandie, Mohammad H. Mahoor, Julia Madsen, Eshrat S. Emamian (2021), RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis, Interspeech
Yao Shi, Hui Bu, Xin Xu, Shaoji Zhang, Ming Li (2021), AISHELL-3: A Multi-Speaker Mandarin TTS Corpus, Interspeech
Nicholas Eng, C.T. Justine Hui, Yusuke Hioka, Catherine I. Watson (2021), Comparing Speech Enhancement Techniques for Voice Adaptation-Based Speech Synthesis, Interspeech
Chenye Cui, Yi Ren, Jinglin Liu, Feiyang Chen, Rongjie Huang, Ming Lei, Zhou Zhao (2021), EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model, Interspeech
Sai Sirisha Rallabandi, Abhinav Bharadwaj, Babak Naderi, Sebastian Möller (2021), Perception of Social Speaker Characteristics in Synthetic Speech, Interspeech
Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg, Yang Zhang (2021), Hi-Fi Multi-Speaker English TTS Dataset, Interspeech
Wei-Cheng Tseng, Chien-yu Huang, Wei-Tsung Kao, Yist Y. Lin, Hung-yi Lee (2021), Utilizing Self-Supervised Representations for MOS Prediction, Interspeech
Saida Mussakhojayeva, Aigerim Janaliyeva, Almas Mirzakhmetov, Yerbolat Khassanov, Huseyin Atakan Varol (2021), KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset, Interspeech
Jason Taylor, Korin Richmond (2021), Confidence Intervals for ASR-Based TTS Evaluation, Interspeech
Chandan K.A. Reddy, Harishchandra Dubey, Kazuhito Koishida, Arun Nair, Vishak Gopal, Ross Cutler, Sebastian Braun, Hannes Gamper, Robert Aichner, Sriram Srinivasan (2021), INTERSPEECH 2021 Deep Noise Suppression Challenge, Interspeech
Andong Li, Wenzhe Liu, Xiaoxue Luo, Guochen Yu, Chengshi Zheng, Xiaodong Li (2021), A Simultaneous Denoising and Dereverberation Framework with Target Decoupling, Interspeech
Ziyi Xu, Maximilian Strake, Tim Fingscheidt (2021), Deep Noise Suppression with Non-Intrusive PESQNet Supervision Enabling the Use of Real Training Data, Interspeech
Xiaohuai Le, Hongsheng Chen, Kai Chen, Jing Lu (2021), DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement, Interspeech
Shubo Lv, Yanxin Hu, Shimin Zhang, Lei Xie (2021), DCCRN+: Channel-Wise Subband DCCRN with SNR Estimation for Speech Enhancement, Interspeech
Kanghao Zhang, Shulin He, Hao Li, Xueliang Zhang (2021), DBNet: A Dual-Branch Network Architecture Processing on Spectrum and Waveform for Single-Channel Speech Enhancement, Interspeech
Xu Zhang, Xinlei Ren, Xiguang Zheng, Lianwu Chen, Chen Zhang, Liang Guo, Bing Yu (2021), Low-Delay Speech Enhancement Using Perceptually Motivated Target and Loss, Interspeech
Koen Oostermeijer, Qing Wang, Jun Du (2021), Lightweight Causal Transformer with Local Self-Attention for Real-Time Speech Enhancement, Interspeech
Nicolae-Cătălin Ristea, Radu Tudor Ionescu (2021), Self-Paced Ensemble Learning for Speech and Audio Classification, Interspeech
Atsushi Kojima (2021), Knowledge Distillation for Streaming Transformer–Transducer, Interspeech
Timo Lohrenz, Zhengyang Li, Tim Fingscheidt (2021), Multi-Encoder Learning and Stream Fusion for Transformer-Based End-to-End Automatic Speech Recognition, Interspeech
Salah Zaiem, Titouan Parcollet, Slim Essid (2021), Conditional Independence for Pretext Task Selection in Self-Supervised Speech Representation Learning, Interspeech
Mohammad Zeineldeen, Aleksandr Glushko, Wilfried Michel, Albert Zeyer, Ralf Schlüter, Hermann Ney (2021), Investigating Methods to Improve Language Model Integration for Attention-Based Encoder-Decoder ASR Models, Interspeech
Apoorv Vyas, Srikanth Madikeri, Hervé Bourlard (2021), Comparing CTC and LFMMI for Out-of-Domain Adaptation of wav2vec 2.0 Acoustic Model, Interspeech
Clément Le Moine, Nicolas Obin, Axel Roebel (2021), Speaker Attentive Speech Emotion Recognition, Interspeech
Seong-Gyun Leem, Daniel Fulford, Jukka-Pekka Onnela, David Gard, Carlos Busso (2021), Separation of Emotional and Reconstruction Embeddings on Ladder Network to Improve Speech Emotion Recognition Robustness in Noisy Conditions, Interspeech
Efthymios Georgiou, Georgios Paraskevopoulos, Alexandros Potamianos (2021), M3: MultiModal Masking Applied to Sentiment Analysis, Interspeech
Ondřej Klejch, Electra Wallington, Peter Bell (2021), The CSTR System for Multilingual and Code-Switching ASR Challenges for Low Resource Indian Languages, Interspeech
Wei Zhou, Mohammad Zeineldeen, Zuoyun Zheng, Ralf Schlüter, Hermann Ney (2021), Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition, Interspeech
Wei Zhou, Albert Zeyer, André Merboldt, Ralf Schlüter, Hermann Ney (2021), Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept, Interspeech
Abbas Khosravani, Philip N. Garner, Alexandros Lazaridis (2021), Modeling Dialectal Variation for Swiss German Automatic Speech Recognition, Interspeech
Ekaterina Egorova, Hari Krishna Vydana, Lukáš Burget, Jan Černocký (2021), Out-of-Vocabulary Words Detection with Attention and CTC Alignments in an End-to-End ASR System, Interspeech
Matthew Wiesner, Mousmita Sarma, Ashish Arora, Desh Raj, Dongji Gao, Ruizhe Huang, Supreet Preet, Moris Johnson, Zikra Iqbal, Nagendra Goel, Jan Trmal, Leibny Paola García Perera, Sanjeev Khudanpur (2021), Training Hybrid Models on Noisy Transliterated Transcripts for Code-Switched Speech Recognition, Interspeech
Wei Xue, Roeland van Hout, Fleur Boogmans, Mario Ganzeboom, Catia Cucchiarini, Helmer Strik (2021), Speech Intelligibility of Dysarthric Speech: Human Scores and Acoustic-Phonetic Features, Interspeech
Young-Kyung Kim, Rimita Lahiri, Md. Nasir, So Hyun Kim, Somer Bishop, Catherine Lord, Shrikanth S. Narayanan (2021), Analyzing Short Term Dynamic Speech Features for Understanding Behavioral Traits of Children with Autism Spectrum Disorder, Interspeech
Waldemar Jęśko (2021), Vocalization Recognition of People with Profound Intellectual and Multiple Disabilities (PIMD) Using Machine Learning Algorithms, Interspeech
Barbara Gili Fivela, Vincenzo Sallustio, Silvia Pede, Danilo Patrocinio (2021), Phonetic Complexity, Speech Accuracy and Intelligibility Assessment of Italian Dysarthric Speech, Interspeech
Si-Ioi Ng, Cymie Wing-Yee Ng, Jingyu Li, Tan Lee (2021), Detection of Consonant Errors in Disordered Speech Based on Consonant-Vowel Segment Embedding, Interspeech
Adam Hair, Guanlong Zhao, Beena Ahmed, Kirrie J. Ballard, Ricardo Gutierrez-Osuna (2021), Assessing Posterior-Based Mispronunciation Detection on Field-Collected Recordings from Child Speech Therapy Sessions, Interspeech
Bahman Mirheidari, Yilin Pan, Daniel Blackburn, Ronan O’Malley, Heidi Christensen (2021), Identifying Cognitive Impairment Using Sentence Representation Vectors, Interspeech
Zhengjun Yue, Jon Barker, Heidi Christensen, Cristina McKean, Elaine Ashton, Yvonne Wren, Swapnil Gadgil, Rebecca Bright (2021), Parental Spoken Scaffolding and Narrative Skills in Crowd-Sourced Storytelling Samples of Young Children, Interspeech
Tong Xia, Jing Han, Lorena Qendro, Ting Dang, Cecilia Mascolo (2021), Uncertainty-Aware COVID-19 Detection from Imbalanced Sound Data, Interspeech
Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng (2021), Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization, Interspeech
Tanuka Bhattacharjee, Jhansi Mallela, Yamini Belur, Nalini Atchayaram, Ravi Yadav, Pradeep Reddy, Dipanjan Gope, Prasanta Kumar Ghosh (2021), Source and Vocal Tract Cues for Speech-Based Classification of Patients with Parkinson’s Disease and Healthy Subjects, Interspeech
R’mani Haulcy, James Glass (2021), CLAC: A Speech Corpus of Healthy English Speakers, Interspeech
Leanne Nortje, Herman Kamper (2021), Direct Multimodal Few-Shot Learning of Speech and Images, Interspeech
Ramon Sanabria, Austin Waters, Jason Baldridge (2021), Talk, Don’t Write: A Study of Direct Speech-Based Image Retrieval, Interspeech
Huan Zhao, Kaili Ma (2021), A Fast Discrete Two-Step Learning Hashing for Scalable Cross-Modal Retrieval, Interspeech
Jianrong Wang, Ziyue Tang, Xuewei Li, Mei Yu, Qiang Fang, Li Liu (2021), Cross-Modal Knowledge Distillation Method for Automatic Cued Speech Recognition, Interspeech
Kayode Olaleye, Herman Kamper (2021), Attention-Based Keyword Localisation in Speech Using Visual Grounding, Interspeech
Khazar Khorrami, Okko Räsänen (2021), Evaluation of Audio-Visual Alignments in Visually Grounded Speech Models, Interspeech
Hang Chen, Jun Du, Yu Hu, Li-Rong Dai, Bao-Cai Yin, Chin-Hui Lee (2021), Automatic Lip-Reading with Hierarchical Pyramidal Convolution and Self-Attention for Image Sequences with No Word Boundaries, Interspeech
Andrew Rouditchenko, Angie Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogerio Feris, Brian Kingsbury, Michael Picheny, James Glass (2021), Cascaded Multilingual Audio-Visual Learning from Videos, Interspeech
Pingchuan Ma, Rodrigo Mira, Stavros Petridis, Björn W. Schuller, Maja Pantic (2021), LiRA: Learning Visual Speech Representations from Audio Through Self-Supervision, Interspeech
Richard Rose, Olivier Siohan, Anshuman Tripathi, Otavio Braga (2021), End-to-End Audio-Visual Speech Recognition for Overlapping Speech, Interspeech
Yifei Wu, Chenda Li, Song Yang, Zhongqin Wu, Yanmin Qian (2021), Audio-Visual Multi-Talker Speech Recognition in a Cocktail Party, Interspeech
Sanyuan Chen, Yu Wu, Zhuo Chen, Jian Wu, Takuya Yoshioka, Shujie Liu, Jinyu Li, Xiangzhan Yu (2021), Ultra Fast Speech Separation Model with Teacher Student Learning, Interspeech
Murtiza Ali, Ashwani Koul, Karan Nathwani (2021), Group Delay Based Re-Weighted Sparse Recovery Algorithms for Robust and High-Resolution Source Separation in DOA Framework, Interspeech
Cong Han, Yi Luo, Chenda Li, Tianyan Zhou, Keisuke Kinoshita, Shinji Watanabe, Marc Delcroix, Hakan Erdogan, John R. Hershey, Nima Mesgarani, Zhuo Chen (2021), Continuous Speech Separation Using Speaker Inventory for Long Recording, Interspeech
Weitao Yuan, Shengbei Wang, Xiangrui Li, Masashi Unoki, Wenwu Wang (2021), Crossfire Conditional Generative Adversarial Networks for Singing Voice Extraction, Interspeech
Kai Wang, Hao Huang, Ying Hu, Zhihua Huang, Sheng Li (2021), End-to-End Speech Separation Using Orthogonal Representation in Complex and Real Time-Frequency Domain, Interspeech
Yu Nakagome, Masahito Togami, Tetsuji Ogawa, Tetsunori Kobayashi (2021), Efficient and Stable Adversarial Learning Using Unpaired Data for Unsupervised Multichannel Speech Separation, Interspeech
Sung-Feng Huang, Shun-Po Chuang, Da-Rong Liu, Yi-Chen Chen, Gene-Ping Yang, Hung-yi Lee (2021), Stabilizing Label Assignment for Speech Separation by Self-Supervised Pre-Training, Interspeech
Fan-Lin Wang, Yu-Huai Peng, Hung-Shin Lee, Hsin-Min Wang (2021), Dual-Path Filter Network: Speaker-Aware Modeling for Speech Separation, Interspeech
Jian Wu, Zhuo Chen, Sanyuan Chen, Yu Wu, Takuya Yoshioka, Naoyuki Kanda, Shujie Liu, Jinyu Li (2021), Investigation of Practical Aspects of Single Channel Speech Separation for ASR, Interspeech
Yi Luo, Nima Mesgarani (2021), Implicit Filter-and-Sum Network for End-to-End Multi-Channel Speech Separation, Interspeech
Yong Xu, Zhuohuang Zhang, Meng Yu, Shi-Xiong Zhang, Dong Yu (2021), Generalized Spatio-Temporal RNN Beamformer for Target Speech Separation, Interspeech
Yi Chieh Liu, Eunjung Han, Chul Lee, Andreas Stolcke (2021), End-to-End Neural Diarization: From Transformer to Conformer, Interspeech
Jee-weon Jung, Hee-Soo Heo, Youngki Kwon, Joon Son Chung, Bong-Jin Lee (2021), Three-Class Overlapped Speech Detection Using a Convolutional Recurrent Neural Network, Interspeech
Xucheng Wan, Kai Liu, Huan Zhou (2021), Online Speaker Diarization Equipped with Discriminative Modeling and Guided Inference, Interspeech
Yuki Takashima, Yusuke Fujita, Shota Horiguchi, Shinji Watanabe, Leibny Paola García Perera, Kenji Nagamatsu (2021), Semi-Supervised Training with Pseudo-Labeling for End-To-End Neural Diarization, Interspeech
Youngki Kwon, Jee-weon Jung, Hee-Soo Heo, You Jin Kim, Bong-Jin Lee, Joon Son Chung (2021), Adapting Speaker Embeddings for Speaker Diarisation, Interspeech
Yu-Xuan Wang, Jun Du, Maokui He, Shu-Tong Niu, Lei Sun, Chin-Hui Lee (2021), Scenario-Dependent Speaker Diarization for DIHARD-III Challenge, Interspeech
Hervé Bredin, Antoine Laurent (2021), End-To-End Speaker Segmentation for Overlap-Aware Resegmentation, Interspeech
Yawen Xue, Shota Horiguchi, Yusuke Fujita, Yuki Takashima, Shinji Watanabe, Leibny Paola García Perera, Kenji Nagamatsu (2021), Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers, Interspeech
Or Haim Anidjar, Itshak Lapidot, Chen Hajaj, Amit Dvir (2021), A Thousand Words are Worth More Than One Recording: Word-Embedding Based Speaker Change Detection, Interspeech
Kosuke Futamata, Byeongseon Park, Ryuichi Yamamoto, Kentaro Tachibana (2021), Phrase Break Prediction with Bidirectional Encoder Representations in Japanese Text-to-Speech Synthesis, Interspeech
Iván Vallés-Pérez, Julian Roth, Grzegorz Beringer, Roberto Barra-Chicote, Jasha Droppo (2021), Improving Multi-Speaker TTS Prosody Variance with a Residual Encoder and Normalizing Flows, Interspeech
Chenpeng Du, Kai Yu (2021), Rich Prosody Diversity Modelling with Phone-Level Mixture Density Network, Interspeech
Kenichi Fujita, Atsushi Ando, Yusuke Ijima (2021), Phoneme Duration Modeling Using Speech Rhythm-Based Speaker Embeddings for Multi-Speaker Speech Synthesis, Interspeech
Yuxiang Zou, Shichao Liu, Xiang Yin, Haopeng Lin, Chunfeng Wang, Haoyu Zhang, Zejun Ma (2021), Fine-Grained Prosody Modeling in Neural Speech Synthesis Using ToBI Representation, Interspeech
Mayank Sharma, Yogesh Virkar, Marcello Federico, Roberto Barra-Chicote, Robert Enyedi (2021), Intra-Sentential Speaking Rate Control in Neural Text-To-Speech for Automatic Dubbing, Interspeech
Guangyan Zhang, Ying Qin, Daxin Tan, Tan Lee (2021), Applying the Information Bottleneck Principle to Prosodic Representation Learning, Interspeech
Alice Baird, Silvan Mertes, Manuel Milling, Lukas Stappen, Thomas Wiest, Elisabeth André, Björn W. Schuller (2021), A Prototypical Network Approach for Evaluating Generated Emotional Speech, Interspeech
Tsukasa Yoshinaga, Kohei Tada, Kazunori Nozaki, Akiyoshi Iida (2021), A Simplified Model for the Vocal Tract of [s] with Inclined Incisors, Interspeech
Takayuki Arai (2021), Vocal-Tract Models to Visualize the Airstream of Human Breath and Droplets While Producing Speech, Interspeech
Ryo Tanji, Hidefumi Ohmura, Kouichi Katsurada (2021), Using Transposed Convolution for Articulatory-to-Acoustic Conversion from Real-Time MRI Data, Interspeech
Rafia Inaam, Tsukasa Yoshinaga, Takayuki Arai, Hiroshi Yokoyama, Akiyoshi Iida (2021), Comparison Between Lumped-Mass Modeling and Flow Simulation of the Reed-Type Artificial Vocal Fold, Interspeech
Raphael Werner, Susanne Fuchs, Jürgen Trouvain, Bernd Möbius (2021), Inhalations in Speech: Acoustic and Physiological Characteristics, Interspeech
Anqi Xu, Daniel van Niekerk, Branislav Gerazov, Paul Konstantin Krug, Santitham Prom-on, Peter Birkholz, Yi Xu (2021), Model-Based Exploration of Linking Between Vowel Articulatory Space and Acoustic Space, Interspeech
Mikey Elmers, Raphael Werner, Beeke Muhlack, Bernd Möbius, Jürgen Trouvain (2021), Take a Breath: Respiratory Sounds Improve Recollection in Synthetic Speech, Interspeech
Taijing Chen, Adam Lammert, Benjamin Parrell (2021), Modeling Sensorimotor Adaptation in Speech Through Alterations to Forward and Inverse Models, Interspeech
Hideki Kawahara, Toshie Matsui, Kohei Yatabe, Ken-Ichi Sakakibara, Minoru Tsuzaki, Masanori Morise, Toshio Irino (2021), Mixture of Orthogonal Sequences Made from Extended Time-Stretched Pulses Enables Measurement of Involuntary Voice Fundamental Frequency Response to Pitch Perturbation, Interspeech
Chenyu You, Nuo Chen, Yuexian Zou (2021), Contextualized Attention-Based Knowledge Transfer for Spoken Conversational Question Answering, Interspeech
Wenying Duan, Xiaoxi He, Zimu Zhou, Hong Rao, Lothar Thiele (2021), Injecting Descriptive Meta-Information into Pre-Trained Language Models with Hypernetworks, Interspeech
Mahdin Rohmatillah, Jen-Tzung Chien (2021), Causal Confusion Reduction for Robust Multi-Domain Dialogue Policy, Interspeech
Shinya Fujie, Hayato Katayama, Jin Sakuma, Tetsunori Kobayashi (2021), Timing Generating Networks: Neural Network Based Precise Turn-Taking Timing Prediction in Multiparty Conversation, Interspeech
Kehan Chen, Zezhong Li, Suyang Dai, Wei Zhou, Haiqing Chen (2021), Human-to-Human Conversation Dataset for Learning Fine-Grained Turn-Taking Action, Interspeech
Mukuntha Narayanan Sundararaman, Ayush Kumar, Jithendra Vepa (2021), PhonemeBERT: Joint Language Modelling of Phoneme Sequence and ASR Transcript, Interspeech
Hongyin Luo, James Glass, Garima Lalwani, Yi Zhang, Shang-Wen Li (2021), Joint Retrieval-Extraction Training for Evidence-Aware Dialog Response Selection, Interspeech
Ashish Shenoy, Sravan Bodapati, Monica Sunkara, Srikanth Ronanki, Katrin Kirchhoff (2021), Adapting Long Context NLM for ASR Rescoring in Conversational Agents, Interspeech
Jing Li, Binling Wang, Yiming Zhi, Zheng Li, Lin Li, Qingyang Hong, Dong Wang (2021), Oriental Language Recognition (OLR) 2020: Summary and Analysis, Interspeech
Raphaël Duroselle, Md. Sahidullah, Denis Jouvet, Irina Illina (2021), Language Recognition on Unknown Conditions: The LORIA-Inria-MULTISPEECH System for AP20-OLR Challenge, Interspeech
Tianlong Kong, Shouyi Yin, Dawei Zhang, Wang Geng, Xin Wang, Dandan Song, Jinwen Huang, Huiyu Shi, Xiaorui Wang (2021), Dynamic Multi-Scale Convolution for Dialect Identification, Interspeech
Ding Wang, Shuaishuai Ye, Xinhui Hu, Sheng Li, Xinkang Xu (2021), An End-to-End Dialect Identification System with Transfer Learning from a Multilingual Automatic Speech Recognition Model, Interspeech
Haibin Yu, Jing Zhao, Song Yang, Zhongqin Wu, Yuting Nie, Wei-Qiang Zhang (2021), Language Recognition Based on Unsupervised Pretrained Models, Interspeech
Zheng Li, Yan Liu, Lin Li, Qingyang Hong (2021), Additive Phoneme-Aware Margin Softmax Loss for Language Recognition, Interspeech
Nataly Jahchan, Florentin Barbier, Ariyanidevi Dharma Gita, Khaled Khelif, Estelle Delpech (2021), Towards an Accent-Robust Approach for ATC Communications Transcription, Interspeech
Igor Szöke, Santosh Kesiraju, Ondřej Novotný, Martin Kocour, Karel Veselý, Jan Černocký (2021), Detecting English Speech in the Air Traffic Control Voice Communication, Interspeech
Oliver Ohneiser, Seyyed Saeed Sarfjoo, Hartmut Helmke, Shruthi Shetty, Petr Motlicek, Matthias Kleinert, Heiko Ehr, Šarūnas Murauskas (2021), Robust Command Recognition for Lithuanian Air Traffic Control Tower Utterances, Interspeech
Juan Zuluaga-Gomez, Iuliia Nigmatulina, Amrutha Prasad, Petr Motlicek, Karel Veselý, Martin Kocour, Igor Szöke (2021), Contextual Semi-Supervised Learning: An Approach to Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems, Interspeech
Martin Kocour, Karel Veselý, Alexander Blatt, Juan Zuluaga-Gomez, Igor Szöke, Jan Černocký, Dietrich Klakow, Petr Motlicek (2021), Boosting of Contextual Information in ASR for Air-Traffic Call-Sign Recognition, Interspeech
Benjamin Elie, Jodie Gauvain, Jean-Luc Gauvain, Lori Lamel (2021), Modeling the Effect of Military Oxygen Masks on Speech Characteristics, Interspeech
Benjamin Milde, Tim Fischer, Steffen Remus, Chris Biemann (2021), MoM: Minutes of Meeting Bot, Interspeech
Alexander Wilbrandt, Simon Stone, Peter Birkholz (2021), Articulatory Data Recorder: A Framework for Real-Time Articulatory Data Recording, Interspeech
Joan Codina-Filbà, Guillermo Cámbara, Alex Peiró-Lilja, Jens Grivolla, Roberto Carlini, Mireia Farrús (2021), The INGENIOUS Multilingual Operations App, Interspeech
Joanna Rownicka, Kilian Sprenkamp, Antonio Tripiana, Volodymyr Gromoglasov, Timo P. Kunz (2021), Digital Einstein Experience: Fast Text-to-Speech for Conversational AI, Interspeech
Robert Geislinger, Benjamin Milde, Timo Baumann, Chris Biemann (2021), Live Subtitling for BigBlueButton with Open-Source Software, Interspeech
Dāvis Nicmanis, Askars Salimbajevs (2021), Expressive Latvian Speech Synthesis for Dialog Systems, Interspeech
Pramod H. Kachare, Prem C. Pandey, Vishal Mane, Hirak Dasgupta, K.S. Nataraj, Akshada Rathod, Sheetal K. Pathak (2021), ViSTAFAE: A Visual Speech-Training Aid with Feedback of Articulatory Efforts, Interspeech
Karen Livescu (2021), Learning Speech Models from Multi-Modal Data, Interspeech
Mounya Elhilali (2021), Adaptive Listening to Everyday Soundscapes, Interspeech
Vinicius Ribeiro, Karyna Isaieva, Justine Leclere, Pierre-André Vuissoz, Yves Laprie (2021), Towards the Prediction of the Vocal Tract Shape from the Sequence of Phonemes to be Articulated, Interspeech
Rémi Blandin, Marc Arnela, Simon Félix, Jean-Baptiste Doc, Peter Birkholz (2021), Comparison of the Finite Element Method, the Multimodal Method and the Transmission-Line Model for the Computation of Vocal Tract Transfer Functions, Interspeech
Petra Wagner, Sina Zarrieß, Joana Cholin (2021), Effects of Time Pressure and Spontaneity on Phonotactic Innovations in German Dialogues, Interspeech
Salvador Medina, Sarah Taylor, Mark Tiede, Alexander Hauptmann, Iain Matthews (2021), Importance of Parasagittal Sensor Information in Tongue Motion Capture Through a Diphonic Analysis, Interspeech
Marc-Antoine Georges, Laurent Girin, Jean-Luc Schwartz, Thomas Hueber (2021), Learning Robust Speech Representation with an Articulatory-Regularized Variational Autoencoder, Interspeech
Heather Weston, Laura L. Koenig, Susanne Fuchs (2021), Changes in Glottal Source Parameter Values with Light to Moderate Physical Load, Interspeech
Mohammad Hassan Vali, Tom Bäckström (2021), End-to-End Optimized Multi-Stage Vector Quantization of Spectral Envelopes for Speech and Audio Coding, Interspeech
Santhan Kumar Reddy Nareddula, Subrahmanyam Gorthi, Rama Krishna Sai S. Gorthi (2021), Fusion-Net: Time-Frequency Information Fusion Y-Network for Speech Enhancement, Interspeech
Ľuboš Marcinek, Michael Stone, Rebecca Millman, Patrick Gaydecki (2021), N-MTTL SI Model: Non-Intrusive Multi-Task Transfer Learning-Based Speech Intelligibility Prediction Model with Scenery Classification, Interspeech
Yangyang Xia, Li-Wei Chen, Alexander Rudnicky, Richard M. Stern (2021), Temporal Context in Speech Emotion Recognition, Interspeech
Hang Li, Wenbiao Ding, Zhongqin Wu, Zitao Liu (2021), Learning Fine-Grained Cross Modality Excitement for Speech Emotion Recognition, Interspeech
Einari Vaaras, Sari Ahlqvist-Björkroth, Konstantinos Drossos, Okko Räsänen (2021), Automatic Analysis of the Emotional Content of Speech in Daylong Child-Centered Recordings from a Neonatal Intensive Care Unit, Interspeech
Fan Qian, Jiqing Han (2021), Multimodal Sentiment Analysis with Temporal Modality Attention, Interspeech
Mani Kumar T, Enrique Sanchez, Georgios Tzimiropoulos, Timo Giesbrecht, Michel Valstar (2021), Stochastic Process Regression for Cross-Cultural Speech Emotion Recognition, Interspeech
Haoqi Li, Yelin Kim, Cheng-Hao Kuo, Shrikanth S. Narayanan (2021), Acted vs. Improvised: Domain Adaptation for Elicitation Approaches in Audio-Visual Emotion Recognition, Interspeech
Leonardo Pepino, Pablo Riera, Luciana Ferrer (2021), Emotion Recognition from Speech Using wav2vec 2.0 Embeddings, Interspeech
Jiawang Liu, Haoxiang Wang (2021), Graph Isomorphism Network for Speech Emotion Recognition, Interspeech
Pooja Kumawat, Aurobinda Routray (2021), Applying TDNN Architectures for Analyzing Duration Dependencies on Speech Emotion Recognition, Interspeech
Aaron Keesing, Yun Sing Koh, Michael Witbrock (2021), Acoustic Features and Neural Representations for Categorical Emotion Recognition from Speech, Interspeech
Suwon Shon, Pablo Brusco, Jing Pan, Kyu J. Han, Shinji Watanabe (2021), Leveraging Pre-Trained Language Model for Speech Sentiment Analysis, Interspeech
Wenxin Hou, Jindong Wang, Xu Tan, Tao Qin, Takahiro Shinozaki (2021), Cross-Domain Speech Recognition with Unsupervised Character-Level Distribution Matching, Interspeech
Naoyuki Kanda, Guoli Ye, Yu Wu, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka (2021), Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone, Interspeech
Liang Lu, Zhong Meng, Naoyuki Kanda, Jinyu Li, Yifan Gong (2021), On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer, Interspeech
Jaeyoung Kim, Han Lu, Anshuman Tripathi, Qian Zhang, Hasim Sak (2021), Reducing Streaming ASR Model Delay with Self Alignment, Interspeech
Anuj Diwan, Preethi Jyothi (2021), Reduce and Reconstruct: ASR for Low-Resource Phonetic Languages, Interspeech
Takashi Fukuda, Samuel Thomas (2021), Knowledge Distillation Based Training of Universal ASR Source Models for Cross-Lingual Transfer, Interspeech
Swayambhu Nath Ray, Minhua Wu, Anirudh Raju, Pegah Ghahremani, Raghavendra Bilgi, Milind Rao, Harish Arsikere, Ariya Rastrow, Andreas Stolcke, Jasha Droppo (2021), Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End, Interspeech
Zhiyun Lu, Wei Han, Yu Zhang, Liangliang Cao (2021), Exploring Targeted Universal Adversarial Perturbations to End-to-End ASR Models, Interspeech
Miguel Del Rio, Natalie Delworth, Ryan Westerman, Michelle Huang, Nishchal Bhandari, Joseph Palakapilly, Quinten McNamara, Joshua Dong, Piotr Żelasko, Miguel Jetté (2021), Earnings-21: A Practical Benchmark for ASR in the Wild, Interspeech
Eric Sun, Jinyu Li, Zhong Meng, Yu Wu, Jian Xue, Shujie Liu, Yifan Gong (2021), Improving Multilingual Transformer Transducer Models by Reducing Language Confusions, Interspeech
Ahmed Ali, Shammur Absar Chowdhury, Amir Hussein, Yasser Hifny (2021), Arabic Code-Switching Speech Recognition Using Monolingual Data, Interspeech
Aviad Eisenberg, Boaz Schwartz, Sharon Gannot (2021), Online Blind Audio Source Separation Using Recursive Expectation-Maximization, Interspeech
Yi Luo, Cong Han, Nima Mesgarani (2021), Empirical Analysis of Generalized Iterative Speech Separation Networks, Interspeech
Thilo von Neumann, Keisuke Kinoshita, Christoph Boeddeker, Marc Delcroix, Reinhold Haeb-Umbach (2021), Graph-PIT: Generalized Permutation Invariant Training for Continuous Separation of Arbitrary Numbers of Speakers, Interspeech
Jisi Zhang, Cătălin Zorilă, Rama Doddipatla, Jon Barker (2021), Teacher-Student MixIT for Unsupervised and Semi-Supervised Speech Separation, Interspeech
Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, Keisuke Kinoshita, Shoko Araki (2021), Few-Shot Learning of New Sound Classes for Target Sound Extraction, Interspeech
Cong Han, Yi Luo, Nima Mesgarani (2021), Binaural Speech Separation of Moving Speakers With Preserved Spatial Cues, Interspeech
Shell Xu Hu, Md. Rifat Arefin, Viet-Nhat Nguyen, Alish Dipani, Xaq Pitkow, Andreas Savas Tolias (2021), AvaTr: One-Shot Speaker Extraction with Transformers, Interspeech
Saurjya Sarkar, Emmanouil Benetos, Mark Sandler (2021), Vocal Harmony Separation Using Time-Domain Neural Networks, Interspeech
Matthew Maciejewski, Shinji Watanabe, Sanjeev Khudanpur (2021), Speaker Verification-Based Evaluation of Single-Channel Speech Separation, Interspeech
Tian Lan, Yuxin Qian, Yilan Lyu, Refuoe Mokhosi, Wenxin Tai, Qiao Liu (2021), Improved Speech Separation with Time-and-Frequency Cross-Domain Feature Selection, Interspeech
Chengyun Deng, Shiqian Ma, Yongtao Sha, Yi Zhang, Hui Zhang, Hui Song, Fei Wang (2021), Robust Speaker Extraction Network Based on Iterative Refined Adaptation, Interspeech
Wupeng Wang, Chenglin Xu, Meng Ge, Haizhou Li (2021), Neural Speaker Extraction with Speaker-Speech Cross-Attention Network, Interspeech
Rémi Rigal, Jacques Chodorowski, Benoît Zerr (2021), Deep Audio-Visual Speech Separation Based on Facial Motion, Interspeech
Prachi Singh, Rajat Varma, Venkat Krishnamohan, Srikanth Raj Chetupalli, Sriram Ganapathy (2021), LEAP Submission for the Third DIHARD Diarization Challenge, Interspeech
Shiliang Zhang, Siqi Zheng, Weilong Huang, Ming Lei, Hongbin Suo, Jinwei Feng, Zhijie Yan (2021), Investigation of Spatial-Acoustic Features for Overlapping Speech Detection in Multiparty Meetings, Interspeech
Maokui He, Desh Raj, Zili Huang, Jun Du, Zhuo Chen, Shinji Watanabe (2021), Target-Speaker Voice Activity Detection with Improved i-Vector Estimation for Unknown Number of Speaker, Interspeech
Nauman Dawalatabad, Mirco Ravanelli, François Grondin, Jenthe Thienpondt, Brecht Desplanques, Hwidong Na (2021), ECAPA-TDNN Embeddings for Speaker Diarization, Interspeech
Keisuke Kinoshita, Marc Delcroix, Naohiro Tawara (2021), Advances in Integration of End-to-End Neural and Clustering-Based Diarization for Real Conversational Speech, Interspeech
Neville Ryant, Prachi Singh, Venkat Krishnamohan, Rajat Varma, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Liberman (2021), The Third DIHARD Diarization Challenge, Interspeech
Tsun-Yat Leung, Lahiru Samarakoon (2021), Robust End-to-End Speaker Diarization with Conformer and Additive Margin Penalty, Interspeech
Benjamin O’Brien, Natalia Tomashenko, Anaïs Chanclu, Jean-François Bonastre (2021), Anonymous Speaker Clusters: Making Distinctions Between Anonymised Speech Recordings with Clustering Interface, Interspeech
Kiran Karra, Alan McCree (2021), Speaker Diarization Using Two-Pass Leave-One-Out Gaussian PLDA Clustering of DNN Embeddings, Interspeech
Zhenhou Hong, Jianzong Wang, Xiaoyang Qu, Jie Liu, Chendong Zhao, Jing Xiao (2021), Federated Learning with Dynamic Transformer for Text to Speech, Interspeech
Huu-Kim Nguyen, Kihyuk Jeong, Seyun Um, Min-Jae Hwang, Eunwoo Song, Hong-Goo Kang (2021), LiteTTS: A Lightweight Mel-Spectrogram-Free Text-to-Wave Synthesizer Based on Generative Adversarial Networks, Interspeech
Chuanxin Tang, Chong Luo, Zhiyuan Zhao, Dacheng Yin, Yucheng Zhao, Wenjun Zeng (2021), Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration, Interspeech
Myeonghun Jeong, Hyeongju Kim, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim (2021), Diff-TTS: A Denoising Diffusion Model for Text-to-Speech, Interspeech
Jae-Sung Bae, Taejun Bak, Young-Sun Joo, Hoon-Young Cho (2021), Hierarchical Context-Aware Transformers for Non-Autoregressive Text to Speech, Interspeech
Adam Polyak, Yossi Adi, Jade Copet, Eugene Kharitonov, Kushal Lakhotia, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux (2021), Speech Resynthesis from Discrete Disentangled Self-Supervised Representations, Interspeech
Penny Karanasou, Sri Karlapati, Alexis Moinet, Arnaud Joly, Ammar Abbas, Simon Slangen, Jaime Lorenzo-Trueba, Thomas Drugman (2021), A Learned Conditional Prior for the VAE Acoustic Space of a TTS System, Interspeech
Dipjyoti Paul, Sankar Mukherjee, Yannis Pantazis, Yannis Stylianou (2021), A Universal Multi-Speaker Multi-Style Text-to-Speech via Disentangled Representation Learning Based on Rényi Divergence Minimization, Interspeech
Yi-Chiao Wu, Cheng-Hung Hu, Hung-Shin Lee, Yu-Huai Peng, Wen-Chin Huang, Yu Tsao, Hsin-Min Wang, Tomoki Toda (2021), Relational Data Selection for Data Augmentation of Speaker-Dependent Multi-Band MelGAN Vocoder, Interspeech
Hyunseung Chung, Sang-Hoon Lee, Seong-Whan Lee (2021), Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech, Interspeech
Shilun Lin, Fenglong Xie, Li Meng, Xinhui Li, Li Lu (2021), Triple M: A Practical Text-to-Speech Synthesis System with Multi-Guidance Attention and Multi-Band Multi-Time LPCNet, Interspeech
Edresson Casanova, Christopher Shulby, Eren Gölge, Nicolas Michael Müller, Frederico Santos de Oliveira, Arnaldo Candido Jr., Anderson da Silva Soares, Sandra Maria Aluisio, Moacir Antonelli Ponti (2021), SC-GlowTTS: An Efficient Zero-Shot Multi-Speaker Text-To-Speech Model, Interspeech
Ian Palmer, Andrew Rouditchenko, Andrei Barbu, Boris Katz, James Glass (2021), Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset, Interspeech
Elizabeth Salesky, Matthew Wiesner, Jacob Bremerman, Roldano Cattoni, Matteo Negri, Marco Turchi, Douglas W. Oard, Matt Post (2021), The Multilingual TEDx Corpus for Speech Recognition and Translation, Interspeech
David R. Mortensen, Jordan Picone, Xinjian Li, Kathleen Siminyu (2021), Tusom2021: A Phonetically Transcribed Speech Dataset from an Endangered Language for Universal Phone Recognition Experiments, Interspeech
Yihui Fu, Luyao Cheng, Shubo Lv, Yukai Jv, Yuxiang Kong, Zhuo Chen, Yanxin Hu, Lei Xie, Jian Wu, Hui Bu, Xin Xu, Jun Du, Jingdong Chen (2021), AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario, Interspeech
Guoguo Chen, Shuzhou Chai, Guan-Bo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Zhao You, Zhiyong Yan (2021), GigaSpeech: An Evolving, Multi-Domain ASR Corpus with 10,000 Hours of Transcribed Audio, Interspeech
You Jin Kim, Hee-Soo Heo, Soyeon Choe, Soo-Whan Chung, Yoohwan Kwon, Bong-Jin Lee, Youngki Kwon, Joon Son Chung (2021), Look Who’s Talking: Active Speaker Detection in the Wild, Interspeech
Beena Ahmed, Kirrie J. Ballard, Denis Burnham, Tharmakulasingam Sirojan, Hadi Mehmood, Dominique Estival, Elise Baker, Felicity Cox, Joanne Arciuli, Titia Benders, Katherine Demuth, Barbara Kelly, Chloé Diskin-Holdaway, Mostafa Shahin, Vidhyasaharan Sethu, Julien Epps, Chwee Beng Lee, Eliathamby Ambikairajah (2021), AusKidTalk: An Auditory-Visual Corpus of 3- to 12-Year-Old Australian Children’s Speech, Interspeech
Per Fallgren, Jens Edlund (2021), Human-in-the-Loop Efficiency Analysis for Binary Classification in Edyson, Interspeech
Elena Ryumina, Oxana Verkholyak, Alexey Karpov (2021), Annotation Confidence vs. Training Sample Size: Trade-Off Solution for Partially-Continuous Categorical Emotion Recognition, Interspeech
Gonçal V. Garcés Díaz-Munío, Joan-Albert Silvestre-Cerdà, Javier Jorge, Adrià Giménez Pastor, Javier Iranzo-Sánchez, Pau Baquero-Arnal, Nahuel Roselló, Alejandro Pérez-González-de-Martos, Jorge Civera, Albert Sanchis, Alfons Juan (2021), Europarl-ASR: A Large Corpus of Parliamentary Debates for Streaming ASR Benchmarking and Speech Data Filtering/Verbatimization, Interspeech
Parul Kapoor, Rudrabha Mukhopadhyay, Sindhu B. Hegde, Vinay Namboodiri, C.V. Jawahar (2021), Towards Automatic Speech to Sign Language Generation, Interspeech
Won Ik Cho, Seok Min Kim, Hyunchang Cho, Nam Soo Kim (2021), kosp2e: Korean Speech to English Translation Corpus, Interspeech
Junbo Zhang, Zhiwen Zhang, Yongqing Wang, Zhiyong Yan, Qiong Song, Yukai Huang, Ke Li, Daniel Povey, Yujun Wang (2021), speechocean762: An Open-Source Non-Native English Speech Corpus for Pronunciation Assessment, Interspeech
Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao, Abeer Alwan (2021), An Improved Single Step Non-Autoregressive Transformer for Automatic Speech Recognition, Interspeech
Pengcheng Guo, Xuankai Chang, Shinji Watanabe, Lei Xie (2021), Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain, Interspeech
Edwin G. Ng, Chung-Cheng Chiu, Yu Zhang, William Chan (2021), Pushing the Limits of Non-Autoregressive Speech Recognition, Interspeech
Alexander H. Liu, Yu-An Chung, James Glass (2021), Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies, Interspeech
Jumon Nozaki, Tatsuya Komatsu (2021), Relaxing the Conditional Independence Assumption of CTC-Based ASR by Conditioning on Intermediate Predictions, Interspeech
Yuya Fujita, Tianzi Wang, Shinji Watanabe, Motoi Omachi (2021), Toward Streaming ASR with Non-Autoregressive Insertion-Based Model, Interspeech
Jaesong Lee, Jingu Kang, Shinji Watanabe (2021), Layer Pruning on Demand with Intermediate CTC, Interspeech
Song Li, Beibei Ouyang, Fuchuan Tong, Dexin Liao, Lin Li, Qingyang Hong (2021), Real-Time End-to-End Monaural Multi-Speaker Speech Recognition, Interspeech
Tianzi Wang, Yuya Fujita, Xuankai Chang, Shinji Watanabe (2021), Streaming End-to-End ASR Based on Blockwise Non-Autoregressive Models, Interspeech
Stanislav Beliaev, Boris Ginsburg (2021), TalkNet: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis, Interspeech
Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan (2021), WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis, Interspeech
Nanxin Chen, Piotr Żelasko, Laureano Moro-Velázquez, Jesús Villalba, Najim Dehak (2021), Align-Denoise: Single-Pass Non-Autoregressive Speech Recognition, Interspeech
Hui Lu, Zhiyong Wu, Xixin Wu, Xu Li, Shiyin Kang, Xunying Liu, Helen Meng (2021), VAENAR-TTS: Variational Auto-Encoder Based Non-AutoRegressive Text-to-Speech Synthesis, Interspeech
Saturnino Luz, Fasih Haider, Sofia de la Fuente, Davida Fromm, Brian MacWhinney (2021), Detecting Cognitive Decline Using Speech Only: The ADReSSo Challenge, Interspeech
P.A. Pérez-Toro, S.P. Bayerl, T. Arias-Vergara, J.C. Vásquez-Correa, P. Klumpp, M. Schuster, Elmar Nöth, J.R. Orozco-Arroyave, K. Riedhammer (2021), Influence of the Interviewer on the Automatic Assessment of Alzheimer’s Disease in the Context of the ADReSSo Challenge, Interspeech
Youxiang Zhu, Abdelrahman Obyat, Xiaohui Liang, John A. Batsis, Robert M. Roth (2021), WavBERT: Exploiting Semantic and Non-Semantic Speech Using Wav2vec and BERT for Dementia Detection, Interspeech
Lara Gauder, Leonardo Pepino, Luciana Ferrer, Pablo Riera (2021), Alzheimer Disease Recognition Using Speech-Based Embeddings From Pre-Trained Models, Interspeech
Aparna Balagopalan, Jekaterina Novikova (2021), Comparing Acoustic-Based Approaches for Alzheimer’s Disease Detection, Interspeech
Yu Qiao, Xuefeng Yin, Daniel Wiechmann, Elma Kerz (2021), Alzheimer’s Disease Detection from Spontaneous Speech Through Combining Linguistic Complexity and (Dis)Fluency Features with Pretrained Language Models, Interspeech
Yilin Pan, Bahman Mirheidari, Jennifer M. Harris, Jennifer C. Thompson, Matthew Jones, Julie S. Snowden, Daniel Blackburn, Heidi Christensen (2021), Using the Outputs of Different Automatic Speech Recognition Paradigms for Acoustic- and BERT-Based Alzheimer’s Dementia Detection Through Spontaneous Speech, Interspeech
Zafi Sherhan Syed, Muhammad Shehram Shah Syed, Margaret Lech, Elena Pirogova (2021), Tackling the ADRESSO Challenge 2021: The MUET-RMIT System for Alzheimer’s Dementia Recognition from Spontaneous Speech, Interspeech
Morteza Rohanian, Julian Hough, Matthew Purver (2021), Alzheimer’s Dementia Recognition Using Acoustic, Lexical, Disfluency and Speech Pause Features Robust to Noisy Inputs, Interspeech
Raghavendra Pappagari, Jaejin Cho, Sonal Joshi, Laureano Moro-Velázquez, Piotr Żelasko, Jesús Villalba, Najim Dehak (2021), Automatic Detection and Assessment of Alzheimer Disease Using Speech and Language Technologies in Low-Resource Scenarios, Interspeech
Jun Chen, Jieping Ye, Fengyi Tang, Jiayu Zhou (2021), Automatic Detection of Alzheimer’s Disease Using Spontaneous Speech Only, Interspeech
Ning Wang, Yupeng Cao, Shuai Hao, Zongru Shao, K.P. Subbalakshmi (2021), Modular Multi-Modal Attention Network for Alzheimer’s Disease Detection Using Patient Audio and Language Data, Interspeech
Rong Gong, Carl Quillen, Dushyant Sharma, Andrew Goderre, José Laínez, Ljubomir Milanović (2021), Self-Attention Channel Combinator Frontend for End-to-End Multichannel Far-Field Speech Recognition, Interspeech
R. Gretter, Marco Matassoni, D. Falavigna, A. Misra, C.W. Leong, K. Knill, L. Wang (2021), ETLT 2021: Shared Task on Automatic Speech Recognition for Non-Native Children’s Speech, Interspeech
Lars Rumberg, Hanna Ehlert, Ulrike Lüdtke, Jörn Ostermann (2021), Age-Invariant Training for End-to-End Child Speech Recognition Using Adversarial Multi-Task Learning, Interspeech
Samuele Cornell, Alessio Brutti, Marco Matassoni, Stefano Squartini (2021), Learning to Rank Microphones for Distant Speech Recognition, Interspeech
Lucile Gelin, Thomas Pellegrini, Julien Pinquier, Morgane Daniel (2021), Simulating Reading Mistakes for Child Speech Transformer-Based Phone Recognition, Interspeech
Brooke Stephenson, Thomas Hueber, Laurent Girin, Laurent Besacier (2021), Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input, Interspeech
Pol van Rijn, Silvan Mertes, Dominik Schiller, Peter M.C. Harrison, Pauline Larrouy-Maestri, Elisabeth André, Nori Jacoby (2021), Exploring Emotional Prototypes in a High Dimensional TTS Latent Space, Interspeech
Devang S. Ram Mohan, Vivian Hu, Tian Huey Teh, Alexandra Torresquintero, Christopher G.R. Wallis, Marlene Staib, Lorenzo Foglianti, Jiameng Gao, Simon King (2021), Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis, Interspeech
Alexandra Torresquintero, Tian Huey Teh, Christopher G.R. Wallis, Marlene Staib, Devang S. Ram Mohan, Vivian Hu, Lorenzo Foglianti, Jiameng Gao, Simon King (2021), ADEPT: A Dataset for Evaluating Prosody Transfer, Interspeech
Nguyen Thi Thu Trang, Nguyen Hoang Ky, Albert Rilliard, Christophe d'Alessandro (2021), Prosodic Boundary Prediction Model for Vietnamese Text-To-Speech, Interspeech
Shaked Dovrat, Eliya Nachmani, Lior Wolf (2021), Many-Speakers Single Channel Speech Separation with Optimal Permutation Training, Interspeech
Mieszko Fraś, Marcin Witkowski, Konrad Kowalczyk (2021), Combating Reverberation in NTF-Based Speech Separation Using a Sub-Source Weighted Multichannel Wiener Filter and Linear Prediction, Interspeech
Martin Strauss, Jouni Paulus, Matteo Torcoli, Bernd Edler (2021), A Hands-On Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation, Interspeech
Marvin Borsdorf, Chenglin Xu, Haizhou Li, Tanja Schultz (2021), GlobalPhone Mix-To-Separate Out of 2: A Multilingual 2000 Speakers Mixtures Database for Speech Separation, Interspeech
Kimiko Tsukada, Yurong, Joo-Yeon Kim, Jeong-Im Han, John Hajek (2021), Cross-Linguistic Perception of the Japanese Singleton/Geminate Contrast: Korean, Mandarin and Mongolian Compared, Interspeech
Daniel Korzekwa, Roberto Barra-Chicote, Szymon Zaporowski, Grzegorz Beringer, Jaime Lorenzo-Trueba, Alicja Serafinowicz, Jasha Droppo, Thomas Drugman, Bozena Kostek (2021), Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention, Interspeech
Bettina Braun, Nicole Dehé, Marieke Einfeldt, Daniela Wochner, Katharina Zahner-Ritter (2021), Testing Acoustic Voice Quality Classification Across Languages and Speech Styles, Interspeech
Qianyutong Zhang, Kexin Lyu, Zening Chen, Ping Tang (2021), Acquisition of Prosodic Focus Marking by Three- to Six-Year-Old Children Learning Mandarin Chinese, Interspeech
Maryam Sadat Mirzaei, Kourosh Meshgi (2021), Adaptive Listening Difficulty Detection for L2 Learners Through Moderating ASR Resources, Interspeech
Hongwei Ding, Binghuai Lin, Liyuan Wang (2021), F0 Patterns of L2 English Speech by Mandarin Chinese Learners, Interspeech
Binghuai Lin, Liyuan Wang (2021), A Neural Network-Based Noise Compensation Method for Pronunciation Assessment, Interspeech
Jacek Kudera, Philip Georgis, Bernd Möbius, Tania Avgustinova, Dietrich Klakow (2021), Phonetic Distance and Surprisal in Multilingual Priming: Evidence from Slavic, Interspeech
Yuqing Zhang, Zhu Li, Binghuai Lin, Jinsong Zhang (2021), A Preliminary Study on Discourse Prosody Encoding in L1 and L2 English Spontaneous Narratives, Interspeech
Minglin Wu, Kun Li, Wai-Kim Leung, Helen Meng (2021), Transformer Based End-to-End Mispronunciation Detection and Diagnosis, Interspeech
Calbert Graham (2021), L1 Identification from L2 Speech Using Neural Spectrogram Analysis, Interspeech
Miran Oh, Dani Byrd, Shrikanth S. Narayanan (2021), Leveraging Real-Time MRI for Illuminating Linguistic Velum Action, Interspeech
Zirui Liu, Yi Xu (2021), Segmental Alignment of English Syllables with Singleton and Cluster Onsets, Interspeech
Míša Hejná (2021), Exploration of Welsh English Pre-Aspiration: How Wide-Spread is it?, Interspeech
Beeke Muhlack, Mikey Elmers, Heiner Drenhaus, Jürgen Trouvain, Marjolein van Os, Raphael Werner, Margarita Ryzhova, Bernd Möbius (2021), Revisiting Recall Effects of Filler Particles in German and English, Interspeech
Chunyu Ge, Yixuan Xiong, Peggy Mok (2021), How Reliable Are Phonetic Data Collected Remotely? Comparison of Recording Devices and Environments on Acoustic Measurements, Interspeech
Jing Huang, Feng-fan Hsieh, Yueh-chin Chang (2021), A Cross-Dialectal Comparison of Apical Vowels in Beijing Mandarin, Northeastern Mandarin and Southwestern Mandarin: An EMA and Ultrasound Study, Interspeech
Mark Gibson, Oihane Muxika, Marianne Pouplier (2021), Dissecting the Aero-Acoustic Parameters of Open Articulatory Transitions, Interspeech
Amelia J. Gully (2021), Quantifying Vocal Tract Shape Variation and its Acoustic Impact: A Geometric Morphometric Approach, Interspeech
Adriana Guevara-Rukoz, Shi Yu, Sharon Peperkamp (2021), Speech Perception and Loanword Adaptations: The Case of Copy-Vowel Epenthesis, Interspeech
Zhe-chen Guo, Rajka Smiljanic (2021), Speakers Coarticulate Less When Facing Real and Imagined Communicative Difficulties: An Analysis of Read and Spontaneous Speech from the LUCID Corpus, Interspeech
Einar Meister, Lya Meister (2021), Developmental Changes of Vowel Acoustics in Adolescents, Interspeech
Sonia d'Apolito, Barbara Gili Fivela (2021), Context and Co-Text Influence on the Accuracy Production of Italian L2 Non-Native Sounds, Interspeech
Wilbert Heeringa, Hans Van de Velde (2021), A New Vowel Normalization for Sociophonetics, Interspeech
Rosey Billington, Hywel Stoakes, Nick Thieberger (2021), The Pacific Expansion: Optimizing Phonetic Transcription of Archival Corpora, Interspeech
Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen (2021), FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization, Interspeech
Anton Mitrofanov, Mariya Korenevskaya, Ivan Podluzhny, Yuri Khokhlov, Aleksandr Laptev, Andrei Andrusenko, Aleksei Ilin, Maxim Korenevsky, Ivan Medennikov, Aleksei Romanenko (2021), LT-LM: A Novel Non-Autoregressive Language Model for Single-Shot Lattice Rescoring, Interspeech
Cyril Allauzen, Ehsan Variani, Michael Riley, David Rybach, Hao Zhang (2021), A Hybrid Seq-2-Seq ASR Design for On-Device and Server Applications, Interspeech
Hirofumi Inaguma, Tatsuya Kawahara (2021), VAD-Free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording, Interspeech
Zhuoyuan Yao, Di Wu, Xiong Wang, Binbin Zhang, Fan Yu, Chao Yang, Zhendong Peng, Xiaoyu Chen, Lei Xie, Xin Lei (2021), WeNet: Production Oriented Streaming and Non-Streaming End-to-End Speech Recognition Toolkit, Interspeech
Tomohiro Tanaka, Ryo Masumura, Mana Ihori, Akihiko Takashima, Takafumi Moriya, Takanori Ashihara, Shota Orihashi, Naoki Makishima (2021), Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition, Interspeech
Mun-Hak Lee, Joon-Hyuk Chang (2021), Deep Neural Network Calibration for E2E Speech Recognition System, Interspeech
Qiujia Li, Yu Zhang, Bo Li, Liangliang Cao, Philip C. Woodland (2021), Residual Energy-Based Models for End-to-End Speech Recognition, Interspeech
David Qiu, Yanzhang He, Qiujia Li, Yu Zhang, Liangliang Cao, Ian McGraw (2021), Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction, Interspeech
Anna Ollerenshaw, Md. Asif Jalal, Thomas Hain (2021), Insights on Neural Representations for End-to-End Speech Recognition, Interspeech
Amber Afshan, Kshitiz Kumar, Jian Wu (2021), Sequence-Level Confidence Classifier for ASR Utterance Accuracy and Application to Acoustic Models, Interspeech
Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita (2021), Unsupervised Learning of Disentangled Speech Content and Style Representation, Interspeech
Eunbi Choi, Hwa-Yeon Kim, Jong-Hwan Kim, Jae-Min Kim (2021), Label Embedding for Chinese Grapheme-to-Phoneme Conversion, Interspeech
Haiteng Zhang (2021), PDF: Polyphone Disambiguation in Chinese by Using FLAT, Interspeech
Junjie Li, Zhiyu Zhang, Minchuan Chen, Jun Ma, Shaojun Wang, Jing Xiao (2021), Improving Polyphone Disambiguation for Mandarin Chinese by Combining Mix-Pooling Strategy and Window-Based Attention, Interspeech
Yi Shi, Congyi Wang, Yu Chen, Bin Wang (2021), Polyphone Disambiguation in Mandarin Chinese with Semi-Supervised Learning, Interspeech
Yue Chen, Zhen-Hua Ling, Qing-Feng Liu (2021), A Neural-Network-Based Approach to Identifying Speakers in Novels, Interspeech
Xiao Zhou, Zhen-Hua Ling, Li-Rong Dai (2021), UnitNet-Based Hybrid Speech Synthesis, Interspeech
Sashi Novitasari, Sakriani Sakti, Satoshi Nakamura (2021), Dynamically Adaptive Machine Speech Chain Inference for TTS in Noisy Environment: Listen and Speak Louder, Interspeech
Haozhe Zhang, Zhihua Huang, Zengqiang Shang, Pengyuan Zhang, Yonghong Yan (2021), LinearSpeech: Parallel Text-to-Speech with Linear Complexity, Interspeech
Noa Mansbach, Evgeny Hershkovitch Neiterman, Amos Azaria (2021), An Agent for Competing with Humans in a Deceptive Game Based on Vocal Cues, Interspeech
Ahmed Fakhry, Xinyi Jiang, Jaclyn Xiao, Gunvant Chaudhari, Asriel Han (2021), A Multi-Branch Deep Learning Network for Automated Detection of COVID-19, Interspeech
Youxuan Ma, Zongze Ren, Shugong Xu (2021), RW-Resnet: A Novel Speech Anti-Spoofing Model Using Raw Waveform, Interspeech
Hira Dhamyal, Ayesha Ali, Ihsan Ayyub Qazi, Agha Ali Raza (2021), Fake Audio Detection in Resource-Constrained Settings Using Microfeatures, Interspeech
Tianhao Yan, Hao Meng, Emilia Parada-Cabaleiro, Shuo Liu, Meishu Song, Björn W. Schuller (2021), Coughing-Based Recognition of Covid-19 with Spatial Attentive ConvLSTM Recurrent Neural Networks, Interspeech
Soumava Paul, Gurunath Reddy M, K. Sreenivasa Rao, Partha Pratim Das (2021), Knowledge Distillation for Singing Voice Detection, Interspeech
Ryu Takeda, Kazunori Komatani (2021), Age Estimation with Speech-Age Model for Heterogeneous Speech Datasets, Interspeech
Kah Kuan Teh, Huy Dat Tran (2021), Open-Set Audio Classification with Limited Training Resources Based on Augmentation Enhanced Variational Auto-Encoder GAN with Detection-Classification Joint Training, Interspeech
Takahiro Fukumori (2021), Deep Spectral-Cepstral Fusion for Shouted and Normal Speech Classification, Interspeech
Shikha Baghel, Mrinmoy Bhattacharjee, S.R. Mahadeva Prasanna, Prithwijit Guha (2021), Automatic Detection of Shouted Speech Segments in Indian News Debates, Interspeech
Yang Gao, Tyler Vuong, Mahsa Elyasi, Gaurav Bharaj, Rita Singh (2021), Generalized Spoofing Detection Inspired from Audio Generation Artifacts, Interspeech
Weiguang Chen, Van Tung Pham, Eng Siong Chng, Xionghu Zhong (2021), Overlapped Speech Detection Based on Spectral and Spatial Feature Fusion, Interspeech
Badr M. Abdullah, Marius Mosbach, Iuliia Zaitova, Bernd Möbius, Dietrich Klakow (2021), Do Acoustic Word Embeddings Capture Phonological Similarity? An Empirical Study, Interspeech
Zheng Gao, Radhika Arava, Qian Hu, Xibin Gao, Thahir Mohamed, Wei Xiao, Mohamed AbdelHady (2021), Paraphrase Label Alignment for Voice Application Retrieval in Spoken Language Understanding, Interspeech
Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ding Zhao, Yiteng Huang, Arun Narayanan, Ian McGraw (2021), Personalized Keyphrase Detection Using Speaker and Environment Information, Interspeech
Vineet Garg, Wonil Chang, Siddharth Sigtia, Saurabh Adya, Pramod Simha, Pranay Dighe, Chandra Dhir (2021), Streaming Transformer for Hardware Efficient Voice Trigger Detection and False Trigger Mitigation, Interspeech
Mark Mazumder, Colby Banbury, Josh Meyer, Pete Warden, Vijay Janapa Reddi (2021), Few-Shot Keyword Spotting in Any Language, Interspeech
Li Wang, Rongzhi Gu, Nuo Chen, Yuexian Zou (2021), Text Anchor Based Metric Learning for Small-Footprint Keyword Spotting, Interspeech
Yangbin Chen, Tom Ko, Jianping Wang (2021), A Meta-Learning Approach for User-Defined Spoken Term Classification with Varying Classes and Examples, Interspeech
Dongyub Lee, Byeongil Ko, Myeong Cheol Shin, Taesun Whang, Daniel Lee, Eunhwa Kim, Eunggyun Kim, Jaechoon Jo (2021), Auxiliary Sequence Labeling Tasks for Disfluency Detection, Interspeech
Hang Zhou, Wenchao Hu, Yu Ting Yeung, Xiao Chen (2021), Energy-Friendly Keyword Spotting System Using Add-Based Convolution, Interspeech
Yan Jia, Xingming Wang, Xiaoyi Qin, Yinping Zhang, Xuyang Wang, Junjie Wang, Dong Zhang, Ming Li (2021), The 2020 Personalized Voice Trigger Challenge: Open Datasets, Evaluation Metrics, Baseline System and Results, Interspeech
Jingsong Wang, Yuxuan He, Chunyu Zhao, Qijie Shao, Wei-Wei Tu, Tom Ko, Hung-yi Lee, Lei Xie (2021), Auto-KWS 2021 Challenge: Task, Datasets, and Baselines, Interspeech
Axel Berg, Mark O’Connor, Miguel Tairum Cruz (2021), Keyword Transformer: A Self-Attention Model for Keyword Spotting, Interspeech
Abhijeet Awasthi, Kevin Kilgour, Hassan Rom (2021), Teaching Keyword Spotters to Spot New Keywords with Limited Examples, Interspeech
Xin Wang, Junichi Yamagishi (2021), A Comparative Study on Recent Neural Spoofing Countermeasures for Synthetic Speech Detection, Interspeech
Lin Zhang, Xin Wang, Erica Cooper, Junichi Yamagishi, Jose Patino, Nicholas Evans (2021), An Initial Investigation for Detecting Partially Spoofed Audio, Interspeech
Yang Xie, Zhenchuan Zhang, Yingchun Yang (2021), Siamese Network with wav2vec Feature for Spoofing Speech Detection, Interspeech
Xingliang Cheng, Mingxing Xu, Thomas Fang Zheng (2021), Cross-Database Replay Detection in Terminal-Dependent Speaker Verification, Interspeech
Yuxiang Zhang, Wenchao Wang, Pengyuan Zhang (2021), The Effect of Silence and Dual-Band Fusion in Anti-Spoofing System, Interspeech
Zhiyuan Peng, Xu Li, Tan Lee (2021), Pairing Weak with Strong: Twin Models for Defending Against Adversarial Attack on Speaker Verification, Interspeech
Hefei Ling, Leichao Huang, Junrui Huang, Baiyan Zhang, Ping Li (2021), Attention-Based Convolutional Neural Network for ASV Spoofing Detection, Interspeech
Haibin Wu, Yang Zhang, Zhiyong Wu, Dong Wang, Hung-yi Lee (2021), Voting for the Right Answer: Adversarial Defense for Speaker Verification, Interspeech
Tomi Kinnunen, Andreas Nautsch, Md. Sahidullah, Nicholas Evans, Xin Wang, Massimiliano Todisco, Héctor Delgado, Junichi Yamagishi, Kong Aik Lee (2021), Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing, Interspeech
Jesús Villalba, Sonal Joshi, Piotr Żelasko, Najim Dehak (2021), Representation Learning to Classify and Detect Adversarial Attacks Against Speaker and Speech Recognition Systems, Interspeech
You Zhang, Ge Zhu, Fei Jiang, Zhiyao Duan (2021), An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems, Interspeech
Xu Li, Xixin Wu, Hui Lu, Xunying Liu, Helen Meng (2021), Channel-Wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks, Interspeech
Wanying Ge, Michele Panariello, Jose Patino, Massimiliano Todisco, Nicholas Evans (2021), Partially-Connected Differentiable Architecture Search for Deepfake and Spoofing Detection, Interspeech
Kay Peterson, Audrey Tong, Yan Yu (2021), OpenASR20: An Open Challenge for Automatic Speech Recognition of Conversational Telephone Speech in Low-Resource Languages, Interspeech
Srikanth Madikeri, Petr Motlicek, Hervé Bourlard (2021), Multitask Adaptation with Lattice-Free MMI for Multi-Genre Speech Recognition of Low Resource Languages, Interspeech
Qiu-shi Zhu, Jie Zhang, Ming-hui Wu, Xin Fang, Li-Rong Dai (2021), An Improved Wav2Vec 2.0 Pre-Training Approach Using Enhanced Local Dependency Modeling for Speech Recognition, Interspeech
Hung-Pang Lin, Yu-Jia Zhang, Chia-Ping Chen (2021), Systems for Low-Resource Speech Recognition Tasks in Open Automatic Speech Recognition and Formosa Speech Recognition Challenges, Interspeech
Jing Zhao, Zhiqiang Lv, Ambyera Han, Guan-Bo Wang, Guixin Shi, Jian Kang, Jinghao Yan, Pengfei Hu, Shen Huang, Wei-Qiang Zhang (2021), The TNT Team System Descriptions of Cantonese and Mongolian for IARPA OpenASR20, Interspeech
Tanel Alumäe, Jiaming Kong (2021), Combining Hybrid and End-to-End Approaches for the OpenASR20 Challenge, Interspeech
Ethan Morris, Robbie Jimerson, Emily Prud’hommeaux (2021), One Size Does Not Fit All in Resource-Constrained ASR, Interspeech
Alejandrina Cristia (2021), Child Language Acquisition Studied with Wearables, Interspeech
Tomáš Mikolov (2021), Language Modeling and Artificial Intelligence, Interspeech
Pablo Gimeno, Alfonso Ortega, Antonio Miguel, Eduardo Lleida (2021), Unsupervised Representation Learning for Speech Activity Detection in the Fearless Steps Challenge 2021, Interspeech
Tyler Vuong, Yangyang Xia, Richard M. Stern (2021), The Application of Learnable STRF Kernels to the 2021 Fearless Steps Phase-03 SAD Challenge, Interspeech
Seyyed Saeed Sarfjoo, Srikanth Madikeri, Petr Motlicek (2021), Speech Activity Detection Based on Multilingual Speech Recognition System, Interspeech
Jarrod Luckenbaugh, Samuel Abplanalp, Rachel Gonzalez, Daniel Fulford, David Gard, Carlos Busso (2021), Voice Activity Detection with Teacher-Student Domain Emulation, Interspeech
Omid Ghahabi, Volker Fischer (2021), EML Online Speech Activity Detection for the Fearless Steps Challenge Phase-III, Interspeech
Kuba Łopatka, Katarzyna Kaszuba-Miotke, Piotr Klinke, Paweł Trella (2021), Device Playback Augmentation with Echo Cancellation for Keyword Spotting, Interspeech
Bolaji Yusuf, Alican Gok, Batuhan Gundogdu, Murat Saraclar (2021), End-to-End Open Vocabulary Keyword Search, Interspeech
Danny Merkx, Stefan L. Frank, Mirjam Ernestus (2021), Semantic Sentence Similarity: Size does not Always Matter, Interspeech
Jan Švec, Luboš Šmídl, Josef V. Psutka, Aleš Pražák (2021), Spoken Term Detection and Relevance Score Estimation Using Dot-Product of Pronunciation Embeddings, Interspeech
François Buet, François Yvon (2021), Toward Genre Adapted Closed Captioning, Interspeech
Daniel Korzekwa, Jaime Lorenzo-Trueba, Thomas Drugman, Shira Calamaro, Bozena Kostek (2021), Weakly-Supervised Word-Level Pronunciation Error Detection in Non-Native English Speech, Interspeech
Naoyuki Kanda, Guoli Ye, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka (2021), End-to-End Speaker-Attributed ASR with Transformer, Interspeech
Hagen Soltau, Mingqiu Wang, Izhak Shafran, Laurent El Shafey (2021), Understanding Medical Conversations: Rich Transcription, Confidence Scores & Information Extraction, Interspeech
Jazmín Vidal, Cyntia Bonomi, Marcelo Sancinetti, Luciana Ferrer (2021), Phone-Level Pronunciation Scoring for Spanish Speakers Learning English Using a GOP-DNN System, Interspeech
Xiaoshuo Xu, Yueteng Kang, Songjun Cao, Binghuai Lin, Long Ma (2021), Explore wav2vec 2.0 for Mispronunciation Detection, Interspeech
Shintaro Ando, Nobuaki Minematsu, Daisuke Saito (2021), Lexical Density Analysis of Word Productions in Japanese English Using Acoustic Word Embeddings, Interspeech
Binghuai Lin, Liyuan Wang (2021), Deep Feature Transfer Learning for Automatic Pronunciation Assessment, Interspeech
Huayun Zhang, Ke Shi, Nancy F. Chen (2021), Multilingual Speech Evaluation: Case Studies on English, Malay and Tamil, Interspeech
Linkai Peng, Kaiqi Fu, Binghuai Lin, Dengfeng Ke, Jinsong Zhang (2021), A Study on Fine-Tuning wav2vec2.0 Model for the Task of Mispronunciation Detection and Diagnosis, Interspeech
Yu Qiao, Wei Zhou, Elma Kerz, Ralf Schlüter (2021), The Impact of ASR on the Automatic Analysis of Linguistic Complexity and Sophistication in Spontaneous L2 Speech, Interspeech
Tomohiro Tanaka, Ryo Masumura, Mana Ihori, Akihiko Takashima, Shota Orihashi, Naoki Makishima (2021), End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning, Interspeech
Ronald Cumbal, Birger Moell, José Lopes, Olov Engwall (2021), “You don’t understand me!”: Comparing ASR Results for L1 and L2 Speakers of Swedish, Interspeech
Yang Zhang, Evelina Bakhturina, Kyle Gorman, Boris Ginsburg (2021), NeMo Inverse Text Normalization: From Development to Production, Interspeech
Satsuki Naijo, Akinori Ito, Takashi Nose (2021), Improvement of Automatic English Pronunciation Assessment with Small Number of Utterances Using Sentence Speakability, Interspeech
Fasih Haider, Saturnino Luz (2021), Affect Recognition Through Scalogram and Multi-Resolution Cochleagram Features, Interspeech
Jiawang Liu, Haoxiang Wang (2021), A Speech Emotion Recognition Framework for Better Discrimination of Confusions, Interspeech
Ruichen Li, Jinming Zhao, Qin Jin (2021), Speech Emotion Recognition via Multi-Level Cross-Modal Distillation, Interspeech
Koichiro Ito, Takuya Fujioka, Qinghua Sun, Kenji Nagamatsu (2021), Audio-Visual Speech Emotion Recognition by Disentangling Emotion and Identity Attributes, Interspeech
Deboshree Bose, Vidhyasaharan Sethu, Eliathamby Ambikairajah (2021), Parametric Distributions to Model Numerical Emotion Labels, Interspeech
Yuan Gao, Jiaxing Liu, Longbiao Wang, Jianwu Dang (2021), Metric Learning Based Feature Representation with Gated Fusion Model for Speech Emotion Recognition, Interspeech
Xingyu Cai, Jiahong Yuan, Renjie Zheng, Liang Huang, Kenneth Church (2021), Speech Emotion Recognition with Multi-Task Learning, Interspeech
Nadee Seneviratne, Carol Espy-Wilson (2021), Generalized Dilated CNN Models for Depression Detection Using Inverted Vocal Tract Variables, Interspeech
Yuhua Wang, Guang Shen, Yuezhu Xu, Jiahang Li, Zhengdao Zhao (2021), Learning Mutual Correlation in Multimodal Transformer for Speech Emotion Recognition, Interspeech
Jiaxing Liu, Yaodong Song, Longbiao Wang, Jianwu Dang, Ruiguo Yu (2021), Time-Frequency Representation Learning with Graph Convolutional Network for Dialogue-Level Speech Emotion Recognition, Interspeech
Gonçalo Mordido, Matthijs Van Keirsbilck, Alexander Keller (2021), Compressing 1D Time-Channel Separable Convolutions Using Sparse Random Ternary Matrices, Interspeech
Mengli Cheng, Chengyu Wang, Jun Huang, Xiaobo Wang (2021), Weakly Supervised Construction of ASR Systems from Massive Video Data, Interspeech
Byeonggeun Kim, Simyung Chang, Jinkyu Lee, Dooyong Sung (2021), Broadcasted Residual Learning for Efficient Keyword Spotting, Interspeech
Rupak Vignesh Swaminathan, Brian King, Grant P. Strimel, Jasha Droppo, Athanasios Mouchtaris (2021), CoDERT: Distilling Encoder Representations with Co-Learning for Transducer-Based Speech Recognition, Interspeech
Zhifu Gao, Yiwu Yao, Shiliang Zhang, Jun Yang, Ming Lei, Ian McLoughlin (2021), Extremely Low Footprint End-to-End ASR System for Smart Device, Interspeech
Yuan Shangguan, Rohit Prabhavalkar, Hang Su, Jay Mahadeokar, Yangyang Shi, Jiatong Zhou, Chunyang Wu, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer (2021), Dissecting User-Perceived Latency of On-Device E2E Speech Recognition, Interspeech
Jonathan Macoskey, Grant P. Strimel, Jinru Su, Ariya Rastrow (2021), Amortized Neural Networks for Low-Latency Speech Recognition, Interspeech
Rami Botros, Tara N. Sainath, Robert David, Emmanuel Guzman, Wei Li, Yanzhang He (2021), Tied & Reduced RNN-T Decoder, Interspeech
Jangho Kim, Simyung Chang, Nojun Kwak (2021), PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation, Interspeech
Varun Nagaraja, Yangyang Shi, Ganesh Venkatesh, Ozlem Kalinli, Michael L. Seltzer, Vikas Chandra (2021), Collaborative Training of Acoustic Encoders for Speech Recognition, Interspeech
Xiong Wang, Sining Sun, Lei Xie, Long Ma (2021), Efficient Conformer with Prob-Sparse Attention Mechanism for End-to-End Speech Recognition, Interspeech
Titouan Parcollet, Mirco Ravanelli (2021), The Energy and Carbon Footprint of Training End-to-End Speech Recognizers, Interspeech
Long Chen, Venkatesh Ravichandran, Andreas Stolcke (2021), Graph-Based Label Propagation for Semi-Supervised Speaker Identification, Interspeech
Ruirui Li, Chelsea J.-T. Ju, Zeya Chen, Hongda Mao, Oguz Elibol, Andreas Stolcke (2021), Fusion of Embeddings Networks for Robust Combination of Text Dependent and Independent Speaker Recognition, Interspeech
Sandro Cumani, Salvatore Sarni (2021), A Generative Model for Duration-Dependent Score Calibration, Interspeech
Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno (2021), Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition, Interspeech
Saurabh Kataria, Shi-Xiong Zhang, Dong Yu (2021), Multi-Channel Speaker Verification for Single and Multi-Talker Speech, Interspeech
Dirk Padfield, Daniel J. Liebling (2021), Chronological Self-Training for Real-Time Speaker Diarization, Interspeech
Runqiu Xiao, Xiaoxiao Miao, Wenchao Wang, Pengyuan Zhang, Bin Cai, Liuping Luo (2021), Adaptive Margin Circle Loss for Speaker Verification, Interspeech
Benjamin O’Brien, Christine Meunier, Alain Ghio (2021), Presentation Matters: Evaluating Speaker Identification Tasks, Interspeech
Fuchuan Tong, Yan Liu, Song Li, Jie Wang, Lin Li, Qingyang Hong (2021), Automatic Error Correction for Speaker Embedding Learning with Noisy Labels, Interspeech
Dexin Liao, Jing Li, Yiming Zhi, Song Li, Qingyang Hong, Lin Li (2021), An Integrated Framework for Two-Pass Personalized Voice Trigger, Interspeech
Jiachen Lian, Aiswarya Vinod Kumar, Hira Dhamyal, Bhiksha Raj, Rita Singh (2021), Masked Proxy Loss for Text-Independent Speaker Verification, Interspeech
Keon Lee, Kyumin Park, Daeyoung Kim (2021), STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, Interspeech
Rui Liu, Berrak Sisman, Haizhou Li (2021), Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability, Interspeech
Sarath Sivaprasad, Saiteja Kosgi, Vineet Gandhi (2021), Emotional Prosody Control for Speech Generation, Interspeech
Jian Cong, Shan Yang, Na Hu, Guangzhi Li, Lei Xie, Dan Su (2021), Controllable Context-Aware Conversational Speech Synthesis, Interspeech
Minchan Kim, Sung Jun Cheon, Byoung Jin Choi, Jong Jin Kim, Nam Soo Kim (2021), Expressive Text-to-Speech Using Style Tag, Interspeech
Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu (2021), Adaptive Text to Speech for Spontaneous Style, Interspeech
Xiang Li, Changhe Song, Jingbei Li, Zhiyong Wu, Jia Jia, Helen Meng (2021), Towards Multi-Scale Style Control for Expressive Speech Synthesis, Interspeech
Shifeng Pan, Lei He (2021), Cross-Speaker Style Transfer with Prosody Bottleneck in Neural Speech Synthesis, Interspeech
Daxin Tan, Tan Lee (2021), Fine-Grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement, Interspeech
Xiaochun An, Frank K. Soong, Lei Xie (2021), Improving Performance of Seen and Unseen Speech Style Transfer in End-to-End Neural TTS, Interspeech
Slava Shechtman, Raul Fernandez, Alexander Sorin, David Haws (2021), Synthesis of Expressive Speaking Styles with Limited Training Data in a Multi-Speaker, Prosody-Controllable Sequence-to-Sequence Architecture, Interspeech
Mai Hoang Dao, Thinh Hung Truong, Dat Quoc Nguyen (2021), Intent Detection and Slot Filling for Vietnamese, Interspeech
Haitao Lin, Lu Xiang, Yu Zhou, Jiajun Zhang, Chengqing Zong (2021), Augmenting Slot Values and Contexts for Spoken Language Understanding with Pretrained Models, Interspeech
Judith Gaspers, Quynh Do, Daniil Sorokin, Patrick Lehnen (2021), The Impact of Intent Distribution Mismatch on Semi-Supervised Spoken Language Understanding, Interspeech
Yidi Jiang, Bidisha Sharma, Maulik Madhavi, Haizhou Li (2021), Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification, Interspeech
Nick J.C. Wang, Lu Wang, Yandan Sun, Haimei Kang, Dejun Zhang (2021), Three-Module Modeling For End-to-End Spoken Language Understanding Using Pre-Trained DNN-HMM-Based Acoustic-Phonetic Model, Interspeech
Sujeong Cha, Wangrui Hou, Hyun Jung, My Phung, Michael Picheny, Hong-Kwang J. Kuo, Samuel Thomas, Edmilson Morais (2021), Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs, Interspeech
Xianwei Zhang, Liang He (2021), End-to-End Cross-Lingual Spoken Language Understanding Model with Multilingual Pretraining, Interspeech
Hamidreza Saghir, Samridhi Choudhary, Sepehr Eghbali, Clement Chung (2021), Factorization-Aware Training of Transformers for Natural Language Understanding on the Edge, Interspeech
Michael Saxon, Samridhi Choudhary, Joseph P. McKenna, Athanasios Mouchtaris (2021), End-to-End Spoken Language Understanding for Generalized Voice Assistants, Interspeech
Soyeon Caren Han, Siqu Long, Huichun Li, Henry Weld, Josiah Poon (2021), Bi-Directional Joint Neural Networks for Intent Classification and Slot Filling, Interspeech
Ross Cutler, Ando Saabas, Tanel Parnamaa, Markus Loide, Sten Sootla, Marju Purin, Hannes Gamper, Sebastian Braun, Karsten Sorensen, Robert Aichner, Sriram Srinivasan (2021), INTERSPEECH 2021 Acoustic Echo Cancellation Challenge, Interspeech
Lukas Pfeifenberger, Matthias Zoehrer, Franz Pernkopf (2021), Acoustic Echo Cancellation with Cross-Domain Learning, Interspeech
Shimin Zhang, Yuxiang Kong, Shubo Lv, Yanxin Hu, Lei Xie (2021), F-T-LSTM Based Complex Network for Joint Acoustic Echo Cancellation and Speech Enhancement, Interspeech
Ernst Seidel, Jan Franzen, Maximilian Strake, Tim Fingscheidt (2021), Y²-Net FCRN for Acoustic Echo and Noise Suppression, Interspeech
Renhua Peng, Linjuan Cheng, Chengshi Zheng, Xiaodong Li (2021), Acoustic Echo Cancellation Using Deep Complex Neural Network with Nonlinear Magnitude Compression and Phase Information, Interspeech
Amir Ivry, Israel Cohen, Baruch Berdugo (2021), Nonlinear Acoustic Echo Cancellation with Deep Learning, Interspeech
Jordan R. Green, Robert L. MacDonald, Pan-Pan Jiang, Julie Cattiau, Rus Heywood, Richard Cave, Katie Seaver, Marilyn A. Ladewig, Jimmy Tobin, Michael P. Brenner, Philip C. Nelson, Katrin Tomanek (2021), Automatic Speech Recognition of Disordered Speech: Personalized Models Outperforming Human Listeners on Short Phrases, Interspeech
Michael Neumann, Oliver Roesler, Jackson Liscombe, Hardik Kothare, David Suendermann-Oeft, David Pautler, Indu Navar, Aria Anvar, Jochen Kumm, Raquel Norel, Ernest Fraenkel, Alexander V. Sherman, James D. Berry, Gary L. Pattee, Jun Wang, Jordan R. Green, Vikram Ramanarayanan (2021), Investigating the Utility of Multimodal Conversational Technology and Audiovisual Analytic Measures for the Assessment and Monitoring of Amyotrophic Lateral Sclerosis at Scale, Interspeech
Enno Hermann, Mathew Magimai-Doss (2021), Handling Acoustic Variation in Dysarthric Speech Recognition Systems Through Model Combination, Interspeech
Mengzhe Geng, Shansong Liu, Jianwei Yu, Xurong Xie, Shoukang Hu, Zi Ye, Zengrui Jin, Xunying Liu, Helen Meng (2021), Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition, Interspeech
Sarah E. Gutz, Hannah P. Rowe, Jordan R. Green (2021), Speaking with a KN95 Face Mask: ASR Performance and Speaker Compensation, Interspeech
Zengrui Jin, Mengzhe Geng, Xurong Xie, Jianwei Yu, Shansong Liu, Xunying Liu, Helen Meng (2021), Adversarial Data Augmentation for Disordered Speech Recognition, Interspeech
Xurong Xie, Rukiye Ruzi, Xunying Liu, Lan Wang (2021), Variational Auto-Encoder Based Variability Encoding for Dysarthric Speech Recognition, Interspeech
Disong Wang, Songxiang Liu, Lifa Sun, Xixin Wu, Xunying Liu, Helen Meng (2021), Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion, Interspeech
Jiajun Deng, Fabian Ritter Gutierrez, Shoukang Hu, Mengzhe Geng, Xurong Xie, Zi Ye, Shansong Liu, Jianwei Yu, Xunying Liu, Helen Meng (2021), Bayesian Parametric and Architectural Domain Adaptation of LF-MMI Trained TDNNs for Elderly and Dysarthric Speech Recognition, Interspeech
Shanqing Cai, Lisie Lillianfeld, Katie Seaver, Jordan R. Green, Michael P. Brenner, Philip C. Nelson, D. Sculley (2021), A Voice-Activated Switch for Persons with Motor and Speech Impairments: Isolated-Vowel Spotting Using Neural Networks, Interspeech
Zhehuai Chen, Bhuvana Ramabhadran, Fadi Biadsy, Xia Zhang, Youzheng Chen, Liyang Jiang, Fang Chu, Rohan Doshi, Pedro J. Moreno (2021), Conformer Parrotron: A Faster and Stronger End-to-End Speech Conversion and Recognition Model for Atypical Speech, Interspeech
Robert L. MacDonald, Pan-Pan Jiang, Julie Cattiau, Rus Heywood, Richard Cave, Katie Seaver, Marilyn A. Ladewig, Jimmy Tobin, Michael P. Brenner, Philip C. Nelson, Jordan R. Green, Katrin Tomanek (2021), Disordered Speech Data Collection: Lessons Learned at 1 Million Utterances from Project Euphonia, Interspeech
Eun Jung Yeo, Sunhee Kim, Minhwa Chung (2021), Automatic Severity Classification of Korean Dysarthric Speech Using Phoneme-Level Pronunciation Features, Interspeech
Subhashini Venugopalan, Joel Shor, Manoj Plakal, Jimmy Tobin, Katrin Tomanek, Jordan R. Green, Michael P. Brenner (2021), Comparing Supervised Models and Learned Speech Representations for Classifying Intelligibility of Disordered Speech on Selected Phrases, Interspeech
Vikramjit Mitra, Zifang Huang, Colin Lea, Lauren Tooley, Sarah Wu, Darren Botten, Ashwini Palekar, Shrinath Thelapurath, Panayiotis Georgiou, Sachin Kajarekar, Jeffrey Bigham (2021), Analysis and Tuning of a Voice Assistant System for Dysfluent Speech, Interspeech
Hideki Kawahara, Kohei Yatabe, Ken-Ichi Sakakibara, Mitsunori Mizumachi, Masanori Morise, Hideki Banno, Toshio Irino (2021), Interactive and Real-Time Acoustic Measurement Tools for Speech Data Acquisition and Presentation: Application of an Extended Member of Time Stretched Pulses, Interspeech
Daniel Tihelka, Markéta Řezáčková, Martin Grůber, Zdeněk Hanzlíček, Jakub Vít, Jindřich Matoušek (2021), Save Your Voice: Voice Banking and TTS for Anyone, Interspeech
Yang Zhang, Evelina Bakhturina, Boris Ginsburg (2021), NeMo (Inverse) Text Normalization: From Development to Production, Interspeech
Corentin Hembise, Lucile Gelin, Morgane Daniel (2021), Lalilo: A Reading Assistant for Children Featuring Speech Recognition-Based Reading Mistake Detection, Interspeech
Manh Hung Nguyen, Vu Hoang, Tu Anh Nguyen, Trung H. Bui (2021), Automatic Radiology Report Editing Through Voice, Interspeech
Ke Shi, Kye Min Tan, Huayun Zhang, Siti Umairah Md. Salleh, Shikang Ni, Nancy F. Chen (2021), WittyKiddy: Multilingual Spoken Language Learning for Kids, Interspeech
Chunxiang Jin, Minghui Yang, Zujie Wen (2021), Duplex Conversation in Outbound Agent System, Interspeech
Sathvik Udupa, Anwesha Roy, Abhayjeet Singh, Aravind Illa, Prasanta Kumar Ghosh (2021), Web Interface for Estimating Articulatory Movements in Speech Production from Acoustics and Text, Interspeech
Andrew Caines (2017), Spoken CALL Shared Task system description, SLaTE
Ildiko Pilan, David Alfter, Elena Volodina (2017), Lärka: an online platform where language learning meets natural language processing, SLaTE
Chuan Wang, Ruobing Li, Hui Lin (2017), Deep Context Model for Grammatical Error Correction, SLaTE
Richeng Duan, Tatsuya Kawahara, Masatake Dantsuji, Hiroaki Nanjo (2017), Transfer Learning based Non-native Acoustic Modeling for Pronunciation Error Detection, SLaTE
Albara Khalifa, Tsuneo Kato, Seiichi Yamamoto (2017), Measuring Effect of Repetitive Queries and Implicit Learning with Joining-in-type Robot Assisted Language Learning System, SLaTE
Manny Rayner, Irene Strasly, Nikolaos Tsourakis, Johanna Gerlach, Pierrette Bouillon (2017), Menusigne: A Serious Game for Learning Sign Language Grammar, SLaTE
Claudia Baur, Cathy Chua, Johanna Gerlach, Manny Rayner, Martin Russell, Helmer Strik, Xizi Wei (2017), Overview of the 2017 Spoken CALL Shared Task, SLaTE
Seung Hee Yang, Minhwa Chung (2017), Linguistic Factors Affecting Evaluation of L2 Korean Speech Proficiency, SLaTE
Ronja Laarmann-Quante (2017), Towards a Tool for Automatic Spelling Error Analysis and Feedback Generation for Freely Written German Texts Produced by Primary School Children, SLaTE
Hyuksu Ryu, Minhwa Chung (2017), Mispronunciation Diagnosis of L2 English at Articulatory Level Using Articulatory Goodness-Of-Pronunciation Features, SLaTE
Nico Axtmann, Carolina Mehmet, Kay Berkling (2017), The CSU-K Rule-Based Pipeline System for Spoken CALL Shared Task, SLaTE
Marlisa Hommel (2017), Speech perception training as a serious game in the EFL classroom, SLaTE
Kamini Sabu, Prakhar Swarup, Hitesh Tulsiani, Preeti Rao (2017), Automatic Assessment of Children's L2 Reading for Accuracy and Fluency, SLaTE
Gary Yeung, Amber Afshan, Kaan Ege Ozgun, Canton Kaewtip, Steven M. Lulich, Abeer Alwan (2017), Predicting Clinical Evaluations of Children’s Speech with Limited Data Using Exemplar Word Template References, SLaTE
Muneeb Ahmad, Omar Mubin, Suleman Shahid, Joanne Orlando (2017), Emotion and Memory Model for a Robotic Tutor as a Social Partner in a Learning Environment, SLaTE
Nobuaki Minematsu, Daisuke Saito (2017), New Features and Effectiveness of Suzuki-kun, the First and Only Prosodic Reading Tutor of Tokyo Japanese, SLaTE
Erika Godde, Gérard Bailly, David Escudero, Marie-Line Bosse, Maryse Bianco, Coriandre Vilain (2017), Improving fluency of young readers: introducing a Karaoke to learn how to breathe during a Reading-while-Listening task, SLaTE
Xinhao Wang, Keelan Evanini (2017), Empirical Evaluation of the Communicative Effectiveness of an Automatic Speech-to-Speech Translation System, SLaTE
Cristian Tejedor-García, David Escudero, César González-Ferreras, Enrique Cámara-Arenas, Valentín Cardeñoso-Payo (2017), Evaluating the Efficiency of Synthetic Voice for Providing Corrective Feedback in a Pronunciation Training Tool Based on Minimal Pairs, SLaTE
Keelan Evanini, Matthew Mulholland, Eugene Tsuprun, Yao Qian (2017), Using an Automated Content Scoring Engine for Spoken CALL Responses: The ETS submission for the Spoken CALL Challenge, SLaTE
Xinhao Wang, Keelan Evanini, Klaus Zechner, Matthew Mulholland (2017), Modeling Discourse Coherence for the Automated Scoring of Spontaneous Spoken Responses, SLaTE
John Sloan, Julie Carson-Berndsen (2017), Was it something I said? Facial Expressions in Language Learning, SLaTE
Junwei Yue, Daisuke Saito, Nobuaki Minematsu, Yutaka Yamauchi, Kayoko Ito (2017), Development and Maintenance of Practical and In-service Systems for Recording Shadowing Utterances and Their Assessment, SLaTE
José Lopes, Olov Engwall, Gabriel Skantze (2017), A First Visit to the Robot Language Café, SLaTE
Emer Gilmartin, Jaebok Kim, Alpha Diallo, Yong Zhao, Neasa Ní Chiaráin, Ketong Su, Yuyun Huang, Benjamin Cowan, Nick Campbell (2017), CARAMILLA - Speech Mediated Language Learning Modules for Refugee and High School Learners of English and Irish, SLaTE
Andrey Malinin, Kate Knill, Anton Ragni, Yu Wang, Mark Gales (2017), An attention based model for off-topic spontaneous spoken response detection: An Initial Study, SLaTE
Stefano Artuso, Luca Cristoforetti, Daniele Falavigna, Roberto Gretter, Nadia Mana, Gianluca Schiavo (2017), A System for Assessing Children Readings at School, SLaTE
Aku Rouhe, Reima Karhila, Heini Kallio, Mikko Kurimo (2017), A pipeline for automatic assessment of foreign language pronunciation, SLaTE
Mengjie Qian, Xizi Wei, Peter Jančovič, Martin Russell (2017), The University of Birmingham 2017 SLaTE CALL Shared Task Systems, SLaTE
Yoo Rhee Oh, Hyung-Bae Jeon, Hwa Jeon Song, Byung Ok Kang, Yun-Kyung Lee, Jeon-Gue Park, Yun-Keun Lee (2017), Deep-Learning Based Automatic Spontaneous Speech Assessment in a Data-Driven Approach for the 2017 SLaTE CALL Shared Challenge, SLaTE
Ahmed Magooda, Diane Litman (2017), Syntactic and semantic features for human like judgement in spoken CALL, SLaTE
Neasa Ní Chiaráin, Ailbhe Ní Chasaide (2017), Effects of Educational Context on Learners’ Ratings of a Synthetic Voice, SLaTE
Konstantinos Kyriakopoulos, Mark Gales, Kate Knill (2017), Automatic Characterisation of the Pronunciation of Non-native English Speakers using Phone Distance Features, SLaTE
Yutaka Yamauchi, Junwei Yue, Kayoko Ito, Nobuaki Minematsu (2017), Investigation of teacher-selected sentences and machine-suggested sentences in terms of correlation between human ratings and GOP-based machine scores, SLaTE
Maryam Sadat Mirzaei, Kourosh Meshgi, Tatsuya Kawahara (2017), Detecting listening difficulty for second language learners using Automatic Speech Recognition errors, SLaTE
Jared Bernstein, Jian Cheng, Jennifer Balogh, Elizabeth Rosenfeld (2017), Studies of a Self-Administered Oral Reading Assessment, SLaTE
Victor Zue (2007), On organic interfaces, Interspeech
Sophie K. Scott (2007), The neural basis of speech perception - a view from functional imaging, Interspeech
Alex Waibel, Keni Bernardin, Matthias Wölfel (2007), Computer-supported human-human multilingual communication, Interspeech
Pierre-Yves Oudeyer (2007), Self-organization in the evolution of shared systems of speech sounds: a computational study, Interspeech
Jinyu Li, Chin-Hui Lee (2007), Soft margin feature extraction for automatic speech recognition, Interspeech
Yan Yin, Hui Jiang (2007), A fast optimization method for large margin estimation of HMMs based on second order cone programming, Interspeech
Hao-Zheng Li, Douglas O'Shaughnessy (2007), Frame margin probability discriminative training algorithm for noisy speech recognition, Interspeech
Fabio Valente, Jithendra Vepa, Christian Plahl, Christian Gollan, Hynek Hermansky, Ralf Schlüter (2007), Hierarchical neural networks feature extraction for LVCSR system, Interspeech
Peder A. Olsen, John R. Hershey (2007), Bhattacharyya error and divergence using variational importance sampling, Interspeech
Tingyao Wu, Jacques Duchateau, Dirk Van Compernolle (2007), Phoneme dependent frame selection preference, Interspeech
Xinhui Zhou, Carol Y. Espy-Wilson, Mark Tiede, Suzanne Boyce (2007), An articulatory and acoustic study of "retroflex" and "bunched" American English rhotic sound based on MRI, Interspeech
Paula Martins, Inês Carbone, Augusto Silva, António J. S. Teixeira (2007), An MRI study of European Portuguese nasals, Interspeech
Sayoko Takano, Hiroki Matsuzaki, Kunitoshi Motoki (2007), A four-cube FEM model of the extrinsic and intrinsic tongue muscles to simulate the production of vowel /i/, Interspeech
Juan Torres, Elliot Moore (2007), Performance evaluation of glottal quality measures from the perspective of vocal tract filter consistency, Interspeech
Veena D. Singampalli, Philip J. B. Jackson (2007), Statistical identification of critical, dependent and redundant articulators, Interspeech
Chao Qin, Miguel Á. Carreira-Perpiñán (2007), An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping, Interspeech
Sorin Dusan (2007), Vocal tract length during speech production, Interspeech
Nobuhiro Miki, Kyohei Hayashi (2007), Approximation method of subglottal system using ARMA filter, Interspeech
Asterios Toutios, Konstantinos Margaritis (2007), Enhancing acoustic-to-EPG mapping with lip position information, Interspeech
Tokihiko Kaburagi, Yosuke Tanabe (2007), A model of glottal flow incorporating viscous-inviscid interaction, Interspeech
Kilian G. Seeber (2007), Thinking outside the cube: modeling language processing tasks in a multiple resource paradigm, Interspeech
Julien Cisonni, Annemie Van Hirtum, Jan Willems, Xavier Pelorson (2007), Experimental validation of direct and inverse glottal flow models for unsteady flow conditions, Interspeech
Hideyuki Nomura, Tetsuo Funada (2007), Effect of unsteady glottal flow on the speech production process, Interspeech
Katrin Schneider, Bernd Möbius (2007), Word stress correlates in spontaneous child-directed speech in German, Interspeech
Michael Aron, Nicolas Ferveur, Erwan Kerrien, Marie-Odile Berger, Yves Laprie (2007), Acquisition and synchronization of multimodal articulatory data, Interspeech
Vincent Robert, Yves Laprie, Anne Bonneau (2007), A phonetic concatenative approach of labial coarticulation, Interspeech
Aseel Turkmani, Adrian Hilton, Philip J. B. Jackson, James Edge (2007), Visual analysis of lip coarticulation in VCV utterances, Interspeech
Matti Airas, Paavo Alku (2007), Comparison of multiple voice source parameters in different phonation types, Interspeech
Monja Knoll, Lisa Scharrer (2007), Acoustic and affective comparisons of natural and imaginary infant-, foreigner- and adult-directed speech, Interspeech
André Araújo, Luis M. T. Jesus, Isabel M. Costa (2007), Vowel production in two occlusal classes, Interspeech
Rajesh Khatiwada (2007), Nepalese retroflex stops: a static palatography study of inter- and intra-speaker variability, Interspeech
Charles A. Lamoureux, Victor J. Boucher (2007), Effects of testosterone levels on temporal and intonational aspects of speech: more exploratory data, Interspeech
Peter Karsmakers, Kristiaan Pelckmans, Johan Suykens, Hugo Van hamme (2007), Fixed-size kernel logistic regression for phoneme classification, Interspeech
Seung Seop Park, Jong Won Shin, Jong Kyu Kim, Nam Soo Kim (2007), A multiple-model based framework for automatic speech segmentation, Interspeech
Aren Jansen, Partha Niyogi (2007), Semi-supervised learning of speech sounds, Interspeech
Abhinav Parate, Ashish Verma, Jayanta Basak (2007), Evaluation of syllable stress using single class classifier, Interspeech
Mohammad Nurul Huda, Ghulam Muhammad, Junsei Horikawa, Tsuneo Nitta (2007), Distinctive phonetic feature (DPF) based phone segmentation using hybrid neural networks, Interspeech
J.-Ph. Goldman, M. Avanzi, A.-C. Simon, Anne Lacheret, A. Auchlin (2007), A methodology for the automatic detection of perceived prominent syllables in spoken French, Interspeech
Xiaochuan Niu, Jan P. H. van Santen (2007), Dual-channel acoustic detection of nasalization states, Interspeech
Tarun Pruthi, Carol Y. Espy-Wilson (2007), Acoustic parameters for the automatic detection of vowel nasalization, Interspeech
Jun Hou, Lawrence R. Rabiner, Sorin Dusan (2007), On the use of time-delay neural networks for highly accurate classification of stop consonants, Interspeech
Ladan Golipour, Douglas O'Shaughnessy (2007), A new approach for phoneme segmentation of speech signals, Interspeech
Veronique Stouten, Kris Demuynck, Hugo Van hamme (2007), Automatically learning the units of speech by non-negative matrix factorisation, Interspeech
Ozlem Kalinli, Shrikanth S. Narayanan (2007), A saliency-based auditory attention model with applications to unsupervised prominent syllable detection in speech, Interspeech
Sung Jun An, Young-Ik Kim, Rhee Man Kil (2007), Zero-crossing-based ratio masking for sound segregation, Interspeech
Satomi Tanaka, Minoru Tsuzaki, Hiroaki Kato, Yoshinori Sagisaka (2007), Event detection of speech signals based on auditory processing with a dynamic compressive gammachirp filterbank, Interspeech
Odette Scharenborg, Mirjam Ernestus, Vincent Wan (2007), Segmentation of speech: child's play?, Interspeech
Andrew Errity, John McKenna, Barry Kirkpatrick (2007), Dimensionality reduction methods applied to both magnitude and phase derived features, Interspeech
Hiroki Mori, Hideki Kasuya (2007), Voice source and vocal tract variations as cues to emotional states perceived from expressive conversational speech, Interspeech
Fan Yang, Peter A. Heeman (2007), Exploring initiative strategies using computer simulation, Interspeech
Chiu-yu Tseng, Zhao-yu Su (2007), From one base form to multiple output styles - predicting stylistic dynamics of discourse prosody, Interspeech
Claudia Crocco, Renata Savy (2007), Topic in dialogue: prosodic and syntactic features, Interspeech
Michiko Watanabe, Yasuharu Den, Keikichi Hirose, Shusaku Miwa, Nobuaki Minematsu (2007), Features of pauses and conjunctions at syntactic and discourse boundaries in Japanese monologues, Interspeech
Craig Wootton, Michael McTear, Terry Anderson (2007), Utilizing online content as domain knowledge in a multi-domain dynamic dialogue system, Interspeech
Boris van Schooten, Sophie Rosset, Olivier Galibert, Aurélien Max, Rieks op den Akker, Gabriel Illouz (2007), Handling speech input in the Ritel QA dialogue system, Interspeech
Woosung Kim (2007), Online call quality monitoring for automating agent-based call centers, Interspeech
Sebastian Möller, Klaus-Peter Engelbrecht, Antti Oulasvirta (2007), Analysis of communication failures for spoken dialogue systems, Interspeech
Sandra Mann, André Berton, Ute Ehrlich (2007), How to access audio files of large data bases using in-car speech dialogue systems, Interspeech
Kazunori Komatani, Tatsuya Kawahara, Hiroshi G. Okuno (2007), Analyzing temporal transition of real user's behaviors in a spoken dialogue system, Interspeech
J. Sherwani, Dong Yu, Tim Paek, Mary Czerwinski, Yun-Cheng Ju, Alex Acero (2007), Voicepedia: towards speech-based access to unstructured information, Interspeech
Vivek Rangarajan, Srinivas Bangalore, Shrikanth S. Narayanan (2007), Exploiting prosodic features for dialog act tagging in a discriminative modeling framework, Interspeech
Hua Ai, Antonio Roque, Anton Leuski, David Traum (2007), Using information state to improve dialogue move identification in a spoken dialogue system, Interspeech
Shiu-Wah Chu, Ian O'Neill, Philip Hanna (2007), Using multiple strategies to manage spoken dialogue, Interspeech
Marcelo Quinderé, Luís Seabra Lopes, António J. S. Teixeira (2007), An information state based dialogue manager for a mobile robot, Interspeech
Dong Yu, Yun-Cheng Ju, Ye-Yi Wang, Geoffrey Zweig, Alex Acero (2007), Automated directory assistance system - from theory to practice, Interspeech
Geoffrey Zweig, Patrick Nguyen, Yun-Cheng Ju, Ye-Yi Wang, Dong Yu, Alex Acero (2007), The voice-rate dialog system for consumer ratings, Interspeech
Andi Winterboer, Jiang Hu, Johanna D. Moore, Clifford Nass (2007), The influence of user tailoring and cognitive load on user performance in spoken dialogue systems, Interspeech
Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Geoffrey Zweig, Alex Acero (2007), Confidence measures for voice search applications, Interspeech
Ryuichiro Higashinaka, Kohji Dohsaka, Shigeaki Amano, Hideki Isozaki (2007), Effects of quiz-style information presentation on user understanding, Interspeech
Hong-Kwang Jeff Kuo, Vaibhava Goel (2007), A data visualization and analysis method for natural language call routing system design, Interspeech
Josef G. Bauer, Bernt Andrassy, Ekaterina Timoshenko (2007), Discriminative optimization of language adapted HMMs for a language identification system based on parallel phoneme recognizers, Interspeech
Khe Chai Sim, Haizhou Li (2007), Fusion of contrastive acoustic models for parallel phonotactic spoken language identification, Interspeech
Liang Wang, Eliathamby Ambikairajah, Eric H. C. Choi (2007), Multi-layer Kohonen self-organizing feature map for language identification, Interspeech
Bo Yin, Eliathamby Ambikairajah, Fang Chen (2007), Hierarchical language identification based on automatic language clustering, Interspeech
Ekaterina Timoshenko, Harald Höge (2007), Using speech rhythm for acoustic language identification, Interspeech
Ka-keung Wong, Man-hung Siu, Brian Mak (2007), A model-based estimation of phonotactic language verification performance, Interspeech
Mike Rosner, Paulseph-John Farrugia (2007), A tagging algorithm for mixed language identification in a noisy domain, Interspeech
Doroteo T. Toledano, Javier Gonzalez-Dominguez, Alejandro Abejon-Gonzalez, Danilo Spada, Ismael Mateos-Garcia, Joaquin Gonzalez-Rodriguez (2007), Improved language recognition using better phonetic decoders and fusion with MFCC and SDC features, Interspeech
David A. van Leeuwen, Khiet P. Truong (2007), An open-set detection evaluation methodology applied to language and emotion recognition, Interspeech
Xi Yang, Man-hung Siu, Herbert Gish, Brian Mak (2007), Boosting with anti-models for automatic language identification, Interspeech
Fabio Castaldo, Daniele Colibro, Emanuele Dalmasso, Pietro Laface, Claudio Vair (2007), Acoustic language identification using fast discriminative training, Interspeech
Ming Li, Hongbin Suo, Xiao Wu, Ping Lu, Yonghong Yan (2007), Spoken language identification using score vector modeling and support vector machine, Interspeech
R. Cordoba, L. F. D'Haro, F. Fernandez-Martinez, J. Macias-Guarasa, J. Ferreiros (2007), Language identification based on n-gram frequency ranking, Interspeech
Wade Shen, Douglas Reynolds (2007), Improving phonotactic language recognition with acoustic adaptation, Interspeech
Daniel Bolanos, Wayne Ward, Sarel Van Vuuren, Javier Garrido (2007), Syllable lattices as a basis for a children's speech reading tracker, Interspeech
Fuping Pan, Qingwei Zhao, Yonghong Yan (2007), Mandarin vowel pronunciation quality evaluation by using formant pattern recognition, Interspeech
Matthew Black, Joseph Tepperman, Sungbok Lee, Patti Price, Shrikanth S. Narayanan (2007), Automatic detection and classification of disfluent reading miscues in young children's speech for the purpose of assessment, Interspeech
Nobuaki Minematsu, K. Kamata, Satoshi Asakawa, T. Makino, T. Nishimura, Keikichi Hirose (2007), Structural assessment of language learners' pronunciation, Interspeech
Abdurrahman Samir, Sherif Mahdy Abdou, Ahmed Husien Khalil, Mohsen Rashwan (2007), Enhancing usability of CAPL system for Qur'an recitation learning, Interspeech
Febe de Wet, Christa van der Walt, Thomas Niesler (2007), Automatic large-scale oral language proficiency assessment, Interspeech
Yuki Denda, Takamasa Tanaka, Masato Nakayama, Takanobu Nishiura, Yoichi Yamashita (2007), Noise-robust hands-free voice activity detection with adaptive zero crossing detection using talker direction estimation, Interspeech
A. Álvarez, R. Martínez, P. Gómez, V. Nieto, V. Rodellar (2007), A robust mel-scale subband voice activity detector for a car platform, Interspeech
Kentaro Ishizuka, Tomohiro Nakatani, Masakiyo Fujimoto, Noboru Miyazaki (2007), Noise robust front-end processing with voice activity detection based on periodic to aperiodic component ratio, Interspeech
A. M. Toh, Roberto Togneri, Sven Nordholm (2007), Feature and distribution normalization schemes for statistical mismatch reduction in reverberant speech recognition, Interspeech
Matthew Gibson, Thomas Hain (2007), Temporal masking for unsupervised minimum Bayes risk speaker adaptation, Interspeech
Tsung-hsueh Hsieh, Jeih-weih Hung (2007), Speech feature compensation based on pseudo stereo codebooks for robust speech recognition in additive noise environments, Interspeech
Dimitrios Dimitriadis, Petros Maragos, Stamatios Lefkimmiatis (2007), Multiband, multisensor robust features for noisy speech recognition, Interspeech
Akira Sasou, Hiroaki Kojima (2007), Noise robust speech recognition for voice driven wheelchair, Interspeech
Yu Hu, Qiang Huo (2007), Irrelevant variability normalization based HMM training using VTS approximation of an explicit model of environmental distortions, Interspeech
Luis Buera, Antonio Miguel, Eduardo Lleida, Óscar Saz, Alfonso Ortega (2007), On the jointly unsupervised feature vector normalization and acoustic model compensation for robust speech recognition, Interspeech
Yu Tsao, Chin-Hui Lee (2007), An ensemble modeling approach to joint characterization of speaker and speaking environments, Interspeech
Shih-Hsiang Lin, Yao-Ming Yeh, Berlin Chen (2007), Cluster-based polynomial-fit histogram equalization (CPHEQ) for robust speech recognition, Interspeech
Pedro M. Martinez, Jose C. Segura, Luz Garcia (2007), Robust distributed speech recognition using histogram equalization and correlation information, Interspeech
Jen-Tzung Chien, Koichi Shinoda, Sadaoki Furui (2007), Predictive minimum Bayes risk classification for robust speech recognition, Interspeech
Ning Ma, Jon Barker, Phil Green (2007), Applying word duration constraints by using unrolled HMMs, Interspeech
Xiong Xiao, Eng Siong Chng, Haizhou Li (2007), Evaluating the temporal structure normalisation technique on the Aurora-4 task, Interspeech
Hynek Bořil, Petr Fousek, Harald Höge (2007), Two-stage system for robust neutral/Lombard speech recognition, Interspeech
Takatoshi Jitsuhiro, Tomoji Toriyama, Kiyoshi Kogure (2007), Noise suppression using search strategy with multi-model compositions, Interspeech
Takanobu Nishiura, Yoshiki Hirano, Yuki Denda, Masato Nakayama (2007), Investigations into early and late reflections on distant-talking speech recognition toward suitable reverberation criteria, Interspeech
Stefan Windmann, Reinhold Haeb-Umbach (2007), An approach to iterative speech feature enhancement and recognition, Interspeech
Jeih-weih Hung (2007), Optimization of temporal filters in the modulation frequency domain for constructing robust features in speech recognition, Interspeech
Rico Petrick, Kevin Lohde, Matthias Wolff, Rüdiger Hoffmann (2007), The harming part of room acoustics in automatic speech recognition, Interspeech
Yuan Fu Liao, Yh-Her Yang, Chi-Hui Hsu, Cheng-Chang Lee, Jing-Teng Zeng (2007), A reference model weighting-based method for robust speech recognition, Interspeech
Babak Nasersharif, Ahmad Akbari, Mohammad Mehdi Homayounpour (2007), Mel sub-band filtering and compression for robust speech recognition, Interspeech
Yun Tang, Richard Rose (2007), Clustered maximum likelihood linear basis for rapid speaker adaptation, Interspeech
Wenxuan Teng, Guillaume Gravier, Frédéric Bimbot, Frédéric Soufflet (2007), Rapid speaker adaptation by reference model interpolation, Interspeech
Randy Gomez, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano (2007), Rapid unsupervised speaker adaptation using single utterance based on MLLR and speaker selection, Interspeech
Brian Mak, Roger Hsiao (2007), Robustness of several kernel-based fast adaptation methods on noisy LVCSR, Interspeech
Janne Pylkkönen (2007), Estimating VTLN warping factors by distribution matching, Interspeech
Ming Liu, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang, Zhengyou Zhang (2007), Frequency domain correspondence for speaker normalization, Interspeech
Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa (2007), Unsupervised training of adaptation rate using Q-learning in large vocabulary continuous speech recognition, Interspeech
Martin Karafiát, Lukáš Burget, Jan Černocký, Thomas Hain (2007), Application of CMLLR in narrow band wide band adapted systems, Interspeech
Christophe Lévy, Georges Linarès, Jean-François Bonastre (2007), Fast adaptation of GMM-based compact models, Interspeech
Jonas Lööf, Ralf Schlüter, Hermann Ney (2007), Efficient estimation of speaker-specific projecting feature transforms, Interspeech
Mohamed Kamal Omar (2007), Regularized feature-based maximum likelihood linear regression for speech recognition, Interspeech
Omar Caballero Morales, Stephen Cox (2007), Modelling confusion matrices to improve speech recognition accuracy, with an application to dysarthric speech, Interspeech
Qiang Huo, Wei Li (2007), An active approach to speaker and task adaptation based on automatic analysis of vocabulary confusability, Interspeech
Jing Zheng, Andreas Stolcke (2007), fMPE-MAP: improved discriminative adaptation for modeling new domains, Interspeech
Timothy J. Hazen, Erik McDermott (2007), Discriminative MCE-based speaker adaptation of acoustic models for a spoken lecture processing task, Interspeech
Zahi N. Karam, William M. Campbell (2007), A new kernel for SVM MLLR based speaker recognition, Interspeech
Kong-Aik Lee, Changhuai You, Haizhou Li, Tomi Kinnunen (2007), A GMM-based probabilistic sequence kernel for speaker verification, Interspeech
Hagai Aronowitz (2007), Speaker recognition using kernel-PCA and intersession variability modeling, Interspeech
Réda Dehak, Najim Dehak, Patrick Kenny, Pierre Dumouchel (2007), Linear and non linear kernel GMM supervector machines for speaker verification, Interspeech
Ignacio Lopez-Moreno, Ismael Mateos-Garcia, Daniel Ramos, Joaquin Gonzalez-Rodriguez (2007), Support vector regression for speaker verification, Interspeech
C. Longworth, M. J. F. Gales (2007), Derivative and parametric kernels for speaker verification, Interspeech
Jose R. Calvo, Rafael Fernández, Gabriel Hernández (2007), Application of shifted delta cepstral features in speaker verification, Interspeech
Luciana Ferrer, Kemal Sönmez, Elizabeth Shriberg (2007), A smoothing kernel for spatially related features and its application to speaker verification, Interspeech
D. Charlet, M. Collet, Frédéric Bimbot (2007), VZ-norm: an extension of z-norm to the multivariate case for anchor model based speaker verification, Interspeech
Howard Lei, Nikki Mirghafori (2007), Word-conditioned HMM supervectors for speaker recognition, Interspeech
Wei-Ho Tsai (2007), Speaker clustering using direct maximization of a BIC-based score, Interspeech
A. Preti, Jean-François Bonastre, Driss Matrouf, F. Capman, B. Ravera (2007), Confidence measure based unsupervised target model adaptation for speaker verification, Interspeech
Huanjun Bao, Ming-Xing Xu, Thomas Fang Zheng (2007), Emotion attribute projection for speaker recognition on emotional speech, Interspeech
Shi-Xiong Zhang, Man-Wai Mak, Helen Meng (2007), High-level feature-based speaker verification via articulatory phonetic-class pronunciation modeling, Interspeech
T. Yingthawornsuk, H. Kaymaz Keskinpala, D. M. Wilkes, R. G. Shiavi, R. M. Salomon (2007), Direct acoustic feature using iterative EM algorithm and spectral energy for classifying suicidal speech, Interspeech
Claudio Garreton, Nestor Becerra Yoma, Fernando Huenupán, Carlos Molina (2007), On comparing and combining intra-speaker variability compensation and unsupervised model adaptation in speaker verification, Interspeech
Xianyu Zhao, Yuan Dong, Hao Yang, Jian Zhao, Liang Lu, Haila Wang (2007), Comparison of two kinds of speaker location representation for SVM-based speaker verification, Interspeech
Mireia Farrús, Javier Hernando, Pascual Ejarque (2007), Jitter and shimmer measurements for speaker recognition, Interspeech
Zhenyu Shan, Yingchun Yang, Ruizhi Ye (2007), Natural-emotion GMM transformation algorithm for emotional speaker recognition, Interspeech
Ivy H. Tseng, Olivier Verscheure, Deepak S. Turaga, Upendra V. Chaudhari (2007), Optimized one-bit quantization for adapted GMM-based speaker verification, Interspeech
Mitchell McLaren, Robbie Vogt, Brendan Baker, Sridha Sridharan (2007), A comparison of session variability compensation techniques for SVM-based speaker recognition, Interspeech
Benoît Fauve, Nicholas Evans, Neil Pearson, Jean-François Bonastre, John Mason (2007), Influence of task duration in text-independent speaker verification, Interspeech
Elizabeth Shriberg, Luciana Ferrer (2007), A text-constrained prosodic system for speaker verification, Interspeech
Asmaa El Hannani, Dijana Petrovska-Delacrétaz (2007), Fusing acoustic, phonetic and data-driven systems for text-independent speaker verification, Interspeech
Najim Dehak, Patrick Kenny, Pierre Dumouchel (2007), Continuous prosodic features and formant modeling with joint factor analysis for speaker verification, Interspeech
Claudio Vair, Daniele Colibro, Fabio Castaldo, Emanuele Dalmasso, Pietro Laface (2007), Loquendo - Politecnico di Torino's 2006 NIST speaker recognition evaluation system, Interspeech
Driss Matrouf, Nicolas Scheffer, Benoît Fauve, Jean-François Bonastre (2007), A straightforward and efficient implementation of the factor analysis model for speaker verification, Interspeech
Timothy J. Hazen, Daniel Schultz (2007), Multi-modal user authentication from video for mobile or variable-environment applications, Interspeech
Michael Gerber, René Beutler, Beat Pfister (2007), Quasi text-independent speaker-verification based on pattern matching, Interspeech
Yosef A. Solewicz, Moshe Koppel (2007), Virtual fusion for speaker recognition, Interspeech
Yi-Hsiang Chao, Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang, Ruei-Chuan Chang (2007), Evolutionary minimum verification error learning of the alternative hypothesis model for LLR-based speaker verification, Interspeech
Seiichi Nakagawa, Kouhei Asakawa, Longbiao Wang (2007), Speaker recognition by combining MFCC and phase information, Interspeech
Sandeep Manocha, Carol Y. Espy-Wilson (2007), A semi-automatic approach for speaker mining of tapped telephone conversations, Interspeech
Hao Yang, Yuan Dong, Xianyu Zhao, Jian Zhao, Liang Lu, Haila Wang (2007), Cluster adaptive training weights as features in SVM-based speaker verification, Interspeech
Hideki Okamoto, Mariko Kojima, Tomoko Matsui, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano (2007), Study on speaker verification with non-audible murmur segments, Interspeech
Xugang Lu, Jianwu Dang (2007), Dimension reduction for speaker identification based on mutual information, Interspeech
Jonas Lindh, Anders Eriksson (2007), Robustness of long time measures of fundamental frequency, Interspeech
Vinod Prakash, John H. L. Hansen (2007), Score distribution scaling for speaker recognition, Interspeech
A. C. Morris, J. Koreman, B. Ly-Van, H. Sellahewa, S. Jassim, R. Llarena Gómez (2007), Global features for rapid identity verification with dynamic biometric data, Interspeech
Tuan Van Pham, Michael Neffe, Gernot Kubin (2007), Robust voice activity detection for narrow-bandwidth speaker verification under adverse environments, Interspeech
Fernando Huenupán, Nestor Becerra Yoma, Carlos Molina, Claudio Garreton (2007), Speaker verification with multiple classifier fusion using Bayes based confidence measure, Interspeech
Girija Chetty, Michael Wagner (2007), Audiovisual speaker identity verification based on lip motion features, Interspeech
Gokhan Tur, Elizabeth Shriberg, Andreas Stolcke, Sachin Kajarekar (2007), Duration and pronunciation conditioned lexical modeling for speaker verification, Interspeech
Jean-François Bonastre, Driss Matrouf, Corinne Fredouille (2007), Artificial impostor voice transformation effects on false acceptance rates, Interspeech
David R. H. Miller, Michael Kleber, Chia-Lin Kao, Owen Kimball, Thomas Colthurst, Stephen A. Lowe, Richard M. Schwartz, Herbert Gish (2007), Rapid and accurate spoken term detection, Interspeech
Yi-cheng Pan, Hung-lin Chang, Berlin Chen, Lin-shan Lee (2007), Subword-based position specific posterior lattices (s-PSPL) for indexing speech information, Interspeech
Andreas Merkel, Dietrich Klakow (2007), Improved methods for language model based question classification, Interspeech
Tomoyosi Akiba, Hirofumi Tsujimura (2007), Error-tolerant question answering for spoken documents, Interspeech
Dilek Hakkani-Tür, Gokhan Tur, Michael Levit (2007), Exploiting information extraction annotations for document retrieval in distillation tasks, Interspeech
K. Thambiratnam, F. Seide (2007), Learning spoken document similarity and recommendation using supervised probabilistic latent semantic analysis, Interspeech
Roy Wallace, Robbie Vogt, Sridha Sridharan (2007), A phonetic search approach to the 2006 NIST spoken term detection evaluation, Interspeech
Yoshiaki Itoh, Kohei Iwata, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee (2007), An integration method of retrieval results using plural subword models for vocabulary-free spoken document retrieval, Interspeech
Dimitra Vergyri, Izhak Shafran, Andreas Stolcke, Ramana R. Gadde, Murat Akbacak, Brian Roark, Wen Wang (2007), The SRI/OGI 2006 spoken term detection system, Interspeech
Masataka Goto, Jun Ogata, Kouichirou Eto (2007), PodCastle: a Web 2.0 approach to speech recognition research, Interspeech
Nathalie Camelin, Frédéric Béchet, Géraldine Damnati, Renato De Mori (2007), Speech mining in noisy audio message corpus, Interspeech
Jian Shao, Qingwei Zhao, Pengyuan Zhang, Zhaojie Liu, Yonghong Yan (2007), A fast fuzzy keyword spotting algorithm based on syllable confusion network, Interspeech
Wooil Kim, John H. L. Hansen (2007), Advances in SpeechFind: transcript reliability estimation employing confidence measure based on discriminative sub-word model for SDR, Interspeech
Benoit Favre, Jean-François Bonastre, Patrice Bellot (2007), An interactive timeline for speech database browsing, Interspeech
Michael C. W. Yip (2007), Spoken word recognition of Chinese homophones: a further investigation, Interspeech
Maria Wolters, Pauline Campbell, Christine DePlacido, Amy Liddell, David Owens (2007), The role of outer hair cell function in the perception of synthetic versus natural speech, Interspeech
Akiko Kusumoto, Alexander B. Kain, John-Paul Hosom, Jan P. H. van Santen (2007), Hybridizing conversational and clear speech, Interspeech
Sophie Dufour, Ulrich Hans Frauenfelder (2007), Neighborhood density and neighborhood frequency effects in French spoken word recognition, Interspeech
Toshio Irino, Yoshie Aoki, Yoshie Hayashi, Hideki Kawahara, Roy D. Patterson (2007), Discrimination and recognition of scaled word sounds, Interspeech
László Tóth (2007), Benchmarking human performance on the acoustic and linguistic subtasks of ASR systems, Interspeech
Lin Yang, Jianping Zhang, Yonghong Yan (2007), Contributions of temporal fine structure cues to Chinese speech recognition in cochlear implant simulation, Interspeech
Xihong Wu, Jing Chen, Zhigang Yang, Qiang Huang, Mengyuan Wang, Liang Li (2007), Effect of number of masking talkers on speech-on-speech masking in Chinese, Interspeech
Odile Bagou, Sophie Dufour, Cécile Fougeron, Alain Content, Ulrich Hans Frauenfelder (2007), Do different boundary types induce subtle acoustic cues to which French listeners are sensitive?, Interspeech
Svante Stadler, Arne Leijon, Björn Hagerman (2007), An information theoretic approach to predict speech intelligibility for listeners with normal and impaired hearing, Interspeech
Travis Wade, Bernd Möbius (2007), Speaking rate effects in a landmark-based phonetic exemplar model, Interspeech
Kazumi Maniwa, Allard Jongman, Travis Wade (2007), Acoustic correlates of intelligibility enhancements in clearly produced fricatives, Interspeech
Tim Jürgens, Thomas Brand, Birger Kollmeier (2007), Modelling the human-machine gap in speech reception: microscopic speech intelligibility prediction for normal-hearing subjects with an auditory model, Interspeech
Ayako Ikeno, John H. L. Hansen (2007), Lombard speech impact on perceptual speaker recognition, Interspeech
Huiwen Goy, Kathleen Pichora-Fuller, Pascal van Lieshout, Gurjit Singh, Bruce Schneider (2007), Effect of within- and between-talker variability on word identification in noise by younger and older adults, Interspeech
H. Timothy Bunnell, N. Carolyn Schanen, Linda D. Vallino, Thierry G. Morlet, James B. Polikoff, Jennette D. Driscoll, James T. Mantell (2007), Speech perception in children with speech sound disorder, Interspeech
Huan Wang, Werner Hemmert (2007), Speech coding and information processing by auditory neurons, Interspeech
Annie C. Gilbert, Victor J. Boucher (2007), What do listeners attend to in hearing prosodic structures? investigating the human speech-parser using short-term recall, Interspeech
Douglas S. Brungart, Nandini Iyer (2007), Time-compressed speech perception with speech and noise maskers, Interspeech
Anne Cutler, Martin Cooke, Maria Luisa Garcia Lecumberri, Dennis Pasveer (2007), L2 consonant identification in noise: cross-language comparisons, Interspeech
Jennifer T. Le, Catherine T. Best, Michael D. Tyler, Christian Kroos (2007), Effects of non-native dialects on spoken word recognition, Interspeech
Julien Meyer, Fanny Meunier, Laure Dentel (2007), Identification of natural whistled vowels by non-whistlers, Interspeech
Alexandra Jesse, James M. McQueen (2007), Prelexical adjustments to speaker idiosyncrasies: are they position-specific?, Interspeech
Holger Mitterer (2007), Top-down effects on compensation for coarticulation are not replicable, Interspeech
Yosuke Igarashi (2007), Pitch pattern alternation in Goshogawara Japanese: evidence for a prosodic phrase above the domain for downstep, Interspeech
Irina Nesterenko, Pavel Skrelin (2007), Some evidence on the phonetics and phonology of prosodic phrasing in Russian, Interspeech
Jan Volín, Radek Skarnitzl (2007), Temporal downtrends in Czech read speech, Interspeech
Hyongsil Cho, Daniel Hirst (2007), Empirical evidence for prosodic phrasing: pauses as linguistic annotation in Korean read speech, Interspeech
Markus Dreyer, Izhak Shafran (2007), Exploiting prosody for PCFGs with latent annotations, Interspeech
Qin Shi, DanNing Jiang, FanPing Meng, Yong Qin (2007), Combining length distribution model with decision tree in prosodic phrase prediction, Interspeech
Li-chiung Yang (2007), Duration and pauses as boundary-markers in speech: a cross-linguistic study, Interspeech
Jian Yu, Lixing Huang, Jianhua Tao, Xia Wang (2007), Modeling incompletion phenomenon in Mandarin dialog prosody, Interspeech
Anne Tamm, Kálmán Abari, Gábor Olaszy (2007), Accent assignment algorithm in Hungarian, based on syntactic analysis, Interspeech
Cheng-Yuan Lin, Pei-Chi Jao, J.-S. Roger Jang (2007), An effective initial/final duration prediction method for corpus-based singing voice synthesis of Mandarin Chinese, Interspeech
Géza Németh, Márk Fék, Tamás Gábor Csapó (2007), Increasing prosodic variability of text-to-speech synthesizers, Interspeech
Damien Lolive, Nelly Barbot, Olivier Boeffard (2007), Unsupervised HMM classification of F0 curves, Interspeech
Ian Read, Stephen Cox (2007), Automatic pitch accent prediction for text-to-speech synthesis, Interspeech
Xinqiang Ni, Yining Chen, Frank K. Soong, Min Chu, Ping Zhang (2007), An unsupervised approach to automatic prosodic annotation, Interspeech
Zeynep Inanoglu, Steve Young (2007), A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality, Interspeech
Chen-Yu Chiang, Hsiu-Min Yu, Yih-Ru Wang, Sin-Horng Chen (2007), An automatic prosody labeling method for Mandarin speech, Interspeech
Keikichi Hirose, Keiko Ochi, Nobuaki Minematsu (2007), Corpus-based generation of prosodic features from text based on generation process model, Interspeech
Jilei Tian, Jani Nurminen, Imre Kiss (2007), Novel eigenpitch-based prosody model for text-to-speech synthesis, Interspeech
Volker Strom, Ani Nenkova, Robert Clark, Yolanda Vazquez-Alvarez, Jason Brenier, Simon King, Dan Jurafsky (2007), Modelling prominence and emphasis improves unit-selection synthesis, Interspeech
Seiya Takada, Yuji Yagi, Keikichi Hirose, Nobuaki Minematsu (2007), A framework of reply speech generation for concept-to-speech conversion in spoken dialogue systems, Interspeech
Thorsten Stocksmeier, Stefan Kopp, Dafydd Gibbon (2007), Synthesis of prosodic attitudinal variants in German backchannel ja, Interspeech
Ke Li, Yoko Greenberg, Yoshinori Sagisaka (2007), Inter-language prosodic style modification experiment using word impression vector for communicative speech generation, Interspeech
Koby Crammer (2007), A conservative aggressive subspace tracker, Interspeech
Mattias Nilsson, W. Bastiaan Kleijn (2007), Mutual information and the speech signal, Interspeech
Tony Ezzat, Jake Bouvrie, Tomaso Poggio (2007), Spectro-temporal analysis of speech using 2-D Gabor filters, Interspeech
Tomas Dekens, Mike Demol, Werner Verhelst, Piet Verhoeve (2007), A comparative study of speech rate estimation techniques, Interspeech
Tiago H. Falk, Hua Yuan, Wai-Yip Chan (2007), Spectro-temporal processing for blind estimation of reverberation time and single-ended quality measurement of reverberant speech, Interspeech
Toon van Waterschoot, Marc Moonen (2007), Linear prediction of audio signals, Interspeech
Carlo Magi, Tom Bäckström, Paavo Alku (2007), Stabilised weighted linear prediction - a robust all-pole method for speech processing, Interspeech
Daniel Rudoy, Daniel N. Spendley, Patrick J. Wolfe (2007), Conditionally linear Gaussian models for estimating vocal tract resonances, Interspeech
Karl Schnell, Arild Lacroix (2007), Time-varying pre-emphasis and inverse filtering of speech, Interspeech
Joachim Thiemann, Peter Kabal (2007), Reconstructing audio signals from modified non-coherent Hilbert envelopes, Interspeech
Binh Phu Nguyen, Masato Akagi (2007), A flexible spectral modification method based on temporal decomposition and Gaussian mixture model, Interspeech
Jonathan Darch, Ben Milner (2007), A comparison of estimated and MAP-predicted formants and fundamental frequencies with a speech reconstruction application, Interspeech
Huiqun Deng, Douglas O'Shaughnessy (2007), Effect of incomplete glottal closures on estimates of glottal waves via inverse filtering of vowel sounds, Interspeech
Kaustubh Kalgaonkar, Mark A. Clements (2007), Vocal tract and area function estimation with both lip and glottal losses, Interspeech
S. Guruprasad, B. Yegnanarayana, K. Sri Rama Murty (2007), Detection of instants of glottal closure using characteristics of excitation source, Interspeech
Nicolas Sturmel, Christophe D'Alessandro, Boris Doval (2007), A comparative evaluation of the zeros of z transform representation for voice source estimation, Interspeech
Aki Härmä (2007), Ambient telephony: scenarios and research challenges, Interspeech
Yasunari Obuchi, Akio Amano (2007), Always listening to you: creating exhaustive audio database in home environments, Interspeech
Joerg Schmalenstroeer, Reinhold Haeb-Umbach (2007), Joint speaker segmentation, localization and identification for streaming audio, Interspeech
Yan-Chen Lu, Martin Cooke, Heidi Christensen (2007), Active binaural distance estimation for dynamic sources, Interspeech
Bengt J. Borgström, Abeer Alwan (2007), A packetization and variable bitrate interframe compression scheme for vector quantizer-based distributed speech recognition, Interspeech
Matthias Wölfel (2007), Channel selection by class separability measures for automatic transcriptions on distant microphones, Interspeech
Danny Wyatt, Tanzeem Choudhury, Jeff Bilmes (2007), Conversation detection and speaker segmentation in privacy-sensitive situated speech data, Interspeech
Alberto Abad, Carlos Segura, Climent Nadeu, Javier Hernando (2007), Audio-based approaches to head orientation estimation in a smart-room, Interspeech
Valentin Ion, Reinhold Haeb-Umbach (2007), Multi-resolution soft features for channel-robust distributed speech recognition, Interspeech
Yi Su, Frederick Jelinek, Sanjeev Khudanpur (2007), Large-scale random forest language models for speech recognition, Interspeech
Yuya Akita, Yusuke Nemoto, Tatsuya Kawahara (2007), PLSA-based topic detection in meetings for adaptation of lexicon and language model, Interspeech
Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki (2007), Language modeling using PLSA-based topic HMM, Interspeech
Yi-cheng Pan, Lin-shan Lee (2007), Lexicon adaptation with reduced character error (LARCE) - a new direction in Chinese language modeling, Interspeech
Meng-Sung Wu, Jen-Tzung Chien (2007), Minimum rank error training for language modeling, Interspeech
Wen Wang, Andreas Stolcke (2007), Integrating MAP, marginals, and unsupervised language model adaptation, Interspeech
Hiroki Yamazaki, Koji Iwano, Koichi Shinoda, Sadaoki Furui, Haruo Yokota (2007), Dynamic language model adaptation using presentation slides for lecture speech recognition, Interspeech
Cosmin Munteanu, Gerald Penn, Ron Baecker (2007), Web-based language modelling for automatic lecture transcription, Interspeech
Tanel Alumäe, Toomas Kirt (2007), LSA-based language model adaptation for highly inflected languages, Interspeech
Aaron Heidel, Hung-an Chang, Lin-shan Lee (2007), Language model adaptation using latent Dirichlet allocation and an efficient topic inference algorithm, Interspeech
Sibel Yaman, Jen-Tzung Chien, Chin-Hui Lee (2007), Structural Bayesian language modeling and adaptation, Interspeech
Ciro Martins, António J. S. Teixeira, João Neto (2007), Vocabulary selection for a broadcast news transcription system using a morpho-syntactic approach, Interspeech
Nguyen Bach, Mohamed Noamany, Ian Lane, Tanja Schultz (2007), Handling OOV words in Arabic ASR via flexible morphological constraints, Interspeech
Raquel Justo, M. Inés Torres (2007), Phrases in category-based language models for Spanish and Basque ASR, Interspeech
Ebru Arısoy, Haşim Sak, Murat Saraçlar (2007), Language modeling for automatic Turkish broadcast news transcription, Interspeech
Sasha Calhoun (2007), Predicting focus through prominence structure, Interspeech
Murtaza Bulut, Sungbok Lee, Shrikanth S. Narayanan (2007), Analysis of emotional speech prosody in terms of part of speech tags, Interspeech
Fang Liu, Yi Xu (2007), The neutral tone in question intonation in Mandarin, Interspeech
Amélie Rochet-Capellan, Jean-Luc Schwartz, Rafael Laboissière, Arturo Galvàn (2007), Pointing to a target while naming it with /pata/ or /tapa/: the effect of consonants and stress position on jaw-finger coordination, Interspeech
Øydis Hide, Steven Gillis, Paul Govaerts (2007), Suprasegmental aspects of pre-lexical speech in cochlear implanted children, Interspeech
Oliver Niebuhr (2007), Categorical perception in intonation: a matter of signal dynamics?, Interspeech
Noureddine Aboutabit, Denis Beautemps, Jeanne Clarke, Laurent Besacier (2007), A HMM recognition of consonant-vowel syllables from lip contours: the cued speech case, Interspeech
Patrick Lucey, Gerasimos Potamianos, Sridha Sridharan (2007), A unified approach to multi-pose audio-visual ASR, Interspeech
Rowan Seymour, Darryl Stewart, Ji Ming (2007), Audio-visual integration for robust speech recognition using maximum weighted stream posteriors, Interspeech
Thomas Hueber, Gérard Chollet, Bruce Denby, Gérard Dreyfus, Maureen Stone (2007), Continuous-speech phone recognition from ultrasound and optical images of the tongue and lips, Interspeech
Bo Zhu, Timothy J. Hazen, James Glass (2007), Multimodal speech recognition with ultrasonic sensors, Interspeech
David Dean, Patrick Lucey, Sridha Sridharan, Tim Wark (2007), Fused HMM-adaptation of multi-stream HMMs for audio-visual speech recognition, Interspeech
Carlos T. Ishi, Hiroshi Ishiguro, Norihiro Hagita (2007), Analysis of head motions and speech in spoken dialogue, Interspeech
Lars Bo Larsen, Kasper L. Jensen, Søren Larsen, Morten Rasmussen (2007), A paradigm for mobile speech-centric services, Interspeech
Pavel Campr, Marek Hrúz, Miloš Železný (2007), Design and recording of Czech sign language corpus for automatic sign language recognition, Interspeech
Jens Edlund, Jonas Beskow (2007), Pushy versus meek - using avatars to influence turn-taking behaviour, Interspeech
Michael Wand, Szu-Chen Stan Jou, Tanja Schultz (2007), Wavelet-based front-end for electromyographic speech recognition, Interspeech
Gaëlle Ferré, Roxane Bertrand, Philippe Blache, Robert Espesser, Stéphane Rauzy (2007), Intensive gestures in French and their multimodal correlates, Interspeech
Slim Ouni, Kais Ouni (2007), Aspects of visual speech in Arabic, Interspeech
Denis Burnham, Jessica Reynolds, Guillaume Vignali, Sandra Bollwerk, Caroline Jones (2007), Rigid vs non-rigid face and head motion in phone and tone perception, Interspeech
Hedvig Kjellström, Olov Engwall, Sherif Mahdy Abdou, Olle Bälter (2007), Audio-visual phoneme classification for pronunciation training applications, Interspeech
Katja Grauwinkel, Britta Dewitt, Sascha Fagel (2007), Visual information and redundancy conveyed by internal articulator dynamics in synthetic audiovisual speech, Interspeech
Wei Zhou, Zengfu Wang (2007), A speech rate related lip movement model for speech animation, Interspeech
Guanyong Wu, Jie Zhu (2007), An extension 2DPCA based visual feature extraction method for audio-visual speech recognition, Interspeech
Soo-jong Lee, Jun Park, Eung-kyeu Kim (2007), Preventing an external acoustic noise from being misrecognized as a speech recognition object by confirming the lip movement image signal, Interspeech
Gregor Hofer, Hiroshi Shimodaira (2007), Automatic head motion prediction from speech data, Interspeech
Yuki Denda, Takanobu Nishiura, Yoichi Yamashita (2007), Omnidirectional audio-visual talker localizer with dynamic feature fusion based on validity and reliability criteria, Interspeech
Nick Campbell, Damien Douxchamps (2007), Processing image and audio information for recognising discourse participation status through features of face and voice, Interspeech
Kamil K. Wójcicki, Stephen So, Kuldip K. Paliwal (2007), The effect of the additivity assumption on time and frequency domain Wiener filtering for speech enhancement, Interspeech
Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki (2007), Noise reduction based on adaptive β-order generalized spectral subtraction for speech enhancement, Interspeech
Amit Das, John H. L. Hansen (2007), Class constrained ROVER based speech enhancement, Interspeech
Erhan Deger, Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu, Md. Kamrul Hasan (2007), EMD based soft-thresholding for speech enhancement, Interspeech
Adam Borowicz, Alexander Petrovsky (2007), An approximate solution for perceptually constrained signal subspace speech enhancement method, Interspeech
Tim Fingscheidt, Suhadi Suhadi (2007), Quality assessment of speech enhancement systems by separation of enhanced speech, noise, and echo, Interspeech
Anis Ben Aicha, Sofia Ben Jebara (2007), Perceptual musical noise reduction using critical bands tonality coefficients and masking thresholds, Interspeech
Dirk Mauler, Anil M. Nagathil, Rainer Martin (2007), On optimal estimation of compressed speech for hearing aids, Interspeech
Richard C. Hendriks, Jesper Jensen, Richard Heusdens (2007), DFT domain subspace based noise tracking for speech enhancement, Interspeech
Nitish Krishnamurthy, John H. L. Hansen (2007), Noise tracking for speech systems in adverse environments, Interspeech
Abderrahman Essebbar, Tristan Poinsard (2007), Speech enhancement using multi-reference noise reduction in a vehicle environment, Interspeech
Ernst Warsitz, Reinhold Haeb-Umbach, Dang Hai Tran Vu (2007), Blind adaptive principal eigenvector beamforming for acoustical source separation, Interspeech
Zbyněk Koldovský, Petr Tichavský (2007), Time-domain blind audio source separation using advanced ICA methods, Interspeech
S. W. Lee, Frank K. Soong, P. C. Ching (2007), Model-based speech separation with single-microphone input, Interspeech
Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Masato Miyoshi (2007), Multi-step linear prediction based speech dereverberation in noisy reverberant environment, Interspeech
Seung Yeol Lee, Jong Won Shin, Hwan Sik Yun, Nam Soo Kim (2007), A statistical model based post-filtering algorithm for residual echo suppression, Interspeech
Xiaoshan Huang, Xiaoqun Zhao (2007), An optimal speech enhancement under speech uncertainty probability and masking property of auditory system, Interspeech
Viktoria Maier, Roger K. Moore (2007), Temporal episodic memory model: an evolution of Minerva2, Interspeech
Gianpaolo Coro, Francesco Cutugno, Fulvio Caropreso (2007), Speech recognition with factorial-HMM syllabic acoustic models, Interspeech
Mathias De Wachter, Kris Demuynck, Patrick Wambacq, Dirk Van Compernolle (2007), Evaluating acoustic distance measures for template based recognition, Interspeech
Yan Han, Lou Boves (2007), Hierarchical acoustic modeling based on random-effects regression for automatic speech recognition, Interspeech
Annika Hämäläinen, Louis ten Bosch, Lou Boves (2007), Construction and analysis of multiple paths in syllable models, Interspeech
Carol Y. Espy-Wilson, Tarun Pruthi, Amit Juneja, Om Deshmukh (2007), Landmark-based approach to speech recognition: an alternative to HMMs, Interspeech
Satoshi Asakawa, Nobuaki Minematsu, Keikichi Hirose (2007), Automatic recognition of connected vowels only using speaker-invariant representation of speech dynamics, Interspeech
Roberto Togneri, Li Deng (2007), A structured speech model parameterized by recursive dynamics and neural networks, Interspeech
Li Deng, Helmer Strik (2007), Structure-based and template-based automatic speech recognition - comparing parametric and non-parametric approaches, Interspeech
David Grangier, Samy Bengio (2007), Learning the inter-frame distance for discriminative template-based keyword detection, Interspeech
Dong Yu, Li Deng, Alex Acero (2007), Handling phonetic context and speaker variation in a structure-based speech recognizer, Interspeech
Maarten Van Segbroeck, Hugo Van hamme (2007), Vector-quantization based mask estimation for missing data automatic speech recognition, Interspeech
Sébastien Demange, Christophe Cerisara, Jean-Paul Haton (2007), Accurate marginalization range for missing data recognition, Interspeech
Marco Kühne, Roberto Togneri, Sven Nordholm (2007), Smooth soft mel-spectrographic masks based on blind sparse source separation, Interspeech
Jonathan Laidler, Martin Cooke, Neil D. Lawrence (2007), Model-driven detection of clean speech patches in noise, Interspeech
Richard M. Stern, Evandro B. Gouvêa, Govindarajan Thattai (2007), Polyaural array processing for automatic speech recognition in degraded environments, Interspeech
Nicolás Morales, Liang Gu, Yuqing Gao (2007), Adding noise to improve noise robustness in speech recognition, Interspeech
Eric Fosler-Lussier, Laura Dilley, Na'im Tyson, Mark Pitt (2007), The Buckeye corpus of speech: updates and enhancements, Interspeech
N. Barroso, A. Ezeiza, N. Gilisagasti, K. López de Ipiña, A. López, J. M. López (2007), Development of multimodal resources for multilingual information retrieval in the Basque context, Interspeech
Reva Schwartz, Wade Shen, Joseph Campbell, Shelley Paget, Julie Vonwiller, Dominique Estival, Christopher Cieri (2007), Construction of a phonotactic dialect corpus using semiautomatic annotation, Interspeech
Slim Abdennadher, Mohamed Aly, Dirk Bühler, Wolfgang Minker, Johannes Pittermann (2007), BECAM tool - a semi-automatic tool for bootstrapping emotion corpus annotation and management, Interspeech
Christopher Cieri, Linda Corson, David Graff, Kevin Walker (2007), Resources for new research directions in speaker recognition: the Mixer 3, 4 and 5 corpora, Interspeech
Peter A. Heeman, Andy McMillin, J. Scott Yaruss (2007), Intercoder reliability in annotating complex disfluencies, Interspeech
M. H. Radfar, R. M. Dansereau (2007), Single channel speech separation using maximum a posteriori estimation, Interspeech
Suhadi Suhadi, Tim Fingscheidt (2007), Speech enhancement with improved a posteriori SNR computation, Interspeech
Thang Vu Tat, Germine Seide, Masashi Unoki, Masato Akagi (2007), Method of LP-based blind restoration for improving intelligibility of bone-conducted speech, Interspeech
Tiago H. Falk, Svante Stadler, W. Bastiaan Kleijn, Wai-Yip Chan (2007), Noise suppression based on extending a speech-dominated modulation band, Interspeech
Amin Haji Abolhassani, Sid-Ahmed Selouani, Douglas O'Shaughnessy, Mohamed-Faouzi Harkat (2007), Speech enhancement using PCA and variance of the reconstruction error model identification, Interspeech
Jong Won Shin, Woohyung Lim, Junesig Sung, Nam Soo Kim (2007), Speech reinforcement based on partial specific loudness, Interspeech
Tamara Rathcke, Jonathan Harrington (2007), The phonetics and phonology of high and low tones in two falling f0-contours in standard German, Interspeech
Tina John, Jonathan Harrington (2007), Temporal alignment of creaky voice in neutralised realisations of an underlying, post-nasal voicing contrast in German, Interspeech
Mike Demol, Werner Verhelst, Piet Verhoeve (2007), The duration of speech pauses in a multilingual environment, Interspeech
Dafydd Gibbon, Jolanta Bachan, Grażyna Demenko (2007), Syllable timing patterns in Polish: results from annotation mining, Interspeech
Constandinos Kalimeris, Stelios Bakamidis (2007), Minimal pairs and functional loads of sound contrasts obtained from a list of Modern Greek words, Interspeech
Daan Wissing (2007), More on acoustic correlates of stress, Interspeech
Cécile Woehrling, Philippe Boula de Mareüil (2007), Comparing Praat and Snack formant measurements on two large corpora of northern and southern French, Interspeech
William Barry, Bistra Andreeva, Ingmar Steiner (2007), The phonetic exponency of phrasal accentuation in French and German, Interspeech
Christiana Christodoulou (2007), Phonetic geminates in Cypriot Greek: the case of voiceless plosives, Interspeech
Darcie Williams, François Poiré (2007), Predicting vowel duration in spontaneous Canadian French speech, Interspeech
Ivan Chow, François Poiré (2007), Rhotic variation and schwa epenthesis in Windsor French, Interspeech
Audrey Bürki, Cécile Fougeron, Cédric Gendrot (2007), On the categorical nature of the process involved in schwa elision in French, Interspeech
Yue-Ning Hu, Min Chu, Chao Huang, Yan-Ning Zhang (2007), Exploring tonal variations via context-dependent tone models, Interspeech
Philippe Martin, Jun Li (2007), Acoustic analysis of the neutral tone in Mandarin, Interspeech
Rerrario Shui-Ching Ho, Yoshinori Sagisaka (2007), F0 analysis of perceptual distance among Cantonese level tones, Interspeech
Chang-wen Hsu, Lin-shan Lee (2007), Extended powered cepstral normalization (p-CN) with range equalization for robust features in speech recognition, Interspeech
Makoto Sakai, Norihide Kitaoka, Seiichi Nakagawa (2007), Selection of optimal dimensionality reduction method using Chernoff bound for segmental unit input HMM, Interspeech
Vivek Tyagi (2007), Fepstrum: an improved modulation spectrum for ASR, Interspeech
Dušan Macho (2007), Narrowband to wideband feature expansion for robust multilingual ASR, Interspeech
Weifeng Li, Hervé Bourlard (2007), Non-linear spectral contrast stretching for in-car speech recognition, Interspeech
Xiao-Bing Li, Douglas O'Shaughnessy (2007), Clustering-based two-dimensional linear discriminant analysis for speech recognition, Interspeech
Yotaro Kubo, Shigeki Okawa, Akira Kurematsu, Katsuhiko Shirai (2007), A study on temporal features derived by analytic signal, Interspeech
Stephen A. Zahorian, Tara Singh, Hongbing Hu (2007), Dimensionality reduction of speech features using nonlinear principal components analysis, Interspeech
D. R. Sanand, D. Dinesh Kumar, S. Umesh (2007), Linear transformation approach to VTLN using dynamic frequency warping, Interspeech
Vladimir Fabregas Surigué de Alencar, Abraham Alcaim (2007), Features interpolation domain for distributed speech recognition and performance for ITU-T G.723.1 codec, Interspeech
Shoei Sato, Kazuo Onoe, Akio Kobayashi, Shinich Homma, Toru Imai, Tohru Takagi, Tetsunori Kobayashi (2007), Dynamic integration of multiple feature streams for robust real-time LVCSR, Interspeech
Hironori Matsumasa, Tetsuya Takiguchi, Yasuo Ariki, Ichao Li, Toshitaka Nakabayashi (2007), PCA-based feature extraction for fluctuation in speaking style of articulation disorders, Interspeech
Fabio Valente, Jithendra Vepa, Hynek Hermansky (2007), Multi-stream features combination based on Dempster-Shafer rule for LVCSR system, Interspeech
Natasha Singh-Miller, Michael Collins, Timothy J. Hazen (2007), Dimensionality reduction for speech recognition using neighborhood components analysis, Interspeech
Dan Su, Xihong Wu, Huisheng Chi (2007), Probabilistic latent speaker analysis for large vocabulary speech recognition, Interspeech
S. R. Mahadeva Prasanna, Hynek Hermansky (2007), MRASTA and PLP in automatic speech recognition, Interspeech
Markus Brückl (2007), Women's vocal aging: a longitudinal approach, Interspeech
Laurence Cnockaert, Jean Schoentgen, Canan Ozsancak, Pascal Auzou, Francis Grenez (2007), Effect of intensive voice therapy on vocal tremor for Parkinson speakers, Interspeech
A. Alpan, A. Kacha, Francis Grenez, Jean Schoentgen (2007), Assessment of vocal dysperiodicities in connected disordered speech, Interspeech
Anne-Maria Laukkanen, Jaromír Horáček, Pavel Švancara, Elina Lehtinen (2007), Effects of FE modelled consequences of tonsillectomy on perceptual evaluation of voice, Interspeech
Irma M. Verdonck-de Leeuw, Louis ten Bosch, Li Ying Chao, Rico N. P. M. Rinkel, Pepijn A. Borggreven, Lou Boves, C. René Leemans (2007), Speech quality after major surgery of the oral cavity and oropharynx with microvascular soft tissue reconstruction, Interspeech
Christel de Bruijn, Sandra Whiteside (2007), Voice fatigue and use of speech recognition: a study of voice quality ratings, Interspeech
Jean-François Bonastre, Corinne Fredouille, A. Ghio, A. Giovanni, G. Pouchoulin, J. Révis, B. Teston, P. Yu (2007), Complementary approaches for voice disorder assessment, Interspeech
G. Pouchoulin, Corinne Fredouille, Jean-François Bonastre, A. Ghio, A. Giovanni (2007), Frequency study for the characterization of the dysphonic voices, Interspeech
Victor J. Boucher (2007), Acoustic correlates of laryngeal-muscle fatigue: findings for a phonometric prevention of acquired voice pathologies, Interspeech
Andreas Maier, Maria Schuster, Anton Batliner, Elmar Nöth, Emeka Nkenke (2007), Automatic scoring of the intelligibility in patients with cancer of the oral cavity, Interspeech
Jacques Duchateau, Leen Cleuren, Hugo Van hamme, Pol Ghesquière (2007), Automatic assessment of children's reading level, Interspeech
Carlos Ferrer, María E. Hernández-Díaz, Eduardo González (2007), Using waveform matching techniques in the measurement of shimmer in voiced signals, Interspeech
R. Fraile, J. I. Godino-Llorente, N. Sáenz-Lechón, V. Osma-Ruiz, P. Gómez-Vilda (2007), Analysis of the impact of analogue telephone channel on MFCC parameters for voice pathology detection, Interspeech
C. Manfredi, L. Bocchi, G. Cantarella, G. Peretti, G. Guidi, V. Mezzatesta (2007), Objective parameters from videokymographic images: a user-friendly interface, Interspeech
David House (2007), Integrating audio and visual cues for speaker friendliness in multimodal speech synthesis, Interspeech
Wieneke Wesseling, R. J. J. H. van Son, Louis C. W. Pols (2007), The influence of masking words on the prediction of TRPs in a shadowed dialog, Interspeech
Kornel Laskowski, Susanne Burger (2007), Analysis of the occurrence of laughter in meetings, Interspeech
Pashiera Barkhuysen, Emiel Krahmer, Marc Swerts (2007), Incremental perception of acted and real emotional speech, Interspeech
David Schlangen, Raquel Fernández (2007), Speaking through a noisy channel - experiments on inducing clarification behaviour in human-human dialogue, Interspeech
Christophe D'Alessandro, Albert Rilliard, Sylvain Le Beux (2007), Computerized chironomy: evaluation of hand-controlled intonation reiteration, Interspeech
Ivan Habernal, Miloslav Konopík (2007), JAAE: the Java abstract annotation editor, Interspeech
Goshu Nagino, Makoto Shozakai, Kiyohiro Shikano (2007), How to judge reusability of existing speech corpora for target task by utilizing statistical multidimensional scaling, Interspeech
Peter Rutten (2007), Feasibility of constructing an expressive speech corpus from television soap opera dialogue, Interspeech
Rosemary Orr, Bernat González i Llinares, Françoise Petersen, Helge Hüttenrauch, Martin Böcker, Michael Tate (2007), Collection of empirical data for standardization of generic vocabularies in speech driven ICT devices and services, Interspeech
Antonio Marcos Selmini, Fábio Violaro (2007), Acoustic-phonetic features for refining the explicit speech segmentation, Interspeech
B. Lecouteux, Georges Linarès, Frédéric Beaugendre, Pascal Nocera (2007), Text island spotting in large speech databases, Interspeech
Tim Paek, Yun-Cheng Ju, Christopher Meek (2007), People watcher: a game for eliciting human-transcribed data for automated directory assistance, Interspeech
Andrew Kun, Tim Paek, Zeljko Medenica (2007), The effect of speech interface accuracy on driving performance, Interspeech
Hua Zhang, Lijuan Wang, Frank K. Soong, Wenju Liu (2007), Context constrained-generalized posterior probability for verifying phone transcriptions, Interspeech
Pongtep Angkititrakul, DongGu Kwak, SangJo Choi, JeongHee Kim, Anh PhucPhan, Amardeep Sathyanarayana, John H. L. Hansen (2007), Getting start with UTDrive: driver-behavior modeling and assessment of distraction for in-vehicle speech systems, Interspeech
BalaKrishna Kolluru, Yoshihiko Gotoh (2007), Relative evaluation of informativeness in machine generated summaries, Interspeech
Toshiyuki Takezawa, Masahide Mizushima, Tohru Shimizu, Genichiro Kikui (2007), A method for evaluating task-oriented spoken dialog translation systems based on communication efficiency, Interspeech
Charlotte van Hooijdonk, Edwin Commandeur, Reinier Cozijn, Emiel Krahmer, Erwin Marsi (2007), Using eye movements for online evaluation of speech synthesis, Interspeech
Jian Li, Dmitry Sityaev, Jie Hao (2007), Sentence level intelligibility evaluation for Mandarin text-to-speech systems using semantically unpredictable sentences, Interspeech
Judith Kessens, David A. van Leeuwen (2007), N-best: the Northern- and Southern-Dutch benchmark evaluation of speech recognition technology, Interspeech
Trym Holter, Svein Sørsdal (2007), A MAP based approach to adaptive speech intelligibility measurements, Interspeech
Sirinoot Boonsuk, Proadpran Punyabukkana, Atiwong Suchato (2007), Phone boundary detection using selective refinements and context-dependent acoustic features, Interspeech
Tien-Ping Tan, Laurent Besacier (2007), Modeling context and language variation for non-native speech recognition, Interspeech
Xufang Zhao, Douglas O'Shaughnessy (2007), An evaluation of cross-language adaptation and native speech training for rapid HMM construction based on very limited training data, Interspeech
Konstantin Markov, Satoshi Nakamura (2007), Never-ending learning with dynamic hidden Markov network, Interspeech
C. Breslin, M. J. F. Gales (2007), Building multiple complementary systems using directed decision trees, Interspeech
Hiroaki Nanjo, Yuichi Oku, Takehiko Yoshimi (2007), Automatic speech recognition framework for multilingual audio contents, Interspeech
G. Bouselmi, Dominique Fohr, I. Illina (2007), Combined acoustic and pronunciation modelling for non-native speech recognition, Interspeech
Tadashi Emori, Yoshifumi Onishi, Koichi Shinoda (2007), Automatic estimation of scaling factors among probabilistic models in speech recognition, Interspeech
Emilian Stoimenov, John McDonough (2007), Memory efficient modeling of polyphone context with weighted finite-state transducers, Interspeech
Valeriy Pylypenko (2007), Extra large vocabulary continuous speech recognition algorithm based on information retrieval, Interspeech
I. Lee Hetherington (2007), PocketSUMMIT: small-footprint continuous speech recognition, Interspeech
Tobias Cincarek, Izumi Shindo, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano (2007), Development of preschool children subsystem for ASR and Q&A in a real-environment speech-oriented guidance task, Interspeech
Chengyuan Ma, Chin-Hui Lee (2007), A study on word detector design and knowledge-based pruning and rescoring, Interspeech
Thomas Colthurst, Tresi Arvizo, Chia-Lin Kao, Owen Kimball, Stephen A. Lowe, David R. H. Miller, Jim Van Sciver (2007), Parameter tuning for fast speech recognition, Interspeech
Louis ten Bosch, Bert Cranen (2007), A computational model for unsupervised word discovery, Interspeech
Bernd T. Meyer, Matthias Wächter, Thomas Brand, Birger Kollmeier (2007), Phoneme confusions in human and automatic speech recognition, Interspeech
Kengo Ohta, Masatoshi Tsuchiya, Seiichi Nakagawa (2007), Construction of spoken language model including fillers using filler prediction model, Interspeech
Raghunandan Kumaran, Jeff Bilmes, Katrin Kirchhoff (2007), Attention shift decoding for conversational speech recognition, Interspeech
Péter Mihajlik, Tibor Fegyó, Zoltán Tüske, Pavel Ircing (2007), A morpho-graphemic approach for the recognition of spontaneous speech in agglutinative languages - like Hungarian, Interspeech
Mei Yang, Jing Zheng, Andreas Kathol (2007), A semi-supervised learning approach for morpheme segmentation for an Arabic dialect, Interspeech
Gerhard B. van Huyssteen, Martin J. Puttkammer (2007), Accelerating the annotation of lexical data for less-resourced languages, Interspeech
Christoph Draxler (2007), On web-based creation of speech resources for less-resourced languages, Interspeech
Miroslav Martinović, Srdjan Vesić, Goran Rakić (2007), Building an information retrieval system for Serbian - challenges and solutions, Interspeech
Guy De Pauw, Peter Waiganjo Wagacha (2007), Bootstrapping morphological analysis of Gĩkũyũ using unsupervised maximum entropy learning, Interspeech
Jerneja Žganec Gros, Stanislav Gruden (2007), The VoiceTRAN machine translation system, Interspeech
Sérgio Paulo, Luís C. Oliveira (2007), MuLAS: a framework for automatically building multi-tier corpora, Interspeech
Jacquelijn Ringersma, Marc Kemps-Snijders (2007), Creating multimedia dictionaries of endangered languages using LEXUS, Interspeech
Hrafn Loftsson, Eiríkur Rögnvaldsson (2007), IceNLP: a natural language processing toolkit for Icelandic, Interspeech
Marius Peche, Marelie Davel, Etienne Barnard (2007), Phonotactic spoken language identification with limited training data, Interspeech
Solomon Teferra Abate, Wolfgang Menzel (2007), Automatic speech recognition for an under-resourced language - Amharic, Interspeech
Abdillahi Nimaan, Pascal Nocera, Frédéric Béchet, Jean-François Bonastre (2007), Information retrieval strategies for accessing African audio corpora, Interspeech
Vesa Siivola, Mathias Creutz, Mikko Kurimo (2007), Morfessor and VariKN machine learning tools for speech and language technology, Interspeech
Markpong Jongtaveesataporn, Issara Thienlikit, Chai Wutiwiwatchai, Sadaoki Furui (2007), Towards better language modeling for Thai LVCSR, Interspeech
Christian Raymond, Giuseppe Riccardi (2007), Generative and discriminative algorithms for spoken language understanding, Interspeech
Elias Iosif, Alexandros Potamianos (2007), A soft-clustering algorithm for automatic induction of semantic classes, Interspeech
Agustín Gravano, Stefan Benus, Julia Hirschberg, Shira Mitchell, Ilia Vovsha (2007), Classification of discourse functions of affirmative words in spoken dialogue, Interspeech
Bogdan Minescu, Géraldine Damnati, Frédéric Béchet, Renato De Mori (2007), Conditional use of word lattices, confusion networks and 1-best string hypotheses in a sequential interpretation strategy, Interspeech
Jáchym Kolář, Yang Liu, Elizabeth Shriberg (2007), Speaker adaptation of language models for automatic dialog act segmentation of meetings, Interspeech
Amparo Albalate, Dimitar Dimitrov, Roberto Pieraccini (2007), Unsupervised categorisation approaches for technical support automated agents, Interspeech
Michael Wohlmayr, Marián Képesi (2007), Joint position-pitch extraction from multichannel audio, Interspeech
Hyun Soo Kim (2007), Morphological pre-processing technique and its applications on speech signal, Interspeech
Patricia A. Pelle, Claudio F. Estienne (2007), A pitch extraction system based on phase locked loops and consensus decision, Interspeech
Milan Legát, Jindřich Matoušek, Daniel Tihelka (2007), A robust multi-phase pitch-mark detection algorithm, Interspeech
Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu, Md. Kamrul Hasan (2007), Pitch estimation of noisy speech signals using empirical mode decomposition, Interspeech
Daniel Hirst, Hyongsil Cho, Sunhee Kim, Hyunji Yu (2007), Evaluating two versions of the Momel pitch modelling algorithm on a corpus of read speech in Korean, Interspeech
Hussein Hussein, Oliver Jokisch (2007), Hybrid electroglottograph and speech signal based algorithm for pitch marking, Interspeech
Jasha Droppo, Alex Acero (2007), A fine pitch model for speech, Interspeech
Prasanta Kumar Ghosh, Antonio Ortega, Shrikanth S. Narayanan (2007), Pitch period estimation using multipulse model and wavelet transform, Interspeech
Martin Heckmann, Frank Joublin, Christian Goerick (2007), Combining rate and place information for robust pitch extraction, Interspeech
Heidi Christensen, Ning Ma, Stuart N. Wrigley, Jon Barker (2007), Integrating pitch and localisation cues at a speech fragment level, Interspeech
Jean-Sylvain Liénard, François Signol, Claude Barras (2007), Speech fundamental frequency estimation using the alternate comb, Interspeech
Andrew Rosenberg, Julia Hirschberg (2007), Detecting pitch accent using pitch-corrected energy-based predictors, Interspeech
Saikat Chatterjee, T. V. Sreenivas (2007), Normalized two stage SVQ for minimum complexity wide-band LSF quantization, Interspeech
Peng Zhang, Chang-chun Bao (2007), A novel 2kb/s waveform interpolation speech coder based on non-negative matrix factorization, Interspeech
Ahmed Ismail, Yasser Dakroury, Hazem Abbas (2007), A novel energy distribution comparison approach for robust speech spectrum vector quantization, Interspeech
Ahmed Ismail, Yasser Dakroury, Hazem Abbas (2007), Novel low-band phase representation for low bit-rate speech coding, Interspeech
Chun-Feng Wu, Cheng-Lung Lee, Wen-Whei Chang (2007), Perceptual-based playout mechanisms for multi-stream voice over IP networks, Interspeech
Robert Zopf, Jes Thyssen, Juin-Hwey Chen (2007), Time-warping and re-phasing in packet loss concealment, Interspeech
Yannis Agiomyrgiannakis, Yannis Stylianou (2007), The harmonic model codec (HMC) framework for VoIP, Interspeech
Yannis Agiomyrgiannakis, Yannis Stylianou (2007), Bit-erasure channel decoding for GMM-based multiple description coding, Interspeech
Hua Yuan, Tiago H. Falk, Wai-Yip Chan (2007), Degradation-classification assisted single-ended quality measurement of speech, Interspeech
Alexander Raake, Sascha Spors, Jens Ahrens, Jitendra Ajmera (2007), Concept and evaluation of a downward-compatible system for spatial teleconferencing using automatic speaker clustering, Interspeech
Min-Ki Lee, Kyung-Tae Kim, Hong-Goo Kang, Dae Hee Youn (2007), Speech quality estimation using packet loss effects in CELP-type speech coders, Interspeech
Masahiro Oshikiri, Hiroyuki Ehara, Toshiyuki Morii, Tomofumi Yamanashi, Kaoru Satoh, Koji Yoshida (2007), An 8-32 kbit/s scalable wideband coder extended with MDCT-based bandwidth extension on top of a 6.8 kbit/s narrowband CELP coder, Interspeech
Robert Wielgat, Tomasz P. Zieliński, Paweł Świętojański, Piotr Żołądź, Daniel Król, Tomasz Woźniak, Stanisław Grabias (2007), Comparison of HMM and DTW methods in automatic recognition of pathological phoneme pronunciation, Interspeech
K. Yu, M. J. F. Gales, P. C. Woodland (2007), Unsupervised training with directed manual transcription for recognising Mandarin broadcast audio, Interspeech
Hao Wu, Xihong Wu (2007), Context dependent syllable acoustic model for continuous Chinese speech recognition, Interspeech
Dimitris Oikonomidis, Vassilis Diakoloukas, Vassilis Digalakis (2007), A sub-optimal Viterbi-like search for linear dynamic models classification, Interspeech
Georg Heigold, Ralf Schlüter, Hermann Ney (2007), On the equivalence of Gaussian HMM and Gaussian HMM-like hidden conditional random fields, Interspeech
Stefano Scanzio, Pietro Laface, Roberto Gemello, Franco Mana (2007), Speeding-up neural network training using sentence and frame selection, Interspeech
Linquan Liu, Thomas Fang Zheng, Makoto Akabane, Ruxin Chen, Wenhu Wu (2007), Using a small development set to build a robust dialectal Chinese speech recognizer, Interspeech
Carlos Molina, Nestor Becerra Yoma, Fernando Huenupán, Claudio Garreton (2007), Unsupervised re-scoring of observation probability in Viterbi based on reinforcement learning by using confidence measure and HMM neighborhood, Interspeech
Shiuan-Sung Lin, François Yvon (2007), Optimization on decoding graphs by discriminative training, Interspeech
Stéphane Huet, Guillaume Gravier, Pascale Sébillot (2007), Morphosyntactic processing of n-best lists for improved recognition and confidence measure computation, Interspeech
Xiang Li, Juan M. Huerta (2007), How predictable is ASR confidence in dialog applications?, Interspeech
Alexandre Allauzen (2007), Error detection in confusion network, Interspeech
Takanobu Oba, Takaaki Hori, Atsushi Nakamura (2007), An approach to efficient generation of high-accuracy and compact error-corrective models for speech recognition, Interspeech
Hamed Ketabdar, Mirko Hannemann, Hynek Hermansky (2007), Detection of out-of-vocabulary words in posterior based ASR, Interspeech
Daniela Braga, Luís Coelho, Fernando Gil V. Resende (2007), Homograph ambiguity resolution in front-end design for Portuguese TTS systems, Interspeech
Ghinwa F. Choueiter, Stephanie Seneff, James Glass (2007), New word acquisition using subword modeling, Interspeech
Samuel Thomas, Ashish Verma (2007), Language identification of person names using CF-IOF based weighing function, Interspeech
Henk van den Heuvel, Jean-Pierre Martens, Nanneke Konings (2007), G2P conversion of names: what can we do (better)?, Interspeech
Ausdang Thangthai, Chai Wutiwiwatchai, Anocha Ragchatjaroen, Sittipong Saychum (2007), A learning method for Thai phonetization of English words, Interspeech
Steffen Werner, Rüdiger Hoffmann (2007), Spontaneous speech synthesis by pronunciation variant selection - a comparison to natural speech, Interspeech
Nikos Tsourakis, Vassilis Digalakis (2007), A generic methodology of converting transliterated text to phonetic strings - case study: Greeklish, Interspeech
Rita Singh, Evandro B. Gouvêa, Bhiksha Raj (2007), Probabilistic deduction of symbol mappings for extension of lexicons, Interspeech
Sergey Astrov, Joachim Hofer, Harald Höge (2007), Use of syllable center detection for improved duration modeling in Chinese Mandarin connected digits recognition, Interspeech
Thomas Pellegrini, Lori Lamel (2007), Using phonetic features in unsupervised word decompounding for ASR with application to a less-represented language, Interspeech
Sheng Qiang, Yao Qian, Frank K. Soong, Congfu Xu (2007), Robust F0 modeling for Mandarin speech recognition in noise, Interspeech
Dino Seppi, Daniele Falavigna, Georg Stemmer, Roberto Gretter (2007), Word duration modeling for word graph rescoring in LVCSR, Interspeech
Fabio Tamburini, Petra Wagner (2007), On automatic prominence detection for German, Interspeech
Sankaranarayanan Ananthakrishnan, Shrikanth S. Narayanan (2007), Prosody-enriched lattices for improved syllable recognition, Interspeech
Joel Pinto, Andrew Lovitt, Hynek Hermansky (2007), Exploiting phoneme similarities in hybrid HMM-ANN keyword spotting, Interspeech
C. E. Liu, K. Thambiratnam, F. Seide (2007), Online vocabulary adaptation using limited adaptation data, Interspeech
Chin-Hui Lee, Mark A. Clements, Sorin Dusan, Eric Fosler-Lussier, Keith Johnson, Biing-Hwang Juang, Lawrence R. Rabiner (2007), An overview on automatic speech attribute transcription (ASAT), Interspeech
Ilana Bromberg, Qian Qian, Jun Hou, Jinyu Li, Chengyuan Ma, Brett Matthews, Antonio Moreno-Daniel, Jeremy Morris, Sabato Marco Siniscalchi, Yu Tsao, Yu Wang (2007), Detection-based ASR in the automatic speech attribute transcription project, Interspeech
Chi-Yueh Lin, Hsiao-Chuan Wang (2007), Attribute-based Mandarin speech recognition using conditional random fields, Interspeech
Helmer Strik, Khiet P. Truong, Febe de Wet, Catia Cucchiarini (2007), Comparing classifiers for pronunciation error detection, Interspeech
Jarek Krajewski, Bernd Kröger (2007), Using prosodic and spectral characteristics for sleepiness detection, Interspeech
Brian M. Ore, Raymond E. Slyh (2007), Score fusion for articulatory feature detection, Interspeech
Scott Otterson (2007), Improved location features for meeting speaker diarization, Interspeech
Kyu J. Han, Shrikanth S. Narayanan (2007), A robust stopping criterion for agglomerative hierarchical clustering in a speaker diarization system, Interspeech
Marijn Huijbregts, Chuck Wooters (2007), The blame game: performance analysis of speaker diarization system components, Interspeech
Hagai Aronowitz (2007), Trainable speaker diarization, Interspeech
Jing Huang, Etienne Marcheret, Karthik Visweswariah (2007), Improving speaker diarization for CHIL lecture meetings, Interspeech
Viet-Bac Le, Odile Mella, Dominique Fohr (2007), Speaker diarization using normalized cross likelihood ratio, Interspeech
Wai-Sum Lee (2007), Tone production by the speakers of different age-and-gender groups, Interspeech
Nan Xu, Denis Burnham, Christine Kitamura (2007), Vowels and tones in infant directed speech: hyperarticulation for both, but different developmental patterns, Interspeech
Eon-Suk Ko (2007), Acquisition of vowel duration in children speaking American English, Interspeech
Hiroko Hirano, Keikichi Hirose, Goh Kawai, Wentao Gu, Nobuaki Minematsu (2007), F0 models show Chinese speakers of Japanese insert intonational boundaries and drop pitch, Interspeech
Paola Escudero, Jelle Kastelein, Klara Weiand, R. J. J. H. van Son (2007), Formal modelling of L1 and L2 perceptual learning: computational linguistics versus machine learning, Interspeech
Mirjam Broersma (2007), Kettle hinders cat, shadow does not hinder shed: activation of ‘almost embedded’ words in nonnative listening, Interspeech
Sacha Krstulović, Anna Hunecke, Marc Schröder (2007), An HMM-based speech synthesis system applied to German and its adaptation to a limited set of expressive football announcements, Interspeech
Liang Gu, Wei Zhang, Lazkin Tahir, Yuqing Gao (2007), Statistical vowelization of Arabic text for speech synthesis in speech-to-speech translation systems, Interspeech
Wu Liu, Dezhi Huang, Yuan Dong, Xinnian Mao, Haila Wang (2007), A pair-based language model for the robust lexical analysis in Chinese text-to-speech synthesis, Interspeech
R. Maia, Tomoki Toda, Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda (2007), A trainable excitation model for HMM-based speech synthesis, Interspeech
Jochen Steigner, Marc Schröder (2007), Cross-language phonemisation in German text-to-speech synthesis, Interspeech
Ryuki Tachibana, Tohru Nagano, Gakuto Kurata, Masafumi Nishimura, Noboru Babaguchi (2007), Preliminary experiments toward automatic generation of new TTS voices from recorded speech alone, Interspeech
Suphattharachai Chomphan, Takao Kobayashi (2007), Implementation and evaluation of an HMM-based Thai speech synthesis system, Interspeech
Davide Bonardo, Enrico Zovato (2007), Speech synthesis enhancement in noisy environments, Interspeech
Helmut Schmid, Bernd Möbius, Julia Weidenkaff (2007), Tagging syllable boundaries with joint n-gram models, Interspeech
Jun Xu, Dezhi Huang, Yongxin Wang, Yuan Dong, Lianhong Cai, Haila Wang (2007), Hierarchical non-uniform unit selection based on prosodic structure, Interspeech
Peter Birkholz (2007), Control of an articulatory speech synthesizer based on dynamic approximation of spatial articulatory targets, Interspeech
Nobuyuki Nishizawa, Hisashi Kawai (2007), A preselection method based on cost degradation from the optimal sequence for concatenative speech synthesis, Interspeech
Guntram Strecha, Matthias Eichner, Rüdiger Hoffmann (2007), Line cepstral quefrencies and their use for acoustic inventory coding, Interspeech
Peter Cahill, Daniel Aioanei, Julie Carson-Berndsen (2007), Articulatory acoustic feature applications in speech synthesis, Interspeech
Aleksandra Krul, Géraldine Damnati, François Yvon, Cédric Boidin, Thierry Moudenc (2007), Approaches for adaptive database reduction for text-to-speech synthesis, Interspeech
Richard Tzong-Han Tsai, Hsi-Chuan Hung, Hong-Jie Dai, Wen-Lian Hsu (2007), Exploiting unlabeled internal data in conditional random fields to reduce word segmentation errors for Chinese texts, Interspeech
Barry Kirkpatrick, Darragh O'Brien, Ronán Scaife, Andrew Errity (2007), On the role of spectral dynamics in unit selection speech synthesis, Interspeech
Brian Langner, Alan W. Black (2007), uGloss: a framework for improving spoken language generation understandability, Interspeech
Karl Schnell, Arild Lacroix (2007), Combination of LSF and pole based parameter interpolation for model-based diphone concatenation, Interspeech
Kishore Prahallad, Arthur R. Toth, Alan W. Black (2007), Automatic building of synthetic voices from large multi-paragraph speech databases, Interspeech
A. Gallardo-Antolín, R. Barra, Marc Schröder, Sacha Krstulović, J. M. Montero (2007), Automatic phonetic segmentation of Spanish emotional speech, Interspeech
Dacheng Lin, Yong Zhao, Frank K. Soong, Min Chu, Jieyu Zhao (2007), Iterative unit selection with unnatural prosody detection, Interspeech
Zdeněk Hanzlíček, Jindřich Matoušek (2007), F0 transformation within the voice conversion framework, Interspeech
Daniel Erro, Asunción Moreno (2007), Weighted frequency warping for voice conversion, Interspeech
Daniel Erro, Asunción Moreno (2007), Frame alignment method for cross-lingual voice conversion, Interspeech
Jani Nurminen, Jilei Tian, Victor Popa (2007), Voicing level control with application in voice conversion, Interspeech
Winston S. Percybrooks, Elliot Moore (2007), New algorithm for LPC residual estimation from LSF vectors for a voice conversion system, Interspeech
Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano (2007), Speaker adaptive training for one-to-many eigenvoice conversion based on Gaussian mixture model, Interspeech
Petko N. Petkov, W. Bastiaan Kleijn (2007), Improving the phase vocoder approach to pitch-shifting, Interspeech
Larbi Mesbahi, Vincent Barreaud, Olivier Boeffard (2007), Comparing GMM-based speech transformation systems, Interspeech
Jen-Wei Kuo, Hung-Yi Lo, Hsin-Min Wang (2007), Improved HMM/SVM methods for automatic phoneme segmentation, Interspeech
Takahiro Shinozaki, Tatsuya Kawahara (2007), Gaussian mixture optimization for HMM based on efficient cross-validation, Interspeech
Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda (2007), Model-space MLLR for trajectory HMMs, Interspeech
Hamed Ketabdar, Hervé Bourlard (2007), In-context phone posteriors as complementary features for tandem ASR, Interspeech
Qian Qian, Xiaodong He, Li Deng (2007), Phone-discriminating minimum classification error (p-MCE) training for phonetic recognition, Interspeech
Lori Lamel, Abdel. Messaoudi, Jean-Luc Gauvain (2007), Improved acoustic modeling for transcribing Arabic broadcast data, Interspeech
Erik McDermott, Atsushi Nakamura (2007), String and lattice based discriminative training for the corpus of spontaneous Japanese lecture transcription task, Interspeech
Byung-Ok Kang, Ho-Young Jung, Yun-Keun Lee (2007), Discriminative noise adaptive training approach for an environment migration, Interspeech
Jia-Yu Chen, Peder A. Olsen, John R. Hershey (2007), Word confusability - measuring hidden Markov model similarity, Interspeech
Thomas Deselaers, Georg Heigold, Hermann Ney (2007), Speech recognition with state-based nearest neighbour classifiers, Interspeech
Remco Teunen, Masami Akamine (2007), HMM-based speech recognition using decision trees instead of GMMs, Interspeech
Christian Gollan, Stefan Hahn, Ralf Schlüter, Hermann Ney (2007), An improved method for unsupervised training of LVCSR systems, Interspeech
Mohamed Kamal Omar (2007), A variational approach to robust maximum likelihood estimation for speech recognition, Interspeech
Kai Yu, Rob A. Rutenbar (2007), Generating small, accurate acoustic models with a modified Bayesian information criterion, Interspeech
Peter Bell, Simon King (2007), Sparse Gaussian graphical models for speech recognition, Interspeech
Sakriani Sakti, Konstantin Markov, Satoshi Nakamura (2007), An HMM acoustic model incorporating various additional knowledge sources, Interspeech
Matti Varjokallio, Mikko Kurimo (2007), Comparison of subspace methods for Gaussian mixture models in speech recognition, Interspeech
Tanja Schultz, Alan W. Black, Sameer Badaskar, Matthew Hornyak, John Kominek (2007), SPICE: web-based tools for rapid language adaptation in speech processing systems, Interspeech
Filip Deprez, Jan Odijk, Jan De Moortel (2007), Introduction to multilingual corpus-based concatenative speech synthesis, Interspeech
Frederik Stouten, Jean-Pierre Martens (2007), Recognition of foreign names spoken by native speakers, Interspeech
R. Cordoba, L. F. D'Haro, F. Fernandez-Martinez, J. M. Montero, R. Barra (2007), Language identification using several sources of information with a multiple-Gaussian classifier, Interspeech
Carmen Del Solar, Guillermo Pérez, Eva Florencio, David Moral, Gabriel Amores, Pilar Manchón (2007), Dynamic language change in MIMUS, Interspeech
Jonas Lööf, Christian Gollan, Stefan Hahn, Georg Heigold, B. Hoffmeister, Christian Plahl, David Rybach, Ralf Schlüter, Hermann Ney (2007), The RWTH 2007 TC-STAR evaluation system for European English and Spanish, Interspeech
Eugene Chin Wei Koh, Hanwu Sun, Tin Lay Nwe, Trung Hieu Nguyen, Bin Ma, Eng Siong Chng, Haizhou Li, Susanto Rahardja (2007), Using direction of arrival estimate and acoustic feature information in speaker diarization, Interspeech
Fernando Batista, Diamantino Caseiro, Nuno Mamede, Isabel Trancoso (2007), Recovering punctuation marks for automatic speech recognition, Interspeech
Jui-Feng Yeh, Chung-Hsien Wu, Wei-Yen Wu (2007), Disfluency correction of spontaneous speech using conditional random fields with variable-length features, Interspeech
Jing Huang, Etienne Marcheret, Karthik Visweswariah, Vit Libal, Gerasimos Potamianos (2007), Detection, diarization, and transcription of far-field lecture speech, Interspeech
Timothy J. Hazen, Brennan Sherry, Mark Adler (2007), Speech-based annotation and retrieval of digital photographs, Interspeech
Umit Guz, Sébastien Cuendet, Dilek Hakkani-Tür, Gokhan Tur (2007), Co-training using prosodic and lexical information for sentence segmentation, Interspeech
Yannick Estève, Sylvain Meignier, Paul Deléglise, Julie Mauclair (2007), Extracting true speaker identities from transcriptions, Interspeech
Rong Fu, Ian D. Benest (2007), An improved speaker diarization system, Interspeech
Sebastian Stüker, Christian Fügen, Florian Kraft, Matthias Wölfel (2007), The ISL 2007 English speech transcription system for European Parliament speeches, Interspeech
Mei-Yuh Hwang, Wen Wang, Xin Lei, Jing Zheng, Ozgur Cetin, Gang Peng (2007), Advances in Mandarin broadcast speech recognition, Interspeech
Jun Ogata, Masataka Goto, Kouichirou Eto (2007), Automatic transcription for a Web 2.0 service to search podcasts, Interspeech
Joseph Tepperman, Abe Kazemzadeh, Shrikanth S. Narayanan (2007), A text-free approach to assessing nonnative intonation, Interspeech
John Lee, Stephanie Seneff (2007), Automatic generation of cloze items for prepositions, Interspeech
Christopher Waple, Hongcui Wang, Tatsuya Kawahara, Yasushi Tsubota, Masatake Dantsuji (2007), Evaluating and optimizing Japanese tutor system featuring dynamic question generation and interactive guidance, Interspeech
Catia Cucchiarini, Ambra Neri, Febe de Wet, Helmer Strik (2007), ASR-based pronunciation training: scoring accuracy and pedagogical effectiveness of a system for Dutch L2 learners, Interspeech
Joseph Tepperman, Matthew Black, Patti Price, Sungbok Lee, Abe Kazemzadeh, Matteo Gerosa, Margaret Heritage, Abeer Alwan, Shrikanth S. Narayanan (2007), A Bayesian network classifier for word-level reading assessment, Interspeech
Hartwig Holzapfel, Alex Waibel (2007), Behavior models for learning and receptionist dialogs, Interspeech
Markku Turunen, Jaakko Hakulinen, Anssi Kainulainen, Aleksi Melto, Topi Hurtig (2007), Design of a rich multimodal interface for mobile spoken route guidance, Interspeech
Mariët Theune, Dennis Hofs, Marco van Kessel (2007), The virtual guide: a direction giving embodied conversational agent, Interspeech
Sudeep Gandhe, David Traum (2007), Creating spoken dialogue characters from corpora without annotations, Interspeech
Pui-Yu Hui, Zhengyu Zhou, Helen Meng (2007), Complementarity and redundancy in multimodal user inputs with speech and pen gestures, Interspeech
Linda Bell, Joakim Gustafson (2007), Children's convergence in referring expressions to graphical objects in a speech-enabled computer game, Interspeech
Hiromi Kawatsu, Sumio Ohno (2007), An analysis of individual differences in the f0 contour and the duration of anger utterances at several degrees, Interspeech
Yoshiko Arimoto, Sumio Ohno, Hitoshi Iida (2007), Acoustic features of anger utterances during natural dialog, Interspeech
Fadi Biadsy, Julia Hirschberg, Andrew Rosenberg, Wisam Dakka (2007), Comparing American and Palestinian perceptions of charisma using acoustic-prosodic and lexical analysis, Interspeech
Carlos Busso, Sungbok Lee, Shrikanth S. Narayanan (2007), Using neutral speech models for emotional speech analysis, Interspeech
N. Satoh, K. Yamauchi, S. Matsunaga, M. Yamashita, R. Nakagawa, K. Shinohara (2007), Emotion clustering using the results of subjective opinion tests for emotion recognition in infants' cries, Interspeech
R. Barra, J. M. Montero, J. Macias-Guarasa, J. Gutiérrez-Arriola, J. Ferreiros, J. M. Pardo (2007), On the limitations of voice conversion techniques in emotion identification tasks, Interspeech
Kate Dupuis, Kathleen Pichora-Fuller (2007), Use of lexical and affective prosodic cues to emotion by younger and older adults, Interspeech
Purnima Gupta, Nitendra Rajput (2007), Two-stream emotion recognition for call center monitoring, Interspeech
Ioulia Grichkovtsova, Anne Lacheret, Michel Morel (2007), The role of intonation and voice quality in the affective speech perception, Interspeech
Bogdan Vlasenko, Björn Schuller, Andreas Wendemuth, Gerhard Rigoll (2007), Combining frame and turn-level information for robust recognition of emotions within speech, Interspeech
Björn Schuller, Anton Batliner, Dino Seppi, Stefan Steidl, Thurid Vogt, Johannes Wagner, Laurence Devillers, Laurence Vidrascu, Noam Amir, Loic Kessous, Vered Aharonson (2007), The relevance of feature type for the automatic classification of emotional user states: low level descriptors and functionals, Interspeech
Vũ Minh Quang, Laurent Besacier, Eric Castelli (2007), Automatic question detection: prosodic-lexical features and crosslingual experiments, Interspeech
Makoto Tachibana, Keigo Kawashima, Junichi Yamagishi, Takao Kobayashi (2007), Performance evaluation of HMM-based style classification with a small amount of training data, Interspeech
Khiet P. Truong, David A. van Leeuwen (2007), Visualizing acoustic similarities between emotions in speech: an acoustic map of emotions, Interspeech
Hao Hu, Ming-Xing Xu, Wei Wu (2007), Fusion of global statistical and segmental spectral features for speech emotion recognition, Interspeech
Vidhyasaharan Sethu, Eliathamby Ambikairajah, Julien Epps (2007), Group delay features for emotion detection, Interspeech
Christian Müller, Felix Burkhardt (2007), Combining short-term cepstral and long-term pitch features for automatic recognition of speaker age, Interspeech
Frank Enos, Elizabeth Shriberg, Martin Graciarena, Julia Hirschberg, Andreas Stolcke (2007), Detecting deception using critical segments, Interspeech
Takashi Nose, Yoichi Kato, Takao Kobayashi (2007), Style estimation of speech based on multiple regression hidden semi-Markov model, Interspeech
Chi Zhang, John H. L. Hansen (2007), Analysis and classification of speech mode: whispered through shouted, Interspeech
Melissa Bettoni-Techio, Andréia S. Rauber, Rosana Denise Koerich (2007), Perception and production of word-final alveolar stops by Brazilian Portuguese learners of English, Interspeech
Denise Cristina Kluge, Andréia S. Rauber, Mara Silvia Reis, Ricardo A. Hoffmann Bion (2007), The relationship between the perception and production of English nasal codas by Brazilian learners of English, Interspeech
Takafumi Utashiro, Goh Kawai (2007), CALL courseware for learning reactive tokens in face-to-face dialogs, Interspeech
Shinya Kiriyama, Ryo Tsuji, Tomohiko Kasami, Shogo Ishikawa, Naofumi Otani, Hiroaki Horiuchi, Yoichi Takebayashi, Shigeyoshi Kitazawa (2007), The developmental analysis of demonstrative expression skills utilizing a multimodal infant behavior corpus, Interspeech
Elena E. Lyakso, Olga V. Frolova (2007), Russian vowels system acoustic features development in ontogenesis, Interspeech
Petra van Alphen, Elise de Bree, Paula Fikkert, Frank Wijnen (2007), The role of metrical stress in comprehension and production in Dutch children at-risk of dyslexia, Interspeech
Seiichi Nakagawa, Kei Ohta (2007), A statistical method of evaluating pronunciation proficiency for presentation in English, Interspeech
Akiyo Joto, Yoshiki Nagase, Seiya Funatsu (2007), The intelligibility and its relations to acoustic characteristics of English /s/ and /ʃ/ produced by native speakers of Japanese, Interspeech
Martijn Goudbeek, Daniel Swingley, Keith R. Kluender (2007), The limits of multidimensional category learning, Interspeech
Maria Uther, James Uther, Panos Athanasopoulos, Pushpendra Singh, Reiko Akahane-Yamada (2007), Mobile adaptive CALL (MAC): a lightweight speech-based intervention for mobile language learners, Interspeech
Catherine T. Best, Pierre A. Hallé, Jennifer S. Pardo (2007), English and French speakers' perception of voicing distinctions in non-native lateral consonant syllable onsets, Interspeech
Francisco Lacerda, Lisa Gustavsson (2007), Predicting the consequences of vocalizations in early infancy, Interspeech
David Weenink, Guangqin Chen, Zongyan Chen, Stefan de Konink, Dennis Vierkant, Eveline van Hagen, R. J. J. H. van Son (2007), Learning tone distinctions for Mandarin Chinese, Interspeech
Catherine Lai, Kyle Gorman, Jiahong Yuan, Mark Liberman (2007), Perception of disfluency: language differences and listener bias, Interspeech
Stephane Pigeon, Wade Shen, Aaron Lawson, David A. van Leeuwen (2007), Design and characterization of the non-native military air traffic communications database (nnMATC), Interspeech
Wade Shen, Douglas Reynolds (2007), A comparison of speaker clustering and speech recognition techniques for air situational awareness, Interspeech
Dimitrios Dimitriadis, Jose C. Segura, Luz Garcia, Alexandros Potamianos, Petros Maragos, Vassilis Pitsikalis (2007), Advanced front-end for robust speech recognition in extremely adverse environments, Interspeech
Roberto Gemello, Franco Mana, Stefano Scanzio (2007), Experiments on HIWIRE database using denoising and adaptation with a hybrid HMM-ANN model, Interspeech
Brett Y. Smolenski (2007), Detection and removal of switching noise in push-to-talk and voice operated exchange communications systems, Interspeech
Luis Buera, Antonio Miguel, Óscar Saz, Eduardo Lleida, Alfonso Ortega (2007), Evaluation of the combined use of MEMLIN and MLLR on the non-native adaptation task of HIWIRE project database, Interspeech
Daniel Déchelotte, Holger Schwenk, Gilles Adda, Jean-Luc Gauvain (2007), Improved machine translation of speech-to-text outputs, Interspeech
S. Saleem, K. Subramanian, R. Prasad, David Stallard, Chia-Lin Kao, P. Natarajan, R. Suleiman (2007), Improvements in machine translation for English/Iraqi speech translation, Interspeech
Evgeny Matusov, Dustin Hillard, Mathew Magimai-Doss, Dilek Hakkani-Tür, Mari Ostendorf, Hermann Ney (2007), Improving speech translation with automatic boundary prediction, Interspeech
Roldano Cattoni, Nicola Bertoldi, Marcello Federico (2007), Punctuating confusion networks for speech translation, Interspeech
Aarthi Reddy, Richard Rose, Alain Désilets (2007), Integration of ASR and machine translation models in a document translation task, Interspeech
Yik-Cheung Tam, Tanja Schultz (2007), Bilingual LSA-based translation lexicon adaptation for spoken language translation, Interspeech
David Stallard, Fred Choi, Chia-Lin Kao, Kriste Krstovski, P. Natarajan, R. Prasad, S. Saleem, K. Subramanian (2007), The BBN 2007 displayless English/Iraqi speech-to-speech translation system, Interspeech
Ruhi Sarikaya, Yonggang Deng, Yuqing Gao (2007), Context dependent word modeling for statistical machine translation using part-of-speech tags, Interspeech
Darren Scott Appling, Nick Campbell (2007), Translating conversational speech to standard linguistic form, Interspeech
Caroline Lavecchia, Kamel Smaïli, David Langlois, Jean-Paul Haton (2007), Using inter-lingual triggers for machine translation, Interspeech
Daniele Falavigna, Nicola Bertoldi, Fabio Brugnara, Roldano Cattoni, Mauro Cettolo, Boxing Chen, Marcello Federico, Diego Giuliani, Roberto Gretter, Deepa Gupta, Dino Seppi (2007), The IRST English-Spanish translation system for European Parliament speeches, Interspeech
Christian Fügen, Muntsin Kolss (2007), The influence of utterance chunking on machine translation performance, Interspeech
Kristin Precoda, Jing Zheng, Dimitra Vergyri, Horacio Franco, Colleen Richey, Andreas Kathol, Sachin Kajarekar (2007), IraqComm: a next generation translation system, Interspeech
Sharath Rao, Ian Lane, Tanja Schultz (2007), Optimizing sentence segmentation for spoken language translation, Interspeech
Korin Richmond (2007), A multitask learning perspective on acoustic-articulatory inversion, Interspeech
Chao Qin, Miguel Á. Carreira-Perpiñán (2007), A comparison of acoustic features for articulatory inversion, Interspeech
Odette Scharenborg, Vincent Wan (2007), Can unquantised articulatory feature continuums be modelled?, Interspeech
Milind S. Shah, Prem C. Pandey (2007), Estimation of place of articulation in stop consonants for visual feedback, Interspeech
Blaise Potard, Yves Laprie (2007), Compact representations of the articulatory-to-acoustic mapping, Interspeech
Joe Frankel, Mathew Magimai-Doss, Simon King, Karen Livescu, Özgür Çetin (2007), Articulatory feature classifiers trained on 2000 hours of telephone speech, Interspeech
Amr H. Nour-Eldin, Peter Kabal (2007), Objective analysis of the effect of memory inclusion on bandwidth extension of narrowband speech, Interspeech
Bernd Geiser, Hervé Taddei, Peter Vary (2007), Artificial bandwidth extension without side information for ITU-T G.729.1, Interspeech
Hannu Pulakka, Paavo Alku, Laura Laaksonen, Päivi Valve (2007), The effect of highband harmonic structure in the artificial bandwidth expansion of telephone speech, Interspeech
Shingo Kuroiwa, Masashi Takashina, Satoru Tsuge, Ren Fuji (2007), Artificial bandwidth extension for speech signals using speech recognition, Interspeech
Driss Guerchi, Tamer Rabie, Abdelrhani Louzi (2007), Voicing-based codebook in low-rate wideband CELP coding, Interspeech
Ethan R. Duni, Bhaskar D. Rao (2007), Performance of speaker-dependent wideband speech coding, Interspeech
Philippe Dreuw, David Rybach, Thomas Deselaers, Morteza Zahedi, Hermann Ney (2007), Speech recognition techniques for a sign language recognition system, Interspeech
Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano (2007), Impact of various small sound source signals on voice conversion accuracy in speech communication aid for laryngectomees, Interspeech
Petr Cerva, Jan Nouza (2007), Design and development of voice controlled aids for motor-handicapped persons, Interspeech
Kouichi Katsurada, Yuji Okuma, Makoto Yano, Yurie Iribe, Tsuneo Nitta (2007), Management of static/dynamic properties in a multimodal interaction system, Interspeech
R. San-Segundo, A. Pérez, D. Ortiz, L. F. D'Haro, M. Inés Torres, F. Casacuberta (2007), Evaluation of alternatives on speech to sign language translation, Interspeech
Géza Németh, Gábor Olaszy, Mátyás Bartalis, Géza Kiss, Csaba Zainkó, Péter Mihajlik (2007), Speech based drug information system for aged and visually impaired persons, Interspeech
Waldo Nogueira, Tamás Harczos, Bernd Edler, Jörn Ostermann, Andreas Büchner (2007), Automatic speech recognition with a cochlear implant front-end, Interspeech
Soo-Young Suk, Hiroaki Kojima (2007), Voice activated powered wheelchair with non-voice rejection algorithm, Interspeech
Laurianne Sitbon, Patrice Bellot, Philippe Blache (2007), Phonetic based sentence level rewriting of questions typed by dyslexic spellers in an information retrieval context, Interspeech
André Berton, Peter Regel-Brietzmann, Hans-Ulrich Block, Stefanie Schachtl, Manfred Gehrke (2007), How to integrate speech-operated internet information dialogs into a car, Interspeech
James Glass, Timothy J. Hazen, Scott Cyphers, Igor Malioutov, David Huynh, Regina Barzilay (2007), Recent progress in the MIT spoken lecture processing project, Interspeech
Philipp Fischer, Andreas Österle, André Berton, Peter Regel-Brietzmann (2007), How to personalize speech applications for web-based information in a car, Interspeech
Satoshi Ikeda, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno (2007), Topic estimation with domain extensibility for guiding user's out-of-grammar utterances in multi-domain spoken dialogue systems, Interspeech
Ryota Nishimura, Norihide Kitaoka, Seiichi Nakagawa (2007), Prosody change and response timing analysis in spontaneously spoken dialogs and their modeling in a spoken dialog system, Interspeech
Satoshi Tamura, Kunihiko Takamatsu, Shinji Ogura, Satoru Hayamizu (2007), GEMSIS - a novel application of speech recognition to emergency and disaster medicine, Interspeech
Rachel Coulston, Esther Klabbers, Jacques de Villiers, John-Paul Hosom (2007), Application of speech technology in a home based assessment kiosk for early detection of Alzheimer's disease, Interspeech
Olga Vybornova, Monica Gemo, Ronald Moncarey, Benoit Macq (2007), Ontology-based multimodal high level fusion involving natural language analysis for aged people home care application, Interspeech
Shing-kai Chan, Lei Xie, Helen Meng (2007), Modeling the statistical behavior of lexical chains to capture word cohesiveness for automatic story segmentation, Interspeech
James G. Fung, Dilek Hakkani-Tür, Mathew Magimai-Doss, Elizabeth Shriberg, Sébastien Cuendet, Nikki Mirghafori (2007), Cross-linguistic analysis of prosodic features for sentence segmentation, Interspeech
Andrew Rosenberg, Mehrbod Sharifi, Julia Hirschberg (2007), Varying input segmentation for story boundary detection in English, Arabic and Mandarin broadcast news, Interspeech
BalaKrishna Kolluru, Yoshihiko Gotoh (2007), Speaker role based structural classification of broadcast news stories, Interspeech
Matthias Jilka, Bernd Möbius (2007), The influence of vowel quality features on peak alignment, Interspeech
Yen-Liang Shue, Markus Iseli, Nanette Veilleux, Abeer Alwan (2007), Pitch accent versus lexical stress: quantifying acoustic measures related to the voice source, Interspeech
Stefan Benus, Agustín Gravano, Julia Hirschberg (2007), Prosody, emotions, and… ‘whatever’, Interspeech
Wentao Gu, Rerrario Shui-Ching Ho, Tan Lee (2007), Modeling tones in Hakka on the basis of the command-response model, Interspeech
Gerrit Kentner (2007), Length, ordering preference and intonational phrasing: evidence from pauses, Interspeech
Jörg Peters, Judith Hanssen, Carlos Gussenhoven (2007), Alignment of the second low target in Dutch falling-rising pitch contours, Interspeech
Helena Moniz, Ana Isabel Mata, M. Céu Viana (2007), On filled-pauses and prolongations in European Portuguese, Interspeech
Michael Olsberg, Yi Xu, Jeremy Green (2007), Dependence of tone perception on syllable perception, Interspeech
Ralf Winkler (2007), Testing the relevance of speech rate, pitch and a glottal chink for the perception of age in synthesized speech using formant synthesis, Interspeech
Tamás Böhm, Stefanie Shattuck-Hufnagel (2007), Utterance-final glottalization as a cue for familiar speaker recognition, Interspeech
Chun-Fang Huang, Masato Akagi (2007), A rule-based speech morphing for verifying an expressive speech perception model, Interspeech
Elina E. Helander, Jani Nurminen (2007), On the importance of pure prosody in the perception of speaker identity, Interspeech
Shi-Han Chen, Chih-Chung Kuo (2007), Perceptual relevance of pitch contours of Mandarin tones and its efficacy in prosody generation of speech synthesis, Interspeech
Hiromitsu Nishizaki, Mitsuhiro Sohmiya, Kenji Kobayashi, Yoshihiro Sekiguchi (2007), The effect of filled pauses in a lecture speech on impressive evaluation of listeners, Interspeech
Yujia Li, Tan Lee (2007), Perceptual equivalence of approximated Cantonese tone contours, Interspeech
Suleman Shahid, Emiel Krahmer, Marc Swerts (2007), Audiovisual emotional speech of game playing children: effects of age and culture, Interspeech
Oliver Lemon, Olivier Pietquin (2007), Machine learning for spoken dialogue systems, Interspeech
Verena Rieser, Oliver Lemon (2007), Learning dialogue strategies for interactive database search, Interspeech
Heriberto Cuayáhuitl, Steve Renals, Oliver Lemon, Hiroshi Shimodaira (2007), Hierarchical dialogue optimization using semi-Markov decision processes, Interspeech
Hua Ai, Diane J. Litman (2007), Knowledge consistent user simulations for dialog systems, Interspeech
Hsu-Chih Wu, Stephanie Seneff (2007), Reducing recognition error rate based on context relationships among dialogue turns, Interspeech
Teruhisa Misu, Tatsuya Kawahara (2007), Bayes risk-based optimization of dialogue management for document retrieval system with speech interface, Interspeech
Christiane Ulbrich, Horst Ulbrich (2007), Realisations and alternations in German /r/-realisation, Interspeech
Christopher S. Doty, Kaori Idemaru, Susan G. Guion (2007), Singleton and geminate stops in Finnish - acoustic correlates, Interspeech
Christophe Van Bael, Harald Baayen, Helmer Strik (2007), Segment deletion in spontaneous speech: a corpus study using mixed effects models with crossed random effects, Interspeech
Hongying Zheng, Peter W. M. Tsang, William S.-Y. Wang (2007), Categorical perception of Cantonese tones in context: a cross-linguistic study, Interspeech
Yiya Chen, Jiahong Yuan (2007), A corpus study of the 3rd tone sandhi in standard Chinese, Interspeech
Jonathan Harrington, Sallyanne Palethorpe, Catherine I. Watson (2007), Age-related changes in fundamental frequency and formants: a longitudinal study of four speakers, Interspeech
Jian Zhang, Ho Yin Chan, Pascale Fung, Lu Cao (2007), A comparative study on speech summarization of broadcast news and lecture speech, Interspeech
Gabriel Murray, Steve Renals (2007), Towards online speech summarization, Interspeech
Tomoyuki Yamagata, Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki (2007), System request detection in conversation based on acoustic and speaker alternation features, Interspeech
Michael Levit, Elizabeth Boschee, Marjorie Freedman (2007), Selecting on-topic sentences from natural language corpora, Interspeech
Seokhwan Kim, Minwoo Jeong, Gary Geunbae Lee (2007), A semi-supervised method for efficient construction of statistical spoken language understanding resources, Interspeech
Yasuhisa Fujii, Norihide Kitaoka, Seiichi Nakagawa (2007), Automatic extraction of cue phrases for important sentences in lecture speech and automatic lecture speech summarization, Interspeech
Yi-Ting Chen, Hsuan-Sheng Chiu, Hsin-Min Wang, Berlin Chen (2007), A unified probabilistic generative framework for extractive spoken document summarization, Interspeech
Matthieu Hébert (2007), Generic class-based statistical language models for robust speech understanding in directed dialog applications, Interspeech
Michael L. Seltzer, Yun-Cheng Ju, Ivan Tashev, Alex Acero (2007), Robust location understanding in spoken dialog systems using intersections, Interspeech
Maria Markaki, Michael Wohlmayr, Yannis Stylianou (2007), Speech-nonspeech discrimination using the information bottleneck method and spectro-temporal modulation index, Interspeech
Keun Won Jang, Dong Kook Kim, Joon-Hyuk Chang (2007), A uniformly most powerful test for statistical model-based voice activity detection, Interspeech
John Dines, Jithendra Vepa (2007), Direct optimisation of a multilayer perceptron for the estimation of cepstral mean and variance statistics, Interspeech
Marijn Huijbregts, Chuck Wooters, Roeland Ordelman (2007), Filtering the unknown: speech activity detection in heterogeneous video collections, Interspeech
Abhijeet Sangwan, Nitish Krishnamurthy, John H. L. Hansen (2007), Environmentally aware voice activity detector, Interspeech
Masakiyo Fujimoto, Kentaro Ishizuka (2007), Noise robust voice activity detection based on switching Kalman filter, Interspeech
Q-Haing Jo, Yun-Sik Park, Kye-Hwan Lee, Ji-Hyun Song, Joon-Hyuk Chang (2007), Voice activity detection based on support vector machine using effective feature vectors, Interspeech
K Sri Rama Murty, B Yegnanarayana, S Guruprasad (2007), Voice activity detection in degraded speech using excitation source information, Interspeech
David Cournapeau, Tatsuya Kawahara (2007), Evaluation of real-time voice activity detection based on high order statistics, Interspeech
Yanmeng Guo, Qian Qian, Yonghong Yan (2007), Robust voice activity detection based on adaptive sub-band energy sequence analysis and harmonic detection, Interspeech
Corinne Fredouille, Nicholas Evans (2007), The influence of speech activity detection and overlap on speaker diarization for meeting room recordings, Interspeech
Gibak Kim, Nam Ik Cho (2007), Voice activity detection using the phase vector in microphone array, Interspeech
Federico Flego, Christian Zieger, Maurizio Omologo (2007), Adaptive weighting of microphone arrays for distant-talking F0 and voiced/unvoiced estimation, Interspeech
A. Sreenivasa Murthy, S. Chandra Sekhar, T. V. Sreenivas (2007), Robust and high-resolution voiced/unvoiced classification in noisy speech using a signal smoothness criterion, Interspeech
Tara N. Sainath, Victor Zue, Dimitri Kanevsky (2007), Audio classification using extended Baum-Welch transformations, Interspeech
Mary Tai Knox, Nikki Mirghafori (2007), Automatic laughter detection using neural networks, Interspeech
Gang Peng, Mei-Yuh Hwang, Mari Ostendorf (2007), Automatic acoustic segmentation for speech recognition on broadcast recordings, Interspeech
Peter Birkholz (2007), Articulatory synthesis of singing, Interspeech
Takeshi Saitou, Masataka Goto, Masashi Unoki, Masato Akagi (2007), Vocal conversion from speaking voice to singing voice using STRAIGHT, Interspeech
Axel Roebel, Joshua Fineberg (2007), Speech to chant transformation with the phase vocoder, Interspeech
Hideki Kenmochi, Hayato Ohshita (2007), VOCALOID - commercial singing synthesizer based on sample concatenation, Interspeech
Nicolas D’Alessandro, Thierry Dutoit (2007), RAMCESS/handsketch: a multi-representation framework for realtime and expressive singing synthesis, Interspeech
Sten Ternström, Johan Sundberg (2007), Formant-based synthesis of singing, Interspeech
Han Sloetjes, Albert Russel, Alexander Klassmann (2007), ELAN: a free and open-source multimedia annotation tool, Interspeech
Jozsef Szakos, Ulrike Glavitsch (2007), Speechindexer in action: managing endangered Formosan languages, Interspeech
Tohru Ifukube, Yasuyuki Shimizu (2007), A portable record player for wax cylinders using a laser-beam reflection method, Interspeech
Joseph P Campbell (2014), Speaker Recognition for Forensic Applications, Odyssey
Alvin F. Martin, Craig S. Greenberg, John M. Howard, George R. Doddington, John J. Godfrey, Vincent M. Stanford (2014), Effects of the New Testing Paradigm of the 2012 NIST Speaker Recognition Evaluation, Odyssey
David van der Vloed, Jos Bouten, David van Leeuwen (2014), NFI-FRITS: A forensic speaker recognition database and some first experiments, Odyssey
David van Leeuwen, Niko Brümmer, Albert Swart (2014), A comparison of linear and non-linear calibrations for speaker recognition, Odyssey
Yun Lei, Luciana Ferrer, Aaron Lawson, Mitchell McLaren, Nicolas Scheffer (2014), Trial-based Calibration for Speaker Recognition in Unseen Conditions, Odyssey
Johan Rohdin, Sangeeta Biswas, Koichi Shinoda (2014), Discriminative PLDA training with application-specific loss functions for speaker verification, Odyssey
Joaquin Gonzalez-Rodriguez, Juana Gil, Rubén Pérez, Javier Franco-Pedroso (2014), What are we missing with i-vectors? A perceptual analysis of i-vector-based falsely accepted trials, Odyssey
Pierre-Michel Bousquet, Jean-François Bonastre, Driss Matrouf (2014), Exploring some limits of Gaussian PLDA modeling for i-vector distributions, Odyssey
Najim Dehak, Oldrich Plchot, Mohamad Hasan Bahari, Lukas Burget, Hugo Van Hamme, Reda Dehak (2014), GMM Weights Adaptation Based on Subspace Approaches for Speaker Verification, Odyssey
Andreas Nautsch, Christian Rathgeb, Christoph Busch, Herbert Reininger, Klaus Kasper (2014), Towards Duration Invariance of i-Vector-based Adaptive Score Normalization, Odyssey
Zhi-Yi Li, Wei-Qiang Zhang, Wei-Wei Liu, Yao Tian, Jia Liu (2014), Text-Independent Speaker Verification via State Alignment, Odyssey
Kong Aik Lee, Bin Ma, Haizhou Li, Liping Chen, Wu Guo, Lirong Dai (2014), Local Variability Modeling for Text-Independent Speaker Verification, Odyssey
Yusuf Ziya Isik, Hakan Erdogan, Ruhi Sarikaya (2014), A Latent Dirichlet Allocation Based Front-End for Speaker Verification, Odyssey
Ville Hautamäki, Rosa Gonzalez Hautamäki, Tomi Kinnunen, Anne-Maria Laukkanen (2014), Comparison of human listeners and speaker verification systems using voice mimicry data, Odyssey
Patrick Kenny, Themos Stafylakis, Pierre Ouellet, Md Jahangir Alam, Pierre Dumouchel (2014), Supervised/Unsupervised Voice Activity Detectors for Text-dependent Speaker Recognition on the RSR2015 Corpus, Odyssey
Johan Rohdin, Sangeeta Biswas, Koichi Shinoda (2014), i-Vector Selection for Effective PLDA Modeling in Speaker Recognition, Odyssey
Brecht Desplanques, Kris Demuynck, Jean-Pierre Martens (2014), Combining Joint Factor Analysis and iVectors for Robust Language Recognition, Odyssey
Alexandros Lazaridis, Elie Khoury, Jean-Philippe Goldman, Mathieu Avanzi, Sébastien Marcel, Philip N. Garner (2014), Swiss French Regional Accent Identification, Odyssey
Laura Fernandez Gallardo, Michael Wagner, Sebastian Möller (2014), Spectral Sub-band Analysis of Speaker Verification Employing Narrowband and Wideband Speech, Odyssey
Gang Liu, John H.L. Hansen (2014), Supra-Segmental Feature Based Speaker Trait Detection, Odyssey
Karthika Vijayan, Vinay Kumar, K Sri Rama Murty (2014), Allpass modelling of Fourier phase for speaker verification, Odyssey
Jinghua Zhong, Weiwu Jiang, Helen Meng, Na Li, Zhifeng Li (2014), An Integration of Random Subspace Sampling and Fishervoice for Speaker Verification, Odyssey
Gang Liu, John Hansen, Chengzhu Yu, Abhinav Misra, Navid Shokouhi (2014), Investigating State-of-the-Art Speaker Verification in the case of Unlabeled Development Data, Odyssey
Alvin F. Martin, Craig S. Greenberg, John M. Howard, George R. Doddington, John J. Godfrey (2014), NIST Language Recognition Evaluation - Past and Future, Odyssey
Gang Liu, Qian Zhang, John Hansen (2014), Robust Language Recognition Based on Diverse Features, Odyssey
Nobuaki Minematsu, Shun Kasahara, Takehiko Makino, Daisuke Saito, Keikichi Hirose (2014), Speaker-basis Accent Clustering Using Invariant Structure Analysis and the Speech Accent Archive, Odyssey
Alan McCree (2014), Multiclass Discriminative Training of i-vector Language Recognition, Odyssey
Jean-François Bonastre, Itshak Lapidot, Samy Bengio (2014), Telephone Conversation Speaker Diarization Using Mealy-HMMs, Odyssey
Hervé Bredin, Antoine Laurent, Achintya Sarkar, Viet-Bac Le, Sophie Rosset, Claude Barras (2014), Person Instance Graphs for Named Speaker Identification in TV Broadcast, Odyssey
Grégor Dupuy, Sylvain Meignier, Paul Deléglise, Yannick Estève (2014), Recent Improvements on ILP-based Clustering for Broadcast News Speaker Diarization, Odyssey
Pranay Dighe, Marc Ferras, Herve Bourlard (2014), Modeling Overlapping Speech using Vector Taylor Series, Odyssey
Martin Cooke (2014), Speaking in adverse conditions: from behavioural observations to intelligibility-enhancing speech modifications, Odyssey
Patrick Kenny, Themos Stafylakis, Alam Jahangir, Pierre Ouellet, Marcel Kockmann (2014), Joint Factor Analysis for Text-Dependent Speaker Verification, Odyssey
Giovanni Soldi, Simon Bozonnet, Federico Alegre, Christophe Beaugeant, Nicholas Evans (2014), Short-Duration Speaker Modelling with Phone Adaptive Training, Odyssey
Changhuai You, Kong Aik Lee, Bin Ma, Haizhou Li (2014), Text-Dependent Speaker Verification System in VHF Communication Channel, Odyssey
Alan McCree, Douglas Reynolds, Daniel Garcia-Romero, Tomi Kinnunen, Craig Greenberg, Désiré Bansé, George Doddington, John Godfrey, Alvin Martin, Mark Przybocki (2014), The NIST 2014 Speaker Recognition i-vector Machine Learning Challenge, Odyssey
Sergey Novoselov, Timur Pekhovsky, Konstantin Simonchik (2014), STC Speaker Recognition System for the NIST i-Vector Challenge, Odyssey
Bostjan Vesnicer, Jerneja Zganec-Gros, Simon Dobrisek, Vitomir Struc (2014), Incorporating Duration Information into I-Vector-Based Speaker Recognition Systems, Odyssey
Abbas Khosravani, Mohammad Mahdi Homayounpour (2014), Linearly Constrained Minimum Variance for Robust I-vector Based Speaker Recognition, Odyssey
Marc Ferras, Elie Khoury, Sébastien Marcel, Laurent El Shafey (2014), Hierarchical speaker clustering methods for the NIST i-vector Challenge, Odyssey
Samy Bengio (2014), Large Scale Learning of a Joint Embedding Space, Odyssey
Niko Brümmer, Alan McCree, Stephen Shum, Daniel Garcia-Romero, Carlos Vaquero (2014), Unsupervised Domain Adaptation for I-Vector Speaker Recognition, Odyssey
Alan McCree, Stephen Shum, Douglas Reynolds, Daniel Garcia-Romero (2014), Unsupervised Clustering Approaches for Domain Adaptation in Speaker Recognition Systems, Odyssey
Sandro Cumani, Pietro Laface (2014), Generative pairwise models for speaker recognition, Odyssey
Hagai Aronowitz (2014), Compensating Inter-Dataset Variability in PLDA Hyper-Parameters for Robust Speaker Recognition, Odyssey
Yun Lei, Luciana Ferrer, Aaron Lawson, Mitchell McLaren, Nicolas Scheffer (2014), Application of Convolutional Neural Networks to Language Identification in Noisy Conditions, Odyssey
Patrick Kenny, Themos Stafylakis, Pierre Ouellet, Vishwa Gupta, Jahangir Alam (2014), Deep Neural Networks for extracting Baum-Welch statistics for Speaker Recognition, Odyssey
Pavel Matejka, Le Zhang, Tim Ng, Ondrej Glembek, Jeff Ma, Bing Zhang, Sri Harish Mallidi (2014), Neural Network Bottleneck Features for Language Identification, Odyssey
Omid Ghahabi, Javier Hernando (2014), i-Vector Modeling with Deep Belief Networks for Multi-Session Speaker Recognition, Odyssey
Antje Schweitzer, Norbert Braunschweiler, Grzegorz Dogil, Bernd Möbius (2004), Assessing the acceptability of the Smartkom speech synthesis voices, SSW
Jithendra Vepa, Simon King (2004), Subjective evaluation of join cost & smoothing methods, SSW
Erwin Marsi (2004), Optionality in evaluating prosody prediction, SSW
Yoshinori Shiga, Simon King (2004), Accurate spectral envelope estimation for articulation-to-speech synthesis, SSW
Alexander Kain, Xiaochuan Niu, John-Paul Hosom, Qi Miao, Jan P. H. van Santen (2004), Formant re-synthesis of dysarthric speech, SSW
Tomoki Toda, Alan W. Black, Keiichi Tokuda (2004), Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis, SSW
Toshio Hirai, Seiichi Tenpaku (2004), Using 5 ms segments in concatenative speech synthesis, SSW
Nobuo Nukaga, Ryota Kamoshida, Kenji Nagamatsu (2004), Unit selection using pitch synchronous cross correlation for Japanese concatenative speech synthesis, SSW
Ann K. Syrdal, Alistair D. Conkie (2004), Data-driven perceptually based join costs, SSW
Matthew Aylett (2004), Merging data driven and rule based prosodic models for unit selection TTS, SSW
Jan P. H. van Santen, Taniya Mishra, Esther Klabbers (2004), Estimating phrase curves in the general superpositional intonation model, SSW
Pablo Daniel Agüero, Antonio Bonafonte (2004), Intonation modeling for TTS using a joint extraction and prediction approach, SSW
Esther Klabbers, Jan P. H. van Santen (2004), Clustering of foot-based pitch contours in expressive speech, SSW
E. Eide, A. Aaron, R. Bakis, W. Hamza, Michael Picheny, J. Pitrelli (2004), A corpus-based approach to expressive speech synthesis, SSW
Guillaume Gibert, Gérard Bailly, Frédéric Elisei (2004), Audiovisual text-to-cued speech synthesis, SSW
Rachel Baker, Robert A. J. Clark, Michael White (2004), Synthesising contextually appropriate intonation in limited domains, SSW
Jelske Dijkstra, Louis C. W. Pols, Rob J. J. H. van Son (2004), Frisian TTS, an example of bootstrapping TTS for minority languages, SSW
Sebsibe H. Mariam, S. P. Kishore, Alan W. Black, Rohit Kumar, Rajeev Sangal (2004), Unit selection voice for Amharic using Festvox, SSW
Kalika Bali, Partha Pratim Talukdar, N. Sridhar Krishna, A.G. Ramakrishnan (2004), Tools for the development of a Hindi speech synthesis system, SSW
Hiroyuki Segi, Tohru Takagi, Takayuki Ito (2004), A concatenative speech synthesis method using context dependent phoneme sequences with variable length as search units, SSW
Justin Fackrell, Wojciech Skut (2004), Improving pronunciation dictionary coverage of names by modelling spelling variation, SSW
Yeon-Jun Kim, Ann Syrdal, Matthias Jilka (2004), Improving TTS by higher agreement between predicted versus observed pronunciations, SSW
Jerome R. Bellegarda (2004), A novel discontinuity metric for unit selection text-to-speech synthesis, SSW
Jordi Adell, Antonio Bonafonte (2004), Towards phone segmentation for concatenative speech synthesis, SSW
Joakim Gustafson, Kåre Sjölander (2004), Voice creation for conversational fairy-tale characters, SSW
Shinsuke Sakai (2004), F0 modeling with multi-layer additive modeling based on a statistical learning technique, SSW
John Kominek, Alan W. Black (2004), Impact of durational outlier removal from unit selection catalogs, SSW
Keikichi Hirose, Kentaro Sato, Nobuaki Minematsu (2004), Corpus-based synthesis of fundamental frequency contours with various speaking styles from text using F0 contour generation process model, SSW
Jianhua Tao, Yongguo Kang (2004), Multi-source based acoustic model for speech synthesis, SSW
Robert A. J. Clark, Korin Richmond, Simon King (2004), Festival 2 - build your own general purpose unit selection speech synthesiser, SSW
Hisashi Kawai, Tomoki Toda, Jinfu Ni, Minoru Tsuzaki, Keiichi Tokuda (2004), XIMERA: a new TTS from ATR based on corpus-based technologies, SSW
Fabio Tesser, Piero Cosi, Carlo Drioli, Graziano Tisato (2004), Prosodic data driven modelling of a narrative style in Festival TTS, SSW
Heiga Zen, Keiichi Tokuda, Tadashi Kitamura (2004), An introduction of trajectory model into HMM-based speech synthesis, SSW
N. Sridhar Krishna, Hema A. Murthy (2004), Duration modeling of Indian languages Hindi and Telugu, SSW
Jason Y. Zhang, Arthur R. Toth, Kevyn Collins-Thompson, Alan W. Black (2004), Prominence prediction for supersentential prosodic modeling based on a new database, SSW
Robert I. Damper, Yannick Marchand, John-David Marseters, Alex Bazin (2004), Aligning letters and phonemes for speech synthesis, SSW
Peter Rutten, David Talkin (2004), rvoice studio and activeprompts, SSW
Leonardo Badino, Claudia Barolo, Silvia Quazza (2004), Language independent phoneme mapping for foreign TTS, SSW
Enrico Zovato, Alberto Pacchiotti, Silvia Quazza, Stefano Sandri (2004), Towards emotional speech synthesis: a rule based approach, SSW
Alejandro C. Renato, José A. Alvarez (2004), Corpora of Latin American Spanish for research in prosody and synthesis, SSW
John Kominek, Alan W. Black (2004), The CMU Arctic speech databases, SSW
Arthur R. Toth (2004), Forced alignment for speech synthesis databases using duration and prosodic phrase breaks, SSW
Wentao Gu, Hiroya Fujisaki, Keikichi Hirose (2004), Analysis of fundamental frequency contours of Cantonese based on a command-response model, SSW
Brian Langner, Alan W. Black (2004), Creating a database of speech in noise for unit selection synthesis, SSW
Alan W. Black (2004), Overview of voice building, SSW
Tomoki Toda (2004), Overview of voice conversion, SSW
Renana Peres (2001), Beyond the Equal Error Rate - About the inter-relationship between algorithm and application, Odyssey
Christian J. Wellekens (2001), Seamless navigation in audio files, Odyssey
James L. Wayman (2001), Theory, characterization and testing of general biometric technologies, Odyssey
George Doddington (2001), Speaker Recognition Evaluation -- a challenge and an opportunity, Odyssey
Hirotaka Nakasone (2001), Speaker recognition in forensic environment, Odyssey
Regis Quelavoine (2001), Patent: a public disclosure of intellectual property, Odyssey
Josef Confino (2001), Listen to the Customers: Implementation of a speaker verification system in the bank industry, Odyssey
Mark A. Przybocki, Alvin F. Martin (2001), Odyssey text independent evaluation data, Odyssey
Eric G. Hansen, Raymond E. Slyh, Timothy R. Anderson (2001), Formant and F0 features for speaker recognition, Odyssey
A. Higgins, L. Bahler (2001), Password-based voice verification using SpeakerKey, Odyssey
Orith Toledo-Ronen (2001), Speech detection for text-dependent speaker verification, Odyssey
Alvin F. Martin, Mark A. Przybocki (2001), The NIST Speaker Recognition Evaluations: 1996-2001, Odyssey
Ran D. Zilca (2001), Using second order statistics for text independent speaker verification, Odyssey
Jamal Kharroubi, Dijana Petrovska-Delacrétaz, Gérard Chollet (2001), Text-independent speaker verification using support vector machines, Odyssey
Walter D. Andrews, Mary A. Kohler, Joseph P. Campbell, John J. Godfrey (2001), Phonetic, idiolectal and acoustic speaker recognition, Odyssey
Ivan Magrin-Chagnolleau, Guillaume Gravier, Raphael Blouet (2001), Overview of the 2000-2001 ELISA Consortium research activities, Odyssey
Dat Tran, Michael Wagner (2001), A generalised normalisation method for speaker verification, Odyssey
Corinne Fredouille, Jean-Francois Bonastre, Teva Merlin (2001), Bayesian approach based decision in speaker verification, Odyssey
Roland Auckenthaler, John S. Mason (2001), Gaussian selection applied to text-independent speaker verification, Odyssey
William D. Voiers (2001), Evaluating the effects of communication systems on speaker recognizability by human listeners: The Diagnostic Speaker Recognizability Test (DSRT), Odyssey
Lit Ping Wong, Martin J. Russell (2001), Speaker verification under additive noise conditions with non-stationary SNR using parallel model combination (PMC), Odyssey
Iain A. McCowan, Jason Pelecanos, Sridha Sridharan (2001), Robust speaker recognition using microphone arrays, Odyssey
Andrzej Drygajlo, Mounir El-Maliki (2001), Integration and imputation methods for unreliable feature compensation in GMM based speaker verification, Odyssey
Robert B. Dunn, Thomas F. Quatieri, Douglas A. Reynolds, Joseph P. Campbell (2001), Speaker recognition from coded speech in matched and mismatched conditions, Odyssey
Charles C. Broun, William M. Campbell, David Pearce, Holly Kelleher (2001), Speaker recognition and the ETSI Standard Distributed Speech Recognition Front-End, Odyssey
Ran Gazit, Yaakov Metzger, Orith Toledo-Ronen (2001), Speaker verification over cellular networks, Odyssey
Javier Rodriguez Saeta, Christian Koechling, Javier Hernando (2001), A VQ speaker identification system in car environment for personalized infotainment, Odyssey
Joaquin Gonzalez-Rodriguez, Javier Ortega-Garcia, J.J. Lucena-Molina (2001), On the application of the Bayesian approach in real forensic conditions with GMM-based systems, Odyssey
Hirotaka Nakasone, Steven D. Beck (2001), Forensic automatic speaker recognition, Odyssey
Didier Meuwly, Andrzej Drygajlo (2001), Forensic speaker recognition based on a Bayesian framework and Gaussian mixture modelling (GMM), Odyssey
Yosef A. Solewicz (2001), Noise robustness in forensic speaker verification, Odyssey
Yaakov Metzger (2001), Blind segmentation of a multi-speaker conversation using two different sets of features, Odyssey
Mauro Cettolo (2001), Speaker tracking in a broadcast news corpus, Odyssey
Itshak Lapidot, Hugo Guterman (2001), Resolution limitation in speakers clustering and segmentation problems, Odyssey
Sylvain Meignier, Jean-Francois Bonastre, Stephane Igounet (2001), E-HMM approach for learning and adapting sound models for speaker indexing, Odyssey
William M. Campbell, Charles C. Broun (2001), Text-prompted speaker recognition with polynomial classifiers, Odyssey
Marcos Faundez-Zanuy (2001), On the model size selection for speaker identification, Odyssey
Robert Stapert, John S. Mason (2001), Speaker recognition and the acoustic speech space, Odyssey
Sachin S. Kajarekar, Hynek Hermansky (2001), Speaker verification based on broad phonetic categories, Odyssey
Hassan Ezzaidi, Jean Rouat, Douglas O'Shaughnessy (2001), Combining pitch and MFCC for speaker identification systems, Odyssey
Jason Pelecanos, Sridha Sridharan (2001), Feature warping for robust speaker verification, Odyssey
Özgür Devrim Orman, Levent M. Arslan (2001), Frequency analysis of speaker identification, Odyssey
Raphael Blouet, Frédéric Bimbot (2001), A tree-based approach for score computation in speaker verification, Odyssey
Xiaozheng Zhang, Charles C. Broun (2001), Using lip features for multimodal speaker verification, Odyssey
Fabian Monrose, Michael K. Reiter, Qi Li, Susanne Wetzel (2001), Using voice to generate cryptographic keys, Odyssey
Niko Brümmer, Jason Pelecanos (2001), Unsupervised evaluation of speaker verification systems, Odyssey
Larry P. Heck, Dominique Genoud (2001), Integrating speaker and speech recognizers: Automatic identity claim capture for speaker verification, Odyssey
Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent (2020), Overview of the 6th CHiME Challenge, CHiME
Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota H