Table of Contents
Keynotes
Young, Steve:
"Still talking to machines (cognitively speaking)",
1-10.
Ifukube, Tohru:
"Sound-based assistive technology supporting 'seeing', 'hearing' and 'speaking' for the disabled and the elderly",
11-19.
Tseng, Chiu-yu:
"Beyond sentence prosody",
20-29.
Special Session: Models of Speech - In Search of Better Representations
Nam, Hosung / Mitra, Vikramjit / Tiede, Mark / Saltzman, Elliot / Goldstein, Louis / Espy-Wilson, Carol / Hasegawa-Johnson, Mark:
"A procedure for estimating gestural scores from natural speech",
30-33.
Shue, Yen-Liang / Chen, Gang / Alwan, Abeer:
"On the interdependencies between voice quality, glottal gaps, and voice-source related acoustic measures",
34-37.
Kawahara, Hideki / Morise, Masanori / Takahashi, Toru / Banno, Hideki / Nisimura, Ryuichi / Irino, Toshio:
"Simplification and extension of non-periodic excitation source representations for high-quality speech manipulation systems",
38-41.
Hiroya, Sadao / Mochida, Takemi:
"Phase equalization-based autoregressive model of speech signals",
42-45.
Xu, Yi / Prom-on, Santitham:
"Articulatory-functional modeling of speech prosody: a review",
46-49.
Torres, Humberto M. / Mixdorff, Hansjörg / Gurlekian, Jorge A. / Pfitzinger, Hartmut R.:
"Two new estimation methods for a superpositional intonation model",
50-53.
ASR: Acoustic Models I-III
Wiesler, Simon / Heigold, Georg / Nußbaum-Thom, Markus / Schlüter, Ralf / Ney, Hermann:
"A discriminative splitting criterion for phonetic decision trees",
54-57.
Gales, Mark J. F. / Yu, Kai:
"Canonical state models for automatic speech recognition",
58-61.
Dognin, Pierre L. / Hershey, John R. / Goel, Vaibhava / Olsen, Peder:
"Restructuring exponential family mixture models",
62-65.
Beaufays, Françoise / Vanhoucke, Vincent / Strope, Brian:
"Unsupervised discovery and training of maximally dissimilar cluster models",
66-69.
Sim, Khe Chai:
"Probabilistic state clustering using conditional random field for context-dependent acoustic modelling",
70-73.
Sun, Xie / Zhao, Yunxin:
"Integrate template matching and statistical modeling for speech recognition",
74-77.
Saon, George / Soltau, Hagen:
"Boosting systems for LVCSR",
1341-1344.
Goel, Vaibhava / Sainath, Tara N. / Ramabhadran, Bhuvana / Olsen, Peder / Nahamoo, David / Kanevsky, Dimitri:
"Incorporating sparse representation phone identification features in automatic speech recognition using exponential families",
1345-1348.
Chen, Xin / Zhao, Yunxin:
"Integrating MLP features and discriminative training in data sampling based ensemble acoustic modeling",
1349-1352.
Huang, Jui-Ting / Hasegawa-Johnson, Mark:
"Semi-supervised training of Gaussian mixture models by conditional entropy minimization",
1353-1356.
Shi, Guangchuan / Shi, Yu / Huo, Qiang:
"A study of irrelevant variability normalization based training and unsupervised online adaptation for LVCSR",
1357-1360.
Hsiao, Roger / Metze, Florian / Schultz, Tanja:
"Improvements to generalized discriminative feature transformation for speech recognition",
1361-1364.
Veselý, Karel / Burget, Lukáš / Grézl, František:
"Parallel training of neural networks for speech recognition",
2934-2937.
Singh, Rita / Lambert, Benjamin / Raj, Bhiksha:
"The use of sense in unsupervised training of acoustic models for ASR systems",
2938-2941.
Du, Jun / Hu, Yu / Jiang, Hui:
"Boosted mixture learning of Gaussian mixture HMMs for speech recognition",
2942-2945.
Leutnant, Volker / Haeb-Umbach, Reinhold:
"On the exploitation of hidden Markov models and linear dynamic models in a hybrid decoder architecture for continuous speech recognition",
2946-2949.
Abad, Alberto / Pellegrini, Thomas / Trancoso, Isabel / Neto, João:
"Context dependent modelling approaches for hybrid speech recognizers",
2950-2953.
Kubo, Yotaro / Watanabe, Shinji / Nakamura, Atsushi / Kobayashi, Tetsunori:
"A regularized discriminative training method of acoustic models derived by minimum relative entropy discrimination",
2954-2957.
Liao, Hank / Alberti, Chris / Bacchiani, Michiel / Siohan, Olivier:
"Decision tree state clustering with word and syllable features",
2958-2961.
Fujimura, Hiroshi / Masuko, Takashi / Tachimori, Mitsuyoshi:
"A duration modeling technique with incremental speech rate normalization",
2962-2965.
Wöllmer, Martin / Sun, Yang / Eyben, Florian / Schuller, Björn:
"Long short-term memory networks for noise robust speech recognition",
2966-2969.
Nitta, Tsuneo / Onoda, Takayuki / Kimura, Masashi / Iribe, Yurie / Katsurada, Kouichi:
"One-model speech recognition and synthesis based on articulatory movement HMMs",
2970-2973.
Cui, Xiaodong / Xue, Jian / Dognin, Pierre L. / Chaudhari, Upendra V. / Zhou, Bowen:
"Acoustic modeling with bootstrap and restructuring for low-resourced languages",
2974-2977.
Kosaka, Tetsuo / Goto, Keisuke / Ito, Takashi / Kato, Masaharu:
"Lecture speech recognition by combining word graphs of various acoustic models",
2978-2981.
Sim, Khe Chai / Liu, Shilin:
"Semi-parametric trajectory modelling using temporally varying feature mapping for speech recognition",
2982-2985.
Yu, Dong / Deng, L.:
"Deep-structured hidden conditional random fields for phonetic recognition",
2986-2989.
Malkin, Jonathan / Bilmes, Jeff:
"Semi-supervised learning for improved expression of uncertainty in discriminative classifiers",
2990-2993.
Olsen, Peder / Goel, Vaibhava / Micchelli, Charles / Hershey, John R.:
"Modeling posterior probabilities using the linear exponential family",
2994-2997.
Spoken Dialogue Systems I, II
Lefèvre, Fabrice / Mairesse, François / Young, Steve:
"Cross-lingual spoken language understanding from unaligned data using discriminative classification models and machine translation",
78-81.
Balchandran, Rajesh / Rachevsky, Leonid / Ramabhadran, Bhuvana / Novák, Miroslav:
"Techniques for topic detection based processing in spoken dialog systems",
82-85.
Chandramohan, Senthilkumar / Geist, Matthieu / Pietquin, Olivier:
"Optimizing spoken dialogue management with fitted value iteration",
86-89.
Jurčíček, F. / Thomson, B. / Keizer, S. / Mairesse, François / Gašić, M. / Yu, Kai / Young, Steve:
"Natural belief-critic: a reinforcement algorithm for parameter estimation in statistical spoken dialogue systems",
90-93.
Schmitt, Alexander / Scholz, Michael / Minker, Wolfgang / Liscombe, Jackson / Suendermann, David:
"Is it possible to predict task completion in automated troubleshooters?",
94-97.
Suendermann, David / Liscombe, Jackson / Pieraccini, Roberto:
"Minimally invasive surgery for spoken dialog systems",
98-101.
Spoken Dialogue Systems II
López-Cózar, Ramón / Griol, David:
"New technique to enhance the performance of spoken dialogue systems based on dialogue states-dependent language models and grammatical rules",
2998-3001.
Hurtado, Lluís-F. / Planells, Joaquin / Segarra, Encarna / Sanchis, Emilio / Griol, David:
"A stochastic finite-state transducer approach to spoken dialog management",
3002-3005.
Laroche, Romain / Bretier, Philippe / Putois, Ghislain:
"Enhanced monitoring tools and online dialogue optimisation merged into a new spoken dialogue system design experience",
3006-3009.
Laroche, Romain / Putois, Ghislain / Bretier, Philippe:
"Optimising a handcrafted dialogue system design",
3010-3013.
Putze, Felix / Schultz, Tanja:
"Utterance selection for speech acts in a cognitive tourguide scenario",
3014-3017.
Parent, Gabriel / Eskenazi, Maxine:
"Lexical entrainment of real users in the Let's Go spoken dialog system",
3018-3021.
Quarteroni, Silvia / González, Meritxell / Riccardi, Giuseppe / Varges, Sebastian:
"Combining user intention and error modeling for statistical dialog simulators",
3022-3025.
Hakulinen, Jaakko / Turunen, Markku / Camara, Raúl Santos de la / Crook, Nigel:
"Parallel processing of interruptions and feedback in Companions affective dialogue system",
3026-3029.
Raux, Antoine / Mehta, Neville / Ramachandran, Deepak / Gupta, Rakesh:
"Dynamic language modeling using Bayesian networks for spoken dialog systems",
3030-3033.
Hara, Sunao / Kitaoka, Norihide / Takeda, Kazuya:
"Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act n-gram",
3034-3037.
Liang, Wei-Bin / Wu, Chung-Hsien / Hsiao, Yu-Cheng:
"Dialogue act detection in error-prone spoken dialogue systems using partial sentence tree and latent dialogue act matrix",
3038-3041.
Kawahara, Tatsuya / Sumi, Kouhei / Chang, Zhi-Qiang / Takanashi, Katsuya:
"Detection of hot spots in poster conversations based on reactive tokens of audience",
3042-3045.
Matsuyama, Yoichi / Fujie, Shinya / Taniyama, Hikaru / Kobayashi, Tetsunori:
"Psychological evaluation of a group communication activation robot in a party game",
3046-3049.
Matsuyama, Kyoko / Komatani, Kazunori / Takeda, Ryu / Takahashi, Toru / Ogata, Tetsuya / Okuno, Hiroshi G.:
"Analyzing user utterances in barge-in-able spoken dialogue system for improving identification accuracy",
3050-3053.
Heldner, Mattias / Edlund, Jens / Hirschberg, Julia:
"Pitch similarity in the vicinity of backchannels",
3054-3057.
Truong, Khiet P. / Poppe, Ronald / Heylen, Dirk:
"A rule-based backchannel prediction model using pitch and pause information",
3058-3061.
Speech Perception: Factors Influencing Perception
Boersma, Paul / Chládková, Kateřina:
"Detecting categorical perception in continuous discrimination data",
102-105.
Benders, Titia / Escudero, Paola:
"The interrelation between the stimulus range and the number of response categories in vowel categorization",
106-109.
Nilsenová, Marie / Goudbeek, Martijn / Kempen, Luuk:
"The relation between pitch perception preference and emotion identification",
110-113.
Otake, Takashi / McQueen, James M. / Cutler, Anne:
"Competition in the perception of spoken Japanese words",
114-117.
Sadakata, Makiko / Zanden, Lotte van der / Sekiyama, Kaoru:
"Influence of musical training on perception of L2 speech",
118-121.
Derrick, Donald / Gick, Bryan:
"Full body aero-tactile integration in speech perception",
122-125.
Prosody: Models
Duběda, Tomáš / Mády, Katalin:
"Nucleus position within the intonation phrase: a typological study of English, Czech and Hungarian",
126-129.
Lee, Yong-cheol / Nambu, Satoshi:
"Focus-sensitive operator or focus inducer: always and only",
130-133.
Yuan, Jiahong / Liberman, Mark:
"F0 declination in English and Mandarin broadcast news speech",
134-137.
Schweitzer, Katrin / Walsh, Michael / Möbius, Bernd / Schütze, Hinrich:
"Frequency of occurrence effects on pitch accent realisation",
138-141.
González-Ferreras, César / Vivaracho-Pascual, Carlos / Escudero-Mancebo, David / Cardeñoso-Payo, Valentín:
"On the automatic ToBI accent type identification from data",
142-145.
Rosenberg, Andrew:
"AutoBI - a tool for automatic ToBI annotation",
146-149.
Speech Synthesis: Unit Selection and Others
Strom, Volker / King, Simon:
"A classifier-based target cost for unit selection speech synthesis trained on perceptual data",
150-153.
Zhang, Wei / Cui, Xiaodong:
"Applying scalable phonetic context similarity in unit selection of concatenative text-to-speech",
154-157.
Isogai, Mitsuaki / Mizuno, Hideyuki:
"Speech database reduction method for corpus-based TTS system",
158-161.
Lu, Heng / Ling, Zhen-Hua / Wei, Si / Dai, Lirong / Wang, Ren-Hua:
"Automatic error detection for unit selection speech synthesis using log likelihood ratio based SVM classifier",
162-165.
Silén, Hanna / Helander, Elina / Nurminen, Jani / Koppinen, Konsta / Gabbouj, Moncef:
"Using robust Viterbi algorithm and HMM-modeling in unit selection TTS to replace units of poor quality",
166-169.
Kim, Yeon-Jun / Beutnagel, Mark C.:
"Automatic detection of abnormal stress patterns in unit selection synthesis",
170-173.
Tihelka, Daniel / Kala, Jiří / Matoušek, Jindřich:
"Enhancements of Viterbi search for fast unit selection synthesis",
174-177.
Ewender, Thomas / Pfister, Beat:
"Accurate pitch marking for prosodic modification of speech segments",
178-181.
Pan, Shifeng / Zhang, Meng / Tao, Jianhua:
"A novel hybrid approach for Mandarin speech synthesis",
182-185.
Pontes, Josafá de Jesus Aguiar / Furui, Sadaoki:
"Modeling liaison in French by using decision trees",
186-189.
Luan, Jian / Li, Jian:
"Improvement on plural unit selection and fusion",
190-193.
Parlikar, Alok / Black, Alan W. / Vogel, Stephan:
"Improving speech synthesis of machine translation output",
194-197.
Putois, Ghislain / Chevelu, Jonathan / Boidin, Cédric:
"Paraphrase generation to improve text-to-speech synthesis",
198-201.
ASR: Search, Decoding and Confidence Measures I, II
Han, Chang Woo / Kang, Shin Jae / Lee, Chul Min / Kim, Nam Soo:
"Phone mismatch penalty matrices for two-stage keyword spotting via multi-pass phone recognizer",
202-205.
Motlicek, Petr / Valente, Fabio / Garner, Philip N.:
"English spoken term detection in multilingual recordings",
206-209.
Han, Icksang / Park, Chiyoun / Cho, Jeongmi / Kim, Jeongsu:
"A hybrid approach to robust word lattice generation via acoustic-based word detection",
210-213.
Steinbiss, Volker / Sundermeyer, Martin / Ney, Hermann:
"Direct observation of pruning errors (DOPE): a search analysis tool",
214-217.
Rybach, David / Riley, Michael:
"Direct construction of compact context-dependency transducers from data",
218-221.
Novák, Miroslav:
"Incremental composition of static decoding graphs with label pushing",
222-225.
Yang, Zhanlei / Liu, Wenju:
"A novel path extension framework using steady segment detection for Mandarin speech recognition",
226-229.
Schlüter, Ralf / Nußbaum-Thom, Markus / Ney, Hermann:
"On the relation of Bayes risk, word error, and word posteriors in ASR",
230-233.
Nolden, D. / Ney, Hermann / Schlüter, Ralf:
"Time conditioned search in automatic speech recognition reconsidered",
234-237.
Kobashikawa, Satoshi / Asami, Taichi / Yamaguchi, Yoshikazu / Masataki, Hirokazu / Takahashi, Satoshi:
"Efficient data selection for speech recognition based on prior confidence estimation using speech and context independent models",
238-241.
Ogawa, Atsunori / Nakamura, Atsushi:
"A novel confidence measure based on marginalization of jointly estimated error cause probabilities",
242-245.
Fayolle, Julien / Moreau, Fabienne / Raymond, Christian / Gravier, Guillaume / Gros, Patrick:
"CRF-based combination of contextual features to improve a posteriori word-level confidence measures",
1942-1945.
Wöllmer, Martin / Eyben, Florian / Schuller, Björn / Rigoll, Gerhard:
"Recognition of spontaneous conversational speech using long short-term memory phoneme predictions",
1946-1949.
Pellegrini, Thomas / Trancoso, Isabel:
"Improving ASR error detection with non-decoder based features",
1950-1953.
Golipour, Ladan / O'Shaughnessy, Douglas:
"Phoneme classification and lattice rescoring based on a k-NN approach",
1954-1957.
Bilmes, Jeff / Lin, Hui:
"Online adaptive learning for speech recognition decoding",
1958-1961.
Hori, Takaaki / Watanabe, Shinji / Nakamura, Atsushi:
"Improvements of search error risk minimization in Viterbi beam search for speech recognition",
1962-1965.
Special-Purpose Speech Applications
Hofe, Robin / Ell, Stephen R. / Fagan, Michael J. / Gilbert, James M. / Green, Phil D. / Moore, Roger K. / Rybchenko, Sergey I.:
"Evaluation of a silent speech interface based on magnetic sensing",
246-249.
San-Segundo, Rubén / López, Verónica / Martín, Raquel / Lufti, Syaheerah / Ferreiros, Javier / Córdoba, Ricardo / Pardo, José Manuel:
"Advanced speech communication system for deaf people",
250-253.
Sam, Sethserey / Castelli, Eric / Besacier, Laurent:
"Unsupervised acoustic model adaptation for multi-origin non native ASR",
254-257.
Hakkani-Tür, Dilek / Vergyri, Dimitra / Tur, Gokhan:
"Speech-based automated cognitive status assessment",
258-261.
Imai, Toru / Homma, Shinichi / Kobayashi, Akio / Oku, Takahiro / Sato, Shoei:
"Speech recognition with a seamlessly updated language model for real-time closed-captioning",
262-265.
Nishimoto, Takuya / Watanabe, Takayuki:
"The comparison between the deletion-based methods and the mixing-based methods for audio CAPTCHA systems",
266-269.
Adda-Decker, Martine / Lamel, Lori / Snoeren, Natalie D.:
"Comparing mono- & multilingual acoustic seed models for a low e-resourced language: a case-study of Luxembourgish",
270-273.
Son, Rob J. J. H. van / Jacobi, Irene / Hilgers, Frans:
"Manipulating tracheoesophageal speech",
274-277.
Imseng, David / Bourlard, Hervé / Doss, Mathew Magimai:
"Towards mixed language speech recognition systems",
278-281.
Barnard, Etienne / Schalkwyk, Johan / Heerden, Charl van / Moreno, Pedro J.:
"Voice search for development",
282-285.
Levow, Gina-Anne / Duncan, Susan / King, Edward T.:
"Cross-cultural investigation of prosody in verbal feedback in interactional rapport",
286-289.
Knox, Mary Tai / Friedland, Gerald:
"Multimodal speaker diarization using oriented optical flow histograms",
290-293.
Middag, Catherine / Saeys, Yvan / Martens, Jean-Pierre:
"Towards an ASR-free objective analysis of pathological speech",
294-297.
Speech Analysis
Godin, Keith W. / Hansen, John H. L.:
"Session variability contrasts in the MARP corpus",
298-301.
Kondo, Kazuhiro / Takano, Yusuke:
"Estimation of two-to-one forced selection intelligibility scores by speech recognizers using noise-adapted models",
302-305.
Schaaf, Thomas / Metze, Florian:
"Analysis of gender normalization using MLP and VTLN features",
306-309.
Aimetti, Guillaume / Moore, Roger K. / Bosch, Louis ten:
"Discovering an optimal set of minimally contrasting acoustic speech units: a point of focus for whole-word pattern matching",
310-313.
Stafylakis, Themos / Anguera, Xavier:
"Improvements to the equal-parameter BIC for speaker diarization",
314-317.
Mesgarani, Nima / Thomas, Samuel / Hermansky, Hynek:
"A multistream multiresolution framework for phoneme recognition",
318-321.
Salvi, Giampiero / Tesser, Fabio / Zovato, Enrico / Cosi, Piero:
"Cluster analysis of differential spectral envelopes on emotional speech",
322-325.
Bowman, Sam / Livescu, Karen:
"Modeling pronunciation variation with context-dependent articulatory feature decision trees",
326-329.
Raj, Bhiksha / Wilson, Kevin W. / Krueger, Alexander / Haeb-Umbach, Reinhold:
"Ungrounded independent non-negative factor analysis",
330-333.
Hershey, John R. / Olsen, Peder / Rennie, Steven J.:
"Signal interaction and the devil function",
334-337.
Systems for LVCSR
Akita, Yuya / Mimura, Masato / Neubig, Graham / Kawahara, Tatsuya:
"Semi-automated update of automatic transcription system for the Japanese national congress",
338-341.
Liu, Xunying / Gales, Mark J. F. / Woodland, Phil C.:
"Language model cross adaptation for LVCSR system combination",
342-345.
Watanabe, Shinji / Hori, Takaaki / Nakamura, Atsushi:
"Large vocabulary continuous speech recognition using WFST-based linear classifier for structured data",
346-349.
Květoň, Pavel / Novák, Miroslav:
"Accelerating hierarchical acoustic likelihood computation on graphics processors",
350-353.
Shan, Jiulong / Wu, Genqing / Hu, Zhihong / Tang, Xiliu / Jansche, Martin / Moreno, Pedro J.:
"Search by voice in Mandarin Chinese",
354-357.
Hain, Thomas / Burget, Lukáš / Dines, John / Garner, Philip N. / Hannani, Asmaa El / Huijbregts, Marijn / Karafiát, Martin / Lincoln, Mike / Wan, Vincent:
"The AMIDA 2009 meeting transcription system",
358-361.
Speaker Characterization and Recognition I-IV
Campbell, William M. / Karam, Zahi N.:
"Simple and efficient speaker comparison using approximate KL divergence",
362-365.
Sun, Hanwu / Ma, Bin / Huang, Chien-Lin / Nguyen, Trung Hieu / Li, Haizhou:
"The IIR NIST SRE 2008 and 2010 summed channel speaker recognition systems",
366-369.
Huang, Chien-Lin / Sun, Hanwu / Ma, Bin / Li, Haizhou:
"Speaker characterization using long-term and temporal information",
370-373.
Perez-Gomez, Sergio / Ramos, Daniel / Gonzalez-Dominguez, Javier / Gonzalez-Rodriguez, Joaquin:
"Score-level compensation of extreme speech duration variability in speaker verification",
374-377.
Abad, Alberto / Trancoso, Isabel:
"Speaker recognition experiments using connectionist transformation network features",
378-381.
Lei, Yun / Hansen, John H. L.:
"Speaker recognition using supervised probabilistic principal component analysis",
382-385.
Bigot, Benjamin / Pinquier, Julien / Ferrané, Isabelle / André-Obrecht, Régine:
"Looking for relevant features for speaker role recognition",
1057-1060.
Kockmann, Marcel / Burget, Lukáš / Glembek, Ondřej / Ferrer, Luciana / Černocký, Jan:
"Prosodic speaker verification using subspace multinomial models with intersession compensation",
1061-1064.
Wang, Eryu / Lee, Kong Aik / Ma, Bin / Li, Haizhou / Guo, Wu / Dai, Lirong:
"The estimation and kernel metric of spectral correlation for text-independent speaker verification",
1065-1068.
Saeidi, Rahim / Mowlaee, Pejman / Kinnunen, Tomi / Tan, Zheng-Hua / Christensen, Mads Græsbøll / Jensen, Søren Holdt / Fränti, Pasi:
"Improving monaural speaker identification by double-talk detection",
1069-1072.
Avinash, B. / Guruprasad, S. / Yegnanarayana, Bayya:
"Exploring subsegmental and suprasegmental features for a text-dependent speaker verification in distant speech signals",
1073-1076.
Liu, Qingsong / Huang, Wei / Xu, Dongxing / Cai, Hongbin / Dai, Beiqian:
"A fast implementation of factor analysis for speaker verification",
1077-1080.
Zhang, Ce / Zheng, Rong / Xu, Bo:
"An investigation into direct scoring methods without SVM training in speaker verification",
1437-1440.
Jourani, Reda / Daoudi, Khalid / André-Obrecht, Régine / Aboutajdine, Driss:
"Large margin Gaussian mixture models for speaker identification",
1441-1444.
Zheng, Rong / Xu, Bo:
"On the use of Gaussian component information in the generative likelihood ratio estimation for speaker verification",
1445-1448.
Mak, Man-Wai / Rao, Wei:
"Acoustic vector resampling for GMMSVM-based speaker verification",
1449-1452.
Biatov, Konstantin:
"A fast speaker indexing using vector quantization and second order statistics with adaptive threshold computation",
1453-1456.
Wang, Gang / Wu, Xiaojun / Zheng, Thomas Fang:
"Using phoneme recognition and text-dependent speaker verification to improve speaker segmentation for Chinese speech",
1457-1460.
Garretón, Claudio / Yoma, Néstor Becerra:
"On enhancing feature sequence filtering with filter-bank energy transformation in speaker verification with telephone speech",
1461-1464.
Zhu, Donglai / Ma, Bin / Lee, Kong Aik / Leung, Cheung-Chi / Li, Haizhou:
"MAP estimation of subspace transform for speaker recognition",
1465-1468.
Jafari, Ayeh / Srinivasan, Ramji / Crookes, Danny / Ming, Ji:
"A longest matching segment approach for text-independent speaker recognition",
1469-1472.
Hautamäki, Ville / Kinnunen, Tomi / Nosratighods, Mohaddeseh / Lee, Kong Aik / Ma, Bin / Li, Haizhou:
"Approaching human listener accuracy with modern speaker verification",
1473-1476.
Pohjalainen, Jouni / Saeidi, Rahim / Kinnunen, Tomi / Alku, Paavo:
"Extended weighted linear prediction (XLP) analysis of speech and its application to speaker verification in adverse conditions",
1477-1480.
Ye, Guoli / Mak, Brian:
"The use of subvector quantization and discrete densities for fast GMM computation for speaker verification",
1481-1484.
Richardson, Fred S. / Campbell, Joseph P.:
"Transcript-dependent speaker recognition using Mixer 1 and 2",
2102-2105.
Drugman, Thomas / Dutoit, Thierry:
"On the potential of glottal signatures for speaker recognition",
2106-2109.
Padmanabhan, R. / Murthy, Hema A.:
"Acoustic feature diversity and speaker verification",
2110-2113.
Dehzangi, Omid / Ma, Bin / Chng, Eng Siong / Li, Haizhou:
"A discriminative performance metric for GMM-UBM speaker identification",
2114-2117.
Anguera, Xavier / Bonastre, Jean-François:
"A novel speaker binary key derived from anchor models",
2118-2121.
Zhang, Wei-Qiang / Deng, Yan / He, Liang / Liu, Jia:
"Variant time-frequency cepstral features for speaker recognition",
2122-2125.
Wang, Ning / Ching, P. C. / Lee, Tan:
"Exploitation of phase information for speaker recognition",
2126-2129.
Long, Yanhua / Dai, Lirong / Ma, Bin / Guo, Wu:
"Effects of the phonological relevance in speaker verification",
2130-2133.
Sierra, Gabriel H. / Bonastre, Jean-François / Matrouf, Driss / Calvo, Jose R.:
"Topological representation of speech for speaker recognition",
2134-2137.
Sadjadi, Seyed Omid / Hansen, John H. L.:
"Assessment of single-channel speech enhancement techniques for speaker identification under mismatched conditions",
2138-2141.
Zhang, Xiang / Cao, Chuan / Yang, Lin / Suo, Hongbin / Zhang, Jianping / Yan, Yonghong:
"Speaker recognition using the resynthesized speech via spectrum modeling",
2142-2145.
Source Separation
Peharz, Robert / Stark, Michael / Pernkopf, Franz / Stylianou, Yannis:
"A factorial sparse coder model for single channel source separation",
386-389.
Benabderrahmane, Yasmina / Selouani, Sid Ahmed / O'Shaughnessy, Douglas:
"Oriented PCA method for blind speech separation of convolutive mixtures",
390-393.
Hsieh, Hsin-Lung / Chien, Jen-Tzung:
"Online Gaussian process for nonstationary speech separation",
394-397.
Yu, Meng / Ma, Wenye / Xin, Jack / Osher, Stanley:
"Convexity and fast speech extraction by split Bregman method",
398-401.
Ma, Wenye / Yu, Meng / Xin, Jack / Osher, Stanley:
"Reducing musical noise in blind source separation by time-domain sparse filters and split Bregman method",
402-405.
Woodruff, John / Prabhavalkar, Rohit / Fosler-Lussier, Eric / Wang, DeLiang:
"Combining monaural and binaural evidence for reverberant speech segregation",
406-409.
Speech Synthesis: HMM-Based Speech Synthesis I, II
Zen, Heiga:
"Speaker and language adaptive training for HMM-based polyglot speech synthesis",
410-413.
Yu, Kai / Zen, Heiga / Mairesse, François / Young, Steve:
"Context adaptive training with factorized decision trees for HMM-based speech synthesis",
414-417.
Yamagishi, Junichi / Watts, Oliver / King, Simon / Usabaev, Bela:
"Roles of the average voice in speaker-adaptive HMM-based speech synthesis",
418-421.
Qian, Yao / Yan, Zhi-Jie / Wu, Yijian / Soong, Frank K. / Zhuang, Xin / Kong, Shengyi:
"An HMM trajectory tiling (HTT) approach to high quality TTS",
422-425.
Chen, Yi-Ning / Yan, Zhi-Jie / Soong, Frank K.:
"A perceptual study of acceleration parameters in HMM-based TTS",
426-429.
Yokomizo, Shuji / Nose, Takashi / Kobayashi, Takao:
"Evaluation of prosodic contextual factors for HMM-based speech synthesis",
430-433.
Shechtman, Slava / Sorin, Alex:
"Sinusoidal model parameterization for HMM-based TTS system",
805-808.
Shiga, Yoshinori / Toda, Tomoki / Sakai, Shinsuke / Kawai, Hisashi:
"Improved training of excitation for HMM-based parametric speech synthesis",
809-812.
Sung, June Sig / Hong, Doo Hwa / Oh, Kyung Hwan / Kim, Nam Soo:
"Excitation modeling based on waveform interpolation for HMM-based speech synthesis",
813-816.
Zhuang, Xin / Qian, Yao / Soong, Frank K. / Wu, Yijian / Zhang, Bo:
"Formant-based frequency warping for improving speaker adaptation in HMM TTS",
817-820.
Hu, Hongwei / Russell, Martin J.:
"Improved modelling of speech dynamics using non-linear formant trajectories for HMM-based speech synthesis",
821-824.
Ling, Zhen-Hua / Hu, Yu / Dai, Lirong:
"Global variance modeling on the log power spectrum of LSPs for HMM-based speech synthesis",
825-828.
Shannon, Matt / Byrne, William:
"Autoregressive clustering for HMM speech synthesis",
829-832.
Pilkington, Nicholas / Zen, Heiga:
"An implementation of decision tree-based context clustering on graphics processing units",
833-836.
Gutkin, Alexander / Gonzalvo, Xavi / Breuer, Stefan / Taylor, Paul:
"Quantized HMMs for low footprint text-to-speech synthesis",
837-840.
Watts, Oliver / Yamagishi, Junichi / King, Simon:
"The role of higher-level linguistic features in HMM-based speech synthesis",
841-844.
Mase, Ayami / Oura, Keiichiro / Nankaku, Yoshihiko / Tokuda, Keiichi:
"HMM-based singing voice synthesis system using pitch-shifted pseudo training data",
845-848.
Ni, Jinfu / Kawai, Hisashi:
"An unsupervised approach to creating web audio contents-based HMM voices",
849-852.
Koriyama, Tomoki / Nose, Takashi / Kobayashi, Takao:
"Conversational spontaneous speech synthesis using average voice model",
853-856.
Multi-Modal Signal Processing
Hörnstein, Jonas / Santos-Victor, José:
"Learning words and speech units through natural interactions",
434-437.
Liu, Qingju / Wang, Wenwu / Jackson, Philip:
"Bimodal coherence based scale ambiguity cancellation for target speech extraction and enhancement",
438-441.
Kawashima, Hiroaki / Horii, Yu / Matsuyama, Takashi:
"Speech estimation in non-stationary noise environments using timing structures between mouth movements and sound signals",
442-445.
Wang, Lijuan / Qian, Xiaojun / Han, Wei / Soong, Frank K.:
"Synthesizing photo-real talking head via trajectory-guided sample selection",
446-449.
Florescu, Victoria M. / Crevier-Buchman, Lise / Denby, Bruce / Hueber, Thomas / Colazo-Simon, Antonia / Pillot-Loiseau, Claire / Roussel, Pierre / Gendrot, Cédric / Quattrocchi, Sophie:
"Silent vs vocalized articulation for a portable ultrasound-based silent speech interface",
450-453.
Hofer, Gregor / Richmond, Korin:
"Comparison of HMM and TMDN methods for lip synchronisation",
454-457.
Paralanguage
Schiel, Florian / Heinrich, Christian / Neumeyer, Veronika:
"Rhythm and formant features for automatic alcohol detection",
458-461.
Yanushevskaya, Irena / Gobl, Christer / Kane, John / Ní Chasaide, Ailbhe:
"An exploration of voice source correlates of focus",
462-465.
Harnsberger, James D. / Shrivastav, Rahul / Brown Jr., W. S.:
"Modeling perceived vocal age in American English",
466-469.
Caraty, Marie-José / Montacié, Claude:
"Multivariate analysis of vocal fatigue in continuous reading",
470-473.
Kain, Alexander / Santen, Jan P. H. van:
"Frequency-domain delexicalization using surrogate vowels",
474-477.
Metze, Florian / Batliner, Anton / Eyben, Florian / Polzehl, Tim / Schuller, Björn / Steidl, Stefan:
"Emotion recognition using imperfect speech recognition",
478-481.
Liu, Gang / Lei, Yun / Hansen, John H. L.:
"A novel feature extraction strategy for multi-stream robust emotion identification",
482-485.
Toutios, Asterios / Musti, Utpala / Ouni, Slim / Colotte, Vincent / Wrobel-Dautcourt, Brigitte / Berger, Marie-Odile:
"Setup for acoustic-visual speech synthesis by concatenating bimodal units",
486-489.
Jochems, Bart / Larson, Martha / Ordelman, Roeland / Poppe, Ronald / Truong, Khiet P.:
"Towards affective state modeling in narrative and conversational settings",
490-493.
Nomoto, Narichika / Masataki, Hirokazu / Yoshioka, Osamu / Takahashi, Satoshi:
"Detection of anger emotion in dialog speech using prosody feature and temporal relation of utterances",
494-497.
Roustan, Benjamin / Dohen, Marion:
"Gesture and speech coordination: the influence of the relationship between manual gesture and speech",
498-501.
Bořil, Hynek / Sadjadi, Seyed Omid / Kleinschmidt, Tristan / Hansen, John H. L.:
"Analysis and detection of cognitive load and frustration in drivers' speech",
502-505.
Sasou, Akira / Hashimoto, Yasuharu / Sakaue, Katsuhiko:
"Acoustic-based recognition of head gestures accompanying speech",
506-509.
Castronovo, Sandro / Mahr, Angela / Pentcheva, Margarita / Müller, Christian:
"Multimodal dialog in the car: combining speech and turn-and-push dial to control comfort functions",
510-513.
Korchagin, Danil / Garner, Philip N. / Motlicek, Petr:
"Hands free audio analysis from home entertainment",
514-517.
Shaikh, Mostafa Al Masum / Rebordão, Antonio Rui Ferreira / Hirose, Keikichi:
"Affective story teller: a TTS system for emotional expressivity",
518-521.
ASR: Speaker Adaptation, Robustness Against Reverberation
Ghai, Shweta / Sinha, Rohit:
"Enhancing children's speech recognition under mismatched condition by explicit acoustic normalization",
522-525.
Li, Bo / Sim, Khe Chai:
"Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems",
526-529.
Vipperla, Ravichander / Renals, Steve / Frankel, Joe:
"Augmentation of adaptation data",
530-533.
Machlica, Lukáš / Zajíc, Zbyněk / Müller, Luděk:
"Discriminative adaptation based on fast combination of DMAP and dfMLLR",
534-537.
Sanand, Doddipatla Rama / Schlüter, Ralf / Ney, Hermann:
"Revisiting VTLN using linear transformation on conventional MFCC",
538-541.
Hayashi, Toyohiro / Nankaku, Yoshihiko / Lee, Akinobu / Tokuda, Keiichi:
"Speaker adaptation based on nonlinear spectral transform for speech recognition",
542-545.
Kosaka, Tetsuo / Ito, Takashi / Kato, Masaharu / Kohda, Masaki:
"Speaker adaptation based on system combination using speaker-class models",
546-549.
Jeong, Yongwon / Song, Young Rok / Kim, Hyung Soon:
"Speaker adaptation in transformation space using two-dimensional PCA",
550-553.
Trmal, Jan / Zelinka, Jan / Müller, Luděk:
"On speaker adaptive training of artificial neural networks",
554-557.
He, Yongjun / Han, Jiqing:
"Model synthesis for band-limited speech recognition",
558-561.
Fukumori, Takahiro / Morise, Masanori / Nishiura, Takanobu:
"Performance estimation of reverberant speech recognition based on reverberant criteria RSR-dn with acoustic parameters",
562-565.
Sehr, Armin / Hofmann, Christian / Maas, Roland / Kellermann, Walter:
"A novel approach for matched reverberant training of HMMs using data pairs",
566-569.
Maganti, Hari Krishna / Matassoni, Marco:
"An auditory based modulation spectral feature for reverberant speech recognition",
570-573.
Wolf, Martin / Nadeu, Climent:
"On the potential of channel selection for recognition of reverberated speech with multiple microphones",
574-577.
Gomez, Randy / Kawahara, Tatsuya:
"An improved wavelet-based dereverberation for robust automatic speech recognition",
578-581.
Petrick, Rico / Fehér, Thomas / Unoki, Masashi / Hoffmann, Rüdiger:
"Methods for robust speech recognition in reverberant environments: a comparison",
582-585.
Language Learning, TTS, and Other Applications
Suzuki, Masayuki / Qiao, Yu / Minematsu, Nobuaki / Hirose, Keikichi:
"Integration of multilayer regression analysis with structure-based pronunciation assessment",
586-589.
Doremalen, Joost van / Cucchiarini, Catia / Strik, Helmer:
"Using non-native error patterns to improve pronunciation verification",
590-593.
Luo, Dean / Qiao, Yu / Minematsu, Nobuaki / Yamauchi, Yutaka / Hirose, Keikichi:
"Regularized-MLLR speaker adaptation for computer-assisted language learning system",
594-597.
Hirabayashi, Kuniaki / Nakagawa, Seiichi:
"Automatic evaluation of English pronunciation by Japanese speakers using various acoustic features and pattern recognition techniques",
598-601.
Liao, Hsien-Cheng / Chen, Jiang-Chun / Chang, Sen-Chia / Guan, Ying-Hua / Lee, Chin-Hui:
"Decision tree based tone modeling with corrective feedbacks for automatic Mandarin tone assessment",
602-605.
Lu, Jingli / Wang, Ruili / Silva, Liyanage C. De / Gao, Yang / Liu, Jia:
"CASTLE: a computer-assisted stress teaching and learning environment for learners of English as a second language",
606-609.
Huang, Shen / Li, Hongyan / Wang, Shijin / Liang, Jiaen / Xu, Bo:
"Automatic reference independent evaluation of prosody quality using multiple knowledge fusions",
610-613.
Yoon, Su-Youn / Hasegawa-Johnson, Mark / Sproat, Richard:
"Landmark-based automated pronunciation error detection",
614-617.
Shuang, Zhiwei / Kang, Shiyin / Qin, Yong / Dai, Lirong / Cai, Lianhong:
"HMM based TTS for mixed language text",
618-621.
Liang, Hui / Dines, John:
"An analysis of language mismatch in HMM state mapping-based cross-lingual speaker adaptation",
622-625.
Kawahara, Tatsuya / Katsumaru, Norihiro / Akita, Yuya / Mori, Shinsuke:
"Classroom note-taking system for hearing impaired students using automatic speech recognition adapted to lectures",
626-629.
Dixon, Paul R. / Furui, Sadaoki:
"Exploring web-browser based runtimes engines for creating ubiquitous speech interfaces",
630-632.
Pitch and Glottal-Waveform Estimation and Modeling I, II
Sun, Xuejing / Gadre, Sameer:
"Efficient three-stage pitch estimation for packet loss concealment",
633-636.
Funaki, Keiichi:
"On evaluation of the f0 estimation based on time-varying complex speech analysis",
637-640.
Huang, Feng / Lee, Tan:
"Pitch estimation in noisy speech based on temporal accumulation of spectrum peaks",
641-644.
Wang, Tianyu T. / Quatieri, Thomas F.:
"Multi-pitch estimation by a joint 2-d representation of pitch and pitch dynamics",
645-648.
Tsiakoulis, Pirros / Potamianos, Alexandros:
"On the effect of fundamental frequency on amplitude and frequency modulation patterns in speech resonances",
649-652.
Rahman, M. Shahidur / Shimamura, Tetsuya:
"Pitch determination using autocorrelation function in spectral domain",
653-656.
Drugman, Thomas / Dutoit, Thierry:
"Chirp complex cepstrum-based decomposition for asynchronous glottal analysis",
657-660.
Cinnéide, Alan Ó / Dorran, David / Gainza, Mikel / Coyle, Eugene:
"Exploiting glottal formant parameters for glottal inverse filtering and parameterization",
661-664.
Sturmel, Nicolas / d'Alessandro, Christophe / Doval, Boris:
"Glottal parameters estimation on speech using the zeros of the z-transform",
665-668.
Reddy Mallidi, Sri Harish / Prahallad, Kishore / Gangashetty, Suryakanth V. / Yegnanarayana, Bayya:
"Significance of pitch synchronous analysis for speaker recognition using AANN models",
669-672.
Chen, Gang / Feng, Xue / Shue, Yen-Liang / Alwan, Abeer:
"On using voice source measures in automatic gender classification of children's speech",
673-676.
Chu, Wei / Alwan, Abeer:
"SAFE: a statistical algorithm for F0 estimation for both clean and noisy speech",
2590-2593.
Hong, Jung Ook / Wolfe, Patrick J.:
"Robust and efficient pitch estimation using an iterative ARMA technique",
2594-2597.
Ohishi, Yasunori / Kameoka, Hirokazu / Mochihashi, Daichi / Nagano, Hidehisa / Kashino, Kunio:
"Statistical modeling of F0 dynamics in singing voices based on Gaussian processes with multiple oscillation bases",
2598-2601.
Heckmann, Martin / Gläser, Claudius / Joublin, Frank / Nakadai, Kazuhiro:
"Applying geometric source separation for improved pitch extraction in human-robot interaction",
2602-2605.
Kane, John / Kane, Mark / Gobl, Christer:
"A spectral LF model based approach to voice source parameterisation",
2606-2609.
Drugman, Thomas / Dutoit, Thierry:
"Glottal-based analysis of the Lombard effect",
2610-2613.
Open Vocabulary Spoken Document Retrieval (Special Session)
Itoh, Yoshiaki / Nishizaki, Hiromitsu / Hu, Xinhui / Nanjo, Hiroaki / Akiba, Tomoyosi / Kawahara, Tatsuya / Nakagawa, Seiichi / Matsui, Tomoko / Yamashita, Yoichi / Aikawa, Kiyoaki:
"Constructing Japanese test collections for spoken term detection",
677-680.
Natori, Satoshi / Nishizaki, Hiromitsu / Sekiguchi, Yoshihiro:
"Japanese spoken term detection using syllable transition network derived from multiple speech recognizers' outputs",
681-684.
Meng, Sha / Zhang, Wei-Qiang / Liu, Jia:
"Combining Chinese spoken term detection systems via side-information conditioned linear logistic regression",
685-688.
Kaneko, Taisuke / Akiba, Tomoyosi:
"Metric subspace indexing for fast spoken term detection",
689-692.
Chan, Chun-an / Lee, Lin-shan:
"Unsupervised spoken-term detection with spoken queries using segment-based dynamic time warping",
693-696.
Schneider, Daniel / Mertens, Timo / Larson, Martha / Köhler, Joachim:
"Contextual verification for open vocabulary spoken term detection",
697-700.
Tejedor, Javier / Toledano, Doroteo T. / Bautista, Miguel / King, Simon / Wang, Dong / Colás, José:
"Augmented set of features for confidence estimation in spoken term detection",
701-704.
Hu, Xinhui / Isotani, Ryosuke / Kawai, Hisashi / Nakamura, Satoshi:
"Cluster-based language model for spoken document retrieval using NMF-based document clustering",
705-708.
Robust ASR
Dalen, Rogier C. van / Gales, Mark J. F.:
"Asymptotically exact noise-corrupted speech likelihoods",
709-712.
Astudillo, Ramón Fernandez / Orglmeister, Reinhold:
"A MMSE estimator in mel-cepstral domain for robust large vocabulary automatic speech recognition using uncertainty propagation",
713-716.
Raj, Bhiksha / Virtanen, Tuomas / Chaudhuri, Sourish / Singh, Rita:
"Non-negative matrix factorization based compensation of music for automatic speech recognition",
717-720.
Demuynck, Kris / Zhang, Xueru / Compernolle, Dirk Van / Van hamme, Hugo:
"Feature versus model based noise robustness",
721-724.
Park, Ji Hun / Kim, Seon Man / Yoon, Jae Sam / Kim, Hong Kook / Lee, Sung Joo / Lee, Yunkeun:
"SNR-based mask compensation for computational auditory scene analysis applied to speech recognition in a car environment",
725-728.
Kim, Chanwoo / Stern, Richard M. / Eom, Kiwan / Lee, Jaewon:
"Automatic selection of thresholds for signal separation algorithms based on interaural delay",
729-732.
Language and Dialect Identification
Verdet, Florian / Matrouf, Driss / Bonastre, Jean-François / Hennebert, Jean:
"Channel detectors for system fusion in the context of NIST LRE 2009",
733-736.
Tong, Rong / Ma, Bin / Li, Haizhou / Chng, Eng Siong:
"Selecting phonotactic features for language recognition",
737-740.
Hanani, Abualsoud / Carey, Michael / Russell, Martin J.:
"Improved language recognition using mixture components statistics",
741-744.
Penagarikano, Mikel / Varona, Amparo / Rodriguez-Fuentes, Luis Javier / Bordel, German:
"Using cross-decoder co-occurrences of phone n-grams in SVM-based phonotactic language recognition",
745-748.
Koller, Oscar / Abad, Alberto / Trancoso, Isabel / Viana, Céu:
"Exploiting variety-dependent phones in Portuguese variety identification applied to broadcast news transcription",
749-752.
Biadsy, Fadi / Hirschberg, Julia / Collins, Michael:
"Dialect recognition using a phone-GMM-supervector-based SVM kernel",
753-756.
Technologies for Learning and Education
Qian, Xiaojun / Soong, Frank K. / Meng, Helen:
"Discriminative acoustic model for improving mispronunciation detection and diagnosis in computer-aided pronunciation training (CAPT)",
757-760.
Chen, Liang-Yu / Jang, Jyh-Shing Roger:
"Automatic pronunciation scoring using learning to rank and DP-based score segmentation",
761-764.
Lo, Wai-Kit / Zhang, Shuang / Meng, Helen:
"Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training system",
765-768.
Duong, Minh / Mostow, Jack:
"Adapting a duration synthesis model to rate children's oral reading prosody",
769-772.
Yoon, Su-Youn / Chen, Lei / Zechner, Klaus:
"Predicting word accuracy for the automatic speech recognition of non-native speech",
773-776.
Zhu, Taotao / Ke, Dengfeng / Chen, Zhenbiao / Xu, Bo:
"A new approach for automatic tone error detection in strong accented Mandarin based on dominant set",
777-780.
Emotional Speech
Prasanna, S. R. M. / Govind, D.:
"Analysis of excitation source information in emotional speech",
781-784.
Wu, Dongrui / Parsons, Thomas D. / Narayanan, Shrikanth S.:
"Acoustic feature analysis in speech emotion primitives estimation",
785-788.
Yeh, Lan-Ying / Chi, Tai-Shih:
"Spectro-temporal modulations for robust speech emotion recognition",
789-792.
Lee, Chi-Chun / Black, Matthew / Katsamanis, Athanasios / Lammert, Adam C. / Baucom, Brian R. / Christensen, Andrew / Georgiou, Panayiotis G. / Narayanan, Shrikanth S.:
"Quantification of prosodic entrainment in affective spontaneous spoken interactions of married couples",
793-796.
Mower, Emily / Han, Kyu J. / Lee, Sungbok / Narayanan, Shrikanth S.:
"A cluster-profile representation of emotion using agglomerative hierarchical clustering",
797-800.
Schuller, Björn / Devillers, Laurence:
"Incremental acoustic valence recognition: an inter-corpus perspective on features, matching, and performance in a gating paradigm",
801-804.
New Paradigms in ASR I, II
Wang, Xiao-Dong / Owa, Kunihiko / Shozakai, Makoto:
"Mandarin digit recognition assisted by selective tone distinction",
857-860.
Abe, Kazuhiko / Sakti, Sakriani / Isotani, Ryosuke / Kawai, Hisashi / Nakamura, Satoshi:
"Brazilian Portuguese acoustic model training based on data borrowing from other language",
861-864.
Vu, Ngoc Thang / Schlippe, Tim / Kraus, Franziska / Schultz, Tanja:
"Rapid bootstrapping of five Eastern European languages using the Rapid Language Adaptation Toolkit",
865-868.
Cao, Houwei / Lee, Tan / Ching, P. C.:
"Cross-lingual speaker adaptation via Gaussian component mapping",
869-872.
Elmahdy, Mohamed / Gruhn, Rainer / Minker, Wolfgang / Abdennadher, Slim:
"Cross-lingual acoustic modeling for dialectal Arabic speech recognition",
873-876.
Thomas, Samuel / Ganapathy, Sriram / Hermansky, Hynek:
"Cross-lingual and multi-stream posterior features for low resource LVCSR systems",
877-880.
Sundaram, Shiva / Bellegarda, Jerome R.:
"Latent perceptual mapping: a new acoustic modeling framework for speech recognition",
881-884.
Dufour, Richard / Bougares, Fethi / Estève, Yannick / Deléglise, Paul:
"Unsupervised model adaptation on targeted speech segments for LVCSR system combination",
885-888.
Ayllón Clemente, Irene / Heckmann, Martin / Denecke, Alexander / Wrede, Britta / Goerick, Christian:
"Incremental word learning using large-margin discriminative training and variance floor estimation",
889-892.
Virtanen, Tuomas / Gemmeke, Jort F. / Hurmalainen, Antti:
"State-based labelling for a sparse representation of speech and its application to robust speech recognition",
893-896.
Hannemann, Mirko / Kombrink, Stefan / Karafiát, Martin / Burget, Lukáš:
"Similarity scoring for recognizing repeated out-of-vocabulary words",
897-900.
Seppi, Dino / Compernolle, Dirk Van:
"Data pruning for template-based automatic speech recognition",
901-904.
Siu, Man-Hung / Gish, Herbert / Chan, Arthur / Belfield, William:
"Improved topic classification and keyword discovery using an HMM-based speech recognizer trained without supervision",
2838-2841.
Kanevsky, Dimitri / Sainath, Tara N. / Ramabhadran, Bhuvana / Nahamoo, David:
"An analysis of sparseness and regularization in exemplar-based methods for speech classification",
2842-2845.
Mohamed, Abdel-rahman / Yu, Dong / Deng, L.:
"Investigation of full-sequence training of deep belief networks for speech recognition",
2846-2849.
Wang, Yow-Bang / Lee, Lin-shan:
"Mandarin tone recognition using affine-invariant prosodic features and tone posteriorgram",
2850-2853.
Zweig, Geoffrey / Nguyen, Patrick / Droppo, Jasha / Acero, Alex:
"Continuous speech recognition with a TF-IDF acoustic model",
2854-2857.
Zweig, Geoffrey / Nguyen, Patrick:
"SCARF: a segmental conditional random field toolkit for speech recognition",
2858-2861.
Speech Production: Various Approaches
Amano-Kusumoto, Akiko / Hosom, John-Paul / Kain, Alexander:
"Speaking style dependency of formant targets",
905-908.
Kitamura, Tatsuya:
"Similarity of effects of emotions on the speech organ configuration with and without speaking",
909-912.
Bone, Daniel / Kim, Samuel / Lee, Sungbok / Narayanan, Shrikanth S.:
"A study of intra-speaker and inter-speaker affective variability using electroglottograph and inverse filtered glottal waveforms",
913-916.
Sakakibara, Ken-Ichi / Imagawa, Hiroshi / Kimura, Miwako / Yokonishi, Hisayuki / Tayama, Niro:
"Modal analysis of vocal fold vibrations using laryngotopography",
917-920.
Vainio, Martti / Airas, Matti / Järvikivi, Juhani / Alku, Paavo:
"Laryngeal voice quality in the expression of focus",
921-924.
Fujimoto, Masako / Maekawa, Kikuo / Funatsu, Seiya:
"Laryngeal characteristics during the production of geminate consonants",
925-928.
Cisonni, Julien / Nozaki, Kazunori / Hirtum, Annemie Van / Wada, Shigeo:
"Numerical study of turbulent flow-induced sound production in presence of a tooth-shaped obstacle: towards sibilant [s] physical modeling",
929-932.
Hanique, Iris / Schuppler, Barbara / Ernestus, Mirjam:
"Morphological and predictability effects on schwa reduction: the case of Dutch word-initial syllables",
933-936.
Al Moubayed, Samer / Ananthakrishnan, G.:
"Acoustic-to-articulatory inversion based on local regression",
937-940.
Broersma, Mirjam:
"Korean lenis, fortis, and aspirated stops: effect of place of articulation on acoustic realization",
941-944.
Nakashika, Toru / Tachibana, Ryuki / Nishimura, Masafumi / Takiguchi, Tetsuya / Ariki, Yasuo:
"Speech synthesis by modeling harmonics structure with multiple function",
945-948.
Otani, Makoto / Hirahara, Tatsuya:
"Physics of body-conducted silent speech - production, propagation and representation of non-audible murmur",
949-952.
Speech Enhancement
Chakladar, Subhojit / Kim, Nam Soo / Jin, Yu Gwang / Kang, Tae Gyoon:
"Multichannel noise reduction using low order RTF estimate",
953-956.
Lee, Inho / Yoon, Jongsung / Lee, Yoonjae / Ko, Hanseok:
"Reinforced blocking matrix with cross channel projection for speech enhancement",
957-960.
Cheng, Ning / Liu, Wenju / Wang, Lan:
"Masking property based microphone array post-filter design",
961-964.
Sato, Yusuke / Hoya, Tetsuya / Bakardjian, Hovagim / Cichocki, Andrzej:
"Reduction of broadband noise in speech signals by multilinear subspace analysis",
965-968.
Hong, Jungpyo / Han, Seungho / Jeong, Sangbae / Hahn, Minsoo:
"Novel probabilistic control of noise reduction for improved microphone array beamforming",
969-972.
Li, Kai / Fu, Qiang / Yan, Yonghong:
"Speech enhancement using improved generalized sidelobe canceller in frequency domain with multi-channel postfiltering",
973-976.
Even, Jani / Ishi, Carlos / Saruwatari, Hiroshi / Hagita, Norihiro:
"Close speaker cancellation for suppression of non-stationary background noise for hands-free speech interface",
977-980.
Srinivasamurthy, Ajay / Sreenivas, Thippur V.:
"Multi-channel iterative dereverberation based on codebook constrained iterative multi-channel Wiener filter",
981-984.
Medabalimi, Anand Joseph Xavier / Reddy Mallidi, Sri Harish / Yegnanarayana, Bayya:
"Speaker-dependent mapping of source and system features for enhancement of throat microphone speech",
985-988.
Cai, Jun / Marini, Stefano / Malarme, Pierre / Grenez, Francis / Schoentgen, Jean:
"An analytic modeling approach to enhancing throat microphone speech commands for keyword spotting",
989-992.
So, Stephen / Wójcicki, Kamil K. / Paliwal, Kuldip K.:
"Single-channel speech enhancement using Kalman filtering in the modulation domain",
993-996.
Yao, Miao / Liang, Weiqian:
"Integrated feedback and noise reduction algorithm in digital hearing aids via oscillation detection",
997-1000.
Mercier, Charles / Lefebvre, Roch:
"A blind signal-to-noise ratio estimator for high noise speech recordings",
1001-1004.
Special Session: Fact and Replica of Speech Production
Imagawa, Hiroshi / Sakakibara, Ken-Ichi / Tokuda, Isao T. / Otsuka, Mamiko / Tayama, Niro:
"Estimation of glottal area function using stereo-endoscopic high-speed digital imaging",
1005-1008.
Nozaki, Kazunori / Ohnishi, Youhei / Suda, Takeshi / Wada, Shigeo / Shimojo, Shinji:
"Toward aero-acoustical analysis of the sibilant /s/: an oral cavity modeling",
1009-1012.
Motoki, Kunitoshi:
"Effects of wall impedance on transmission and attenuation of higher-order modes in vocal-tract model",
1013-1016.
Birkholz, Peter / Kröger, Bernd J. / Neuschaefer-Rube, Christiane:
"Articulatory synthesis and perception of plosive-vowel syllables with virtual consonant targets",
1017-1020.
Fukui, Kotaro / Kusano, Toshihiro / Mukaeda, Yoshikazu / Suzuki, Yuto / Takanishi, Atsuo / Honda, Masaaki:
"Speech robot mimicking human articulatory motion",
1021-1024.
Arai, Takayuki:
"Mechanical vocal-tract models for speech dynamics",
1025-1028.
Brady, Michael C.:
"Prosodic timing analysis for articulatory re-synthesis using a bank of resonators with an adaptive oscillator",
1029-1032.
ASR: Language Modeling
Emami, Ahmad / Chen, Stanley F. / Ittycheriah, Abraham / Soltau, Hagen / Zhao, Bing:
"Decoding with shrinkage-based language models",
1033-1036.
Chen, Stanley F. / Chu, Stephen M.:
"Enhanced word classing for model M",
1037-1040.
Park, Junho / Liu, Xunying / Gales, Mark J. F. / Woodland, Phil C.:
"Improved neural network based language modelling and adaptation",
1041-1044.
Mikolov, Tomáš / Karafiát, Martin / Burget, Lukáš / Černocký, Jan / Khudanpur, Sanjeev:
"Recurrent neural network based language model",
1045-1048.
Jyothi, Preethi / Fosler-Lussier, Eric:
"Discriminative language modeling using simulated ASR errors",
1049-1052.
Neubig, Graham / Mimura, Masato / Mori, Shinsuke / Kawahara, Tatsuya:
"Learning a language model from continuous speech",
1053-1056.
Single-Channel Speech Enhancement
So, Stephen / Paliwal, Kuldip K.:
"Fast converging iterative Kalman filtering for speech enhancement using long and overlapped tapered windows with large side lobe attenuation",
1081-1084.
Sun, Xuejing / Yen, Kuan-Chieh / Alves, Rogerio:
"Robust noise estimation using minimum correction with harmonicity control",
1085-1088.
Triki, Mahdi:
"New insights into subspace noise tracking",
1089-1092.
Triki, Mahdi / Janse, Kees:
"Bias considerations for minimum subspace noise tracking",
1093-1096.
Ming, Ji / Srinivasan, Ramji / Crookes, Danny:
"A corpus-based approach to speech enhancement from nonstationary noise",
1097-1100.
Chen, Zhe / Cheng, You-Chi / Yin, Fuliang / Lee, Chin-Hui:
"Bandwidth expansion of speech based on wavelet transform modulus maxima vector mapping",
1101-1104.
Speech Synthesis: Miscellaneous Topics
Ogbureke, Kalu U. / Cahill, Peter / Carson-Berndsen, Julie:
"Hidden Markov models with context-sensitive observations for grapheme-to-phoneme conversion",
1105-1108.
Langner, Brian / Vogel, Stephan / Black, Alan W.:
"Evaluating a dialog language generation system: comparing the MOUNTAIN system to other NLG approaches",
1109-1112.
Mattheyses, Wesley / Latacz, Lukas / Verhelst, Werner:
"Active appearance models for photorealistic visual speech synthesis",
1113-1116.
Bellegarda, Jerome R.:
"Latent affective mapping: a novel framework for the data-driven analysis of emotion in text",
1117-1120.
Janska, Anna C. / Clark, Robert A. J.:
"Native and non-native speaker judgements on the quality of synthesized speech",
1121-1124.
Espinosa, Dominic / White, Michael / Fosler-Lussier, Eric / Brew, Chris:
"Machine learning for text selection with expressive unit-selection voices",
1125-1128.
Prosody: Basics & Applications
Ivanov, Alexei V. / Riccardi, Giuseppe / Ghosh, S. / Tonelli, S. / Stepanov, E. A.:
"Acoustic correlates of meaning structure in conversational speech",
1129-1132.
Obin, Nicolas / Rodet, Xavier / Lacheret, Anne:
"HMM-based prosodic structure model using rich linguistic context",
1133-1136.
Wollermann, Charlotte / Schröder, Bernhard / Schade, Ulrich:
"Audiovisual congruence and pragmatic focus marking",
1137-1140.
Zellers, Margaret / Gubian, Michele / Post, Brechtje:
"Redescribing intonational categories with functional data analysis",
1141-1144.
Huang, Shen / Li, Hongyan / Wang, Shijin / Liang, Jiaen / Xu, Bo:
"Exploring goodness of prosody by diverse matching templates",
1145-1148.
Rouvier, Mickael / Dufour, Richard / Linarès, Georges / Estève, Yannick:
"A language-identification inspired method for spontaneous speech detection",
1149-1152.
Bailly, Gérard / Lelong, Amélie:
"Speech dominoes and phonetic convergence",
1153-1156.
Brendel, Mátyás / Zaccarelli, Riccardo / Devillers, Laurence:
"A quick sequential forward floating feature selection algorithm for emotion detection from speech",
1157-1160.
Kiss, Géza / Santen, Jan P. H. van:
"Automated vocal emotion recognition using phoneme class specific features",
1161-1164.
Pass, Adrian / Zhang, Jianguo / Stewart, Darryl:
"Feature selection for pose invariant lip biometrics",
1165-1168.
Hussein, Hussein / Hoffmann, Rüdiger:
"Signal-based accent and phrase marking using the Fujisaki model",
1169-1172.
Kim, Jangwon / Lee, Sungbok / Narayanan, Shrikanth S.:
"A study of interplay between articulatory movement and prosodic characteristics in emotional speech production",
1173-1176.
ASR: Feature Extraction I, II
Li, Shang-wen / Sun, Liang-che / Lee, Lin-shan:
"Improved phoneme recognition by integrating evidence from spectro-temporal and cepstral features",
1177-1180.
Ravuri, Suman V. / Morgan, Nelson:
"Using spectro-temporal features to improve AFE feature extraction for ASR",
1181-1184.
Saratxaga, Ibon / Hernáez, Inma / Odriozola, Igor / Navas, Eva / Luengo, Iker / Erro, Daniel:
"Using harmonic phase information to improve ASR rate",
1185-1188.
Yamamoto, Kazumasa / Sueyoshi, Eiichi / Nakagawa, Seiichi:
"Speech recognition using long-term phase information",
1189-1192.
Zelinka, Jan / Trmal, Jan / Müller, Luděk:
"Low-dimensional space transforms of posteriors in speech recognition",
1193-1196.
Plahl, Christian / Schlüter, Ralf / Ney, Hermann:
"Hierarchical bottle neck features for LVCSR",
1197-1200.
Grézl, František / Karafiát, Martin:
"Hierarchical neural net architectures for feature extraction in ASR",
1201-1204.
Sridhar, Vivek Kumar Rangarajan / Prasad, Rohit / Natarajan, Prem:
"Mutual information analysis for feature and sensor subset selection in surface electromyography based speech recognition",
1205-1208.
Meyer, Bernd T. / Kollmeier, Birger:
"Learning from human errors: prediction of phoneme confusions based on modified ASR training",
1209-1212.
Li, Bo / Sim, Khe Chai:
"Hidden logistic linear regression for support vector machine based phone verification",
2614-2617.
Ng, Tim / Zhang, Bing / Nguyen, Long:
"Jointly optimized discriminative features for speech recognition",
2618-2621.
Müller, Florian / Mertins, Alfred:
"Invariant integration features combined with speaker-adaptation methods",
2622-2625.
Raugas, Mark / Sridhar, Vivek Kumar Rangarajan / Prasad, Rohit / Natarajan, Prem:
"Multi resolution discriminative models for subvocalic speech recognition",
2626-2629.
Valente, Fabio / Doss, Mathew Magimai / Plahl, Christian / Ravuri, Suman V. / Wang, Wen:
"A comparative large scale study of MLP features for Mandarin ASR",
2630-2633.
Do, Cong-Thanh / Pastor, Dominique / Lan, Gaël Le / Goalic, André:
"Recognizing cochlear implant-like spectrally reduced speech with HMM-based ASR: experiments with MFCCs and PLP coefficients",
2634-2637.
Speech Perception: Cross Language and Age
Kondo, Kazuhiro / Kanda, Takayuki / Kobayashi, Yosuke / Yagyu, Hiroyuki:
"Speech intelligibility of diagonally localized speech with competing noise using bone-conduction headphones",
1213-1216.
Divenyi, Pierre L.:
"Masking of vowel-analog transitions by vowel-analog distracters",
1217-1220.
Pellegrino, François / Ferragne, Emmanuel / Meunier, Fanny:
"2010, a speech oddity: phonetic transcription of reversed speech",
1221-1224.
Lin, Hsin-Yi / Fon, Janice:
"Perception on pitch reset at discourse boundaries",
1225-1228.
Dole, Marjorie / Hoen, Michel / Meunier, Fanny:
"Effect of spatial separation on speech-in-noise comprehension in dyslexic adults",
1229-1232.
Marklund, Ellen / Lacerda, Francisco / Ericsson, Anna:
"Speech categorization context effects in seven- to nine-month-old infants",
1233-1236.
Kewley-Port, Diane / Humes, Larry E. / Fogerty, Daniel:
"Changes in temporal processing of speech across the adult lifespan",
1237-1240.
Bernstein, Jared / Cheng, Jian / Suzuki, Masanori:
"Fluency and structural complexity as predictors of L2 oral proficiency",
1241-1244.
Ven, Marco van de / Tucker, Benjamin V. / Ernestus, Mirjam:
"Semantic facilitation in bilingual everyday speech comprehension",
1245-1248.
Hsieh, Bo-ren / Pan, Ho-hsien:
"L2 experience and non-native vowel categorization of L1-Mandarin speakers",
1249-1252.
Wester, Mirjam:
"Cross-lingual talker discrimination",
1253-1256.
Otake, Takashi:
"Dajare is not the lowest form of wit",
1257-1260.
SLP Systems
Torres, Rafael / Takeuchi, Shota / Kawanami, Hiromichi / Matsui, Tomoko / Saruwatari, Hiroshi / Shikano, Kiyohiro:
"Comparison of methods for topic classification in a speech-oriented guidance system",
1261-1264.
Comas, Pere R. / Turmo, Jordi / Màrquez, Lluís:
"Using dependency parsing and machine learning for factoid question answering on spoken documents",
1265-1268.
Parada, Carolina / Sethy, Abhinav / Dredze, Mark / Jelinek, Frederick:
"A spoken term detection framework for recovering out-of-vocabulary words using the web",
1269-1272.
Lee, Hung-yi / Chen, Chia-ping / Yeh, Ching-feng / Lee, Lin-shan:
"Improved spoken term detection by discriminative training of acoustic models based on user relevance feedback",
1273-1276.
Tschöpel, Sebastian / Schneider, Daniel:
"A lightweight keyword and tag-cloud retrieval algorithm for automatic speech recognition transcripts",
1277-1280.
Kanedera, Noboru / Funada, Tetsuo / Nakagawa, Seiichi:
"Lecture subtopic retrieval by retrieval keyword expansion using subordinate concept",
1281-1284.
Nanjo, Hiroaki / Iyonaga, Yusuke / Yoshimi, Takehiko:
"Spoken document retrieval for oral presentations integrating global document similarities into local document similarities",
1285-1288.
Polifroni, Joseph / Seneff, Stephanie:
"Combining word-based features, statistical language models, and parsing for named entity recognition",
1289-1292.
Zidouni, Azeddine / Rosset, Sophie / Glotin, Hervé:
"Efficient combined approach for named entity recognition in spoken language",
1293-1296.
Yella, Sree Harsha / Varma, Vasudeva / Prahallad, Kishore:
"Prominence based scoring of speech segments for automatic speech-to-speech summarization",
1297-1300.
Liu, Zihan / Xie, Lei / Feng, Wei:
"Maximum lexical cohesion for fine-grained news story segmentation",
1301-1304.
Wang, Xiaoxuan / Xie, Lei / Ma, Bin / Chng, Eng Siong / Li, Haizhou:
"Phoneme lattice based TextTiling towards multilingual story segmentation",
1305-1308.
Quality of Experiencing Speech Services (Special Session)
Schlesinger, Anton / Boone, Marinus M.:
"The characterization of the relative information content by spectral features for the objective intelligibility assessment of nonlinearly processed speech",
1309-1312.
Wältermann, Marcel / Raake, Alexander / Möller, Sebastian:
"Analytical assessment and distance modeling of speech transmission quality",
1313-1316.
Côté, Nicolas / Koehl, Vincent / Gautier-Turbin, Valérie / Raake, Alexander / Möller, Sebastian:
"An intrusive super-wideband speech quality model: DIAL",
1317-1320.
Egger, Sebastian / Schatz, Raimund / Scherer, Stefan:
"It takes two to tango - assessing the impact of delay on conversational interactivity on perceived speech quality",
1321-1324.
Möller, Sebastian / Hinterleitner, Florian / Falk, Tiago H. / Polzehl, Tim:
"Comparison of approaches for instrumentally predicting the quality of text-to-speech systems",
1325-1328.
Kiss, Imre / Polifroni, Joseph / Wang, Chao / Choueiter, Ghinwa / Phillips, Mike:
"A hybrid architecture for mobile voice user interfaces",
1329-1332.
Turunen, Markku / Hakulinen, Jaakko / Heimonen, Tomi:
"Assessment of spoken and multimodal applications: lessons learned from laboratory and field studies",
1333-1336.
Engelbrecht, Klaus-Peter / Ketabdar, Hamed / Möller, Sebastian:
"Improving cross database prediction of dialogue quality using mixture of experts",
1337-1340.
Language Processing
Guinaudeau, Camille / Gravier, Guillaume / Sébillot, Pascale:
"Improving ASR-based topic segmentation of TV programs with confidence measures and semantic relations",
1365-1368.
Luz, Saturnino / Su, Jing:
"The relevance of timing, pauses and overlaps in dialogues: detecting topic changes in scenario based meetings",
1369-1372.
Dufour, Richard / Favre, Benoit:
"Semi-supervised part-of-speech tagging in speech applications",
1373-1376.
Tantini, Frédéric / Cerisara, Christophe / Gardent, Claire:
"Memory-based active learning for French broadcast news",
1377-1380.
Gillick, Dan:
"Can conversational word usage be used to predict speaker demographics?",
1381-1384.
Liu, Chao-Hong / Wu, Chung-Hsien:
"Prosodic word-based error correction in speech recognition using prosodic word expansion and contextual information",
1385-1388.
Speech and Audio Segmentation
Hoffmann, Sarah / Pfister, Beat:
"Fully automatic segmentation for prosodic speech corpora",
1389-1392.
Khanagha, Vahid / Daoudi, Khalid / Pont, Oriol / Yahia, Hussein:
"A novel text-independent phonetic segmentation algorithm based on the microcanonical multiscale formalism",
1393-1396.
Lin, You-Yu / Wang, Yih-Ru / Liao, Yuan-Fu:
"Phone boundary detection using sample-based acoustic parameters",
1397-1400.
Musti, Utpala / Toutios, Asterios / Ouni, Slim / Colotte, Vincent / Wrobel-Dautcourt, Brigitte / Berger, Marie-Odile:
"HMM-based automatic visual speech segmentation using facial data",
1401-1404.
Wang, D. / Vogt, Robert / Sridharan, Sridha:
"Bayes factor based speaker segmentation for speaker diarization",
1405-1408.
Huang, Qiang / Cox, Stephen:
"Using high-level information to detect key audio events in a tennis game",
1409-1412.
Prosody: Analysis
Lai, Catherine:
"What do you mean, you're uncertain?: the interpretation of cue words and rising intonation in dialogue",
1413-1416.
Liu, Yi-Fen / Tseng, Shu-Chuan / Jang, Jyh-Shing Roger / Chen, C.-H. Alvin:
"Coping imbalanced prosodic unit boundary detection with linguistically-motivated prosodic features",
1417-1420.
Chen, Zhigang / Hu, Guoping / Jiang, Wei:
"Improving prosodic phrase prediction by unsupervised adaptation and syntactic features extraction",
1421-1424.
Li, Yujia / Lee, Tan:
"Perception-based automatic approximation of F0 contours in Cantonese speech",
1425-1428.
Fernandez, Raul / Ramabhadran, Bhuvana:
"Discriminative training and unsupervised adaptation for labeling prosodic events with limited training data",
1429-1432.
Cvejic, Erin / Kim, Jeesun / Davis, Chris / Gibert, Guillaume:
"Prosody for the eyes: quantifying visual prosody using guided principal component analysis",
1433-1436.
Systems for LVCSR and Rich Transcription
Parihar, Naveen / Schlüter, Ralf / Rybach, David / Hansen, Eric A.:
"Parallel lexical-tree based LVCSR on multi-core processors",
1485-1488.
Chong, Jike / Gonina, Ekaterina / You, Kisun / Keutzer, Kurt:
"Exploring recognition network representations for efficient speech inference on highly parallel platforms",
1489-1492.
Caseiro, Diamantino:
"WFST compression for automatic speech recognition",
1493-1496.
Bulyko, Ivan:
"Speech recognizer optimization under speed constraints",
1497-1500.
Metze, Florian / Hsiao, Roger / Jin, Qin / Nallasamy, Udhyakumar / Schultz, Tanja:
"The 2010 CMU GALE speech-to-text system",
1501-1504.
Nwe, Tin Lay / Sun, Hanwu / Ma, Bin / Li, Haizhou:
"Speaker diarization in meeting audio for single distant microphone",
1505-1508.
Batista, Fernando / Moniz, Helena / Trancoso, Isabel / Meinedo, Hugo / Mata, Ana Isabel / Mamede, Nuno:
"Extending the punctuation module for European Portuguese",
1509-1512.
Sakti, Sakriani / Isotani, Ryosuke / Kawai, Hisashi / Nakamura, Satoshi:
"Utilizing a noisy-channel approach for Korean LVCSR",
1513-1516.
Nußbaum-Thom, Markus / Wiesler, Simon / Sundermeyer, Martin / Plahl, Christian / Hahn, Stefan / Schlüter, Ralf / Ney, Hermann:
"The RWTH 2009 Quaero ASR evaluation system for English and German",
1517-1520.
Phonetics
Munson, Benjamin / Solum, Renata:
"When is indexical information about speech activated? evidence from a cross-modal priming experiment",
1521-1524.
Munson, Benjamin:
"The influence of actual and perceived sexual orientation on diadochokinetic rate in women and men",
1525-1528.
Yu, Kristine M.:
"Laryngealization and features for Chinese tonal recognition",
1529-1532.
Nguyen, Viet Son / Castelli, Eric / Carré, René:
"Production and perception of Vietnamese short vowels in V1V2 context",
1533-1536.
Fenk-Oczlon, Gertraud / Fenk, August:
"Measuring basic tempo across languages and some implications for speech rhythm",
1537-1540.
Hirata, Yukari / Amano, Shigeaki:
"Durational structure of Japanese single/geminate stops in three- and four-mora words spoken at varied rates",
1541-1544.
Sano, Shin-ichiro / Ooigawa, Tomohiko:
"Distribution and trichotomic realization of voiced velars in Japanese - an experimental study",
1545-1548.
Sieczkowska, Jagoda / Möbius, Bernd / Dogil, Grzegorz:
"Specification in context - devoicing processes in Polish, French, American English and German sonorants",
1549-1552.
Nielsen, Kuniko:
"Phonetic imitation of Japanese vowel devoicing",
1553-1556.
Stevens, Mary / Hajek, John:
"Post-aspiration in standard Italian: some first cross-regional acoustic evidence",
1557-1560.
Grimaldi, Mirko / Calabrese, Andrea / Sigona, Francesco / Garrapa, Luigina / Sisinni, Bianca:
"Articulatory grounding of Southern Salentino harmony processes",
1561-1564.
Tanida, Yuuki / Ueno, Taiji / Saito, Satoru / Ralph, Matthew A. Lambon:
"Effects of accent typicality and phonotactic frequency on nonword immediate serial recall performance in Japanese",
1565-1567.
Fujimura, Osamu:
"How abstract is phonetics?",
1568-1571.
Speech Production: Vocal Tract Modeling and Imaging
Lammert, Adam C. / Proctor, Michael I. / Narayanan, Shrikanth S.:
"Data-driven analysis of realtime vocal tract MRI using correlated image regions",
1572-1575.
Proctor, Michael I. / Bone, Daniel / Katsamanis, Athanasios / Narayanan, Shrikanth S.:
"Rapid semi-automatic segmentation of real-time magnetic resonance images for parametric vocal tract analysis",
1576-1579.
Kim, Yoon-Chul / Narayanan, Shrikanth S. / Nayak, Krishna S.:
"Improved real-time MRI of oral-velar coordination using a golden-ratio spiral view order",
1580-1583.
Bresch, Erik / Katsamanis, Athanasios / Goldstein, Louis / Narayanan, Shrikanth S.:
"Statistical multi-stream modeling of real-time MRI articulatory speech data",
1584-1587.
Ananthakrishnan, G. / Badin, Pierre / Vargas, Julián Andrés Valdés / Engwall, Olov:
"Predicting unseen articulations from multi-speaker articulatory models",
1588-1591.
Qin, Chao / Carreira-Perpiñán, Miguel Á.:
"Estimating missing data sequences in X-ray microbeam recordings",
1592-1595.
Qin, Chao / Carreira-Perpiñán, Miguel Á. / Farhadloo, Mohsen:
"Adaptation of a tongue shape model by local feature transformations",
1596-1599.
Lee, Sungbok / Narayanan, Shrikanth S.:
"Vocal tract contour analysis of emotional speech by the functional data curve representation",
1600-1603.
Lammert, Adam C. / Goldstein, Louis / Iskarous, Khalil:
"Locally-weighted regression for estimating the forward kinematics of a geometric vocal tract model",
1604-1607.
Reimer, Michael / Rudzicz, Frank:
"Identifying articulatory goals from kinematic data using principal differential analysis",
1608-1611.
Ming, Zuheng / Beautemps, Denis / Feng, Gang / Schmerber, Sébastien:
"Estimation of speech lip features from discrete cosinus transform",
1612-1615.
Ahmadi, Farzaneh / McLoughlin, Ian V. / Sharifzadeh, Hamid R.:
"Autoregressive modelling for linear prediction of ultrasonic speech",
1616-1619.
Speech Intelligibility Enhancement for All Ages, Health Conditions and Environments (Special Session)
Arai, Takayuki / Hodoshima, Nao:
"Enhanced speech yielding higher intelligibility for all listeners and environments",
1620-1623.
Sadjadi, Seyed Omid / Patil, Sanjay A. / Hansen, John H. L.:
"Quality conversion of non-acoustic signals for facilitating human-to-human speech communication under harsh acoustic conditions",
1624-1627.
Nakamura, Keigo / Toda, Tomoki / Saruwatari, Hiroshi / Shikano, Kiyohiro:
"The use of air-pressure sensor in electrolaryngeal speech enhancement based on statistical voice conversion",
1628-1631.
Kim, Gibak / Loizou, Philipos C.:
"A new binary mask based on noise constraints for improved speech intelligibility",
1632-1635.
Tang, Yan / Cooke, Martin:
"Energy reallocation strategies for speech enhancement in known noise conditions",
1636-1639.
Chen, Jing / Baer, Thomas / Moore, Brian C. J.:
"Effects of enhancement of spectral changes on speech quality and subjective speech intelligibility",
1640-1643.
ASR: Acoustic Model Adaptation
Breslin, Catherine / Chin, K. K. / Gales, Mark J. F. / Knill, Kate / Xu, Haitian:
"Prior information for rapid speaker adaptation",
1644-1647.
Lööf, Jonas / Schlüter, Ralf / Ney, Hermann:
"Discriminative adaptation for log-linear acoustic models",
1648-1651.
Vergyri, Dimitra / Lamel, Lori / Gauvain, Jean-Luc:
"Automatic speech recognition of multiple accented English data",
1652-1655.
Li, Jinyu / Tsao, Yu / Lee, Chin-Hui:
"Shrinkage model adaptation in automatic speech recognition",
1656-1659.
Li, Jinyu / Yu, Dong / Gong, Yifan / Deng, L.:
"Unscented transform with online distortion estimation for HMM adaptation",
1660-1663.
Seltzer, Michael L. / Acero, Alex:
"HMM adaptation using linear spline interpolation with integrated spline parameter training for robust speech recognition",
1664-1667.
SLP Systems for Information Extraction/Retrieval
Wang, Dong / King, Simon / Evans, Nicholas / Troncy, Raphaël:
"CRF-based stochastic pronunciation modeling for out-of-vocabulary spoken term detection",
1668-1671.
Chen, Chia-ping / Lee, Hung-yi / Yeh, Ching-feng / Lee, Lin-shan:
"Improved spoken term detection by feature space pseudo-relevance feedback",
1672-1675.
Jansen, Aren / Church, Kenneth / Hermansky, Hynek:
"Towards spoken term discovery at scale with zero resources",
1676-1679.
Gouvêa, Evandro / Ezzat, Tony:
"Vocabulary independent spoken query: a case for subword units",
1680-1683.
Lin, Shih-Hsiang / Yeh, Yao-Ming / Chen, Berlin:
"Extractive speech summarization - from the view of decision theory",
1684-1687.
Murray, Gabriel / Carenini, Giuseppe / Ng, Raymond:
"The impact of ASR on abstractive vs. extractive meeting summaries",
1688-1691.
Speech Representation
Deng, L. / Seltzer, Michael L. / Yu, Dong / Acero, Alex / Mohamed, Abdel-rahman / Hinton, G.:
"Binary coding of speech spectrograms using a deep auto-encoder",
1692-1695.
Nam, Juhan / Mysore, Gautham J. / Ganseman, Joachim / Lee, Kyogu / Abel, Jonathan S.:
"A super-resolution spectrogram using coupled PLCA",
1696-1699.
Tzedakis, Georgios / Pantazis, Yannis / Rosec, Olivier / Stylianou, Yannis:
"Fast least-squares solution for sinusoidal, harmonic and quasi-harmonic models",
1700-1703.
Asaei, Afsaneh / Bourlard, Hervé / Garner, Philip N.:
"Sparse component analysis for speech recognition in multi-speaker environment",
1704-1707.
Skogstad, Trond / Svendsen, Torbjørn:
"Intra-frame variability as a predictor of frame classifiability",
1708-1711.
Shimamura, Tetsuya / Nguyen, Ngoc Dinh:
"Autocorrelation and double autocorrelation based spectral representations for a noisy word recognition system",
1712-1715.
Voice Conversion
Helander, Elina / Silén, Hanna / Míguez, Joaquin / Gabbouj, Moncef:
"Maximum a posteriori voice conversion using sequential Monte Carlo methods",
1716-1719.
Lanchantin, Pierre / Rodet, Xavier:
"Dynamic model selection for spectral voice conversion",
1720-1723.
Nose, Takashi / Kobayashi, Takao:
"Speaker-independent HMM-based voice conversion using quantized fundamental frequency",
1724-1727.
Saito, Daisuke / Watanabe, Shinji / Nakamura, Atsushi / Minematsu, Nobuaki:
"Probabilistic integration of joint density model and speaker model for voice conversion",
1728-1731.
Wu, Zhi-Zheng / Kinnunen, Tomi / Chng, Eng Siong / Li, Haizhou:
"Text-independent F0 transformation with non-parallel data for voice conversion",
1732-1735.
Zhuang, Xiaodan / Wang, Lijuan / Soong, Frank K. / Hasegawa-Johnson, Mark:
"A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion",
1736-1739.
Prosody: Language-Specific Models
Karlsson, Anastasia / House, David / Svantesson, Jan-Olof / Tayanin, Damrong:
"Influence of lexical tones on intonation in Kammu",
1740-1743.
Nambu, Satoshi / Lee, Yong-cheol:
"Phonetic realization of second occurrence focus in Japanese",
1744-1747.
Kuang, Jianjing:
"Prosodic grouping and relative clause disambiguation in Mandarin",
1748-1751.
Li, Ya / Tao, Jianhua / Zhang, Meng / Pan, Shifeng / Xu, Xiaoying:
"Text-based unstressed syllable prediction in Mandarin",
1752-1755.
Duběda, Tomáš:
""flat pitch accents" in Czech",
1756-1759.
Duběda, Tomáš:
"Positional variability of pitch accents in Czech",
1760-1763.
Mandal, Shyamal Das / Saha, Arup / Basu, Tulika / Hirose, Keikichi / Fujisaki, Hiroya:
"Modeling of sentence-medial pauses in Bangla readout speech: occurrence and duration",
1764-1767.
Leemann, Adrian / Zuberbühler, Lucy:
"Declarative sentence intonation patterns in 8 Swiss German dialects",
1768-1771.
Jeon, Je Hun / Liu, Yang:
"Syllable-level prominence detection with acoustic evidence",
1772-1775.
Prasad, Sankalan / Bali, Kalika:
"Prosody cues for classification of the discourse particle "hã" in Hindi",
1776-1779.
Jia, Yuan / Li, Aijun:
"Interaction of syntax-marked focus and wh-question induced focus in standard Chinese",
1780-1783.
Al Moubayed, Samer / Beskow, Jonas:
"Prominence detection in Swedish using syllable correlates",
1784-1787.
Zhi, Na / Hirst, Daniel / Bertinetto, Pier Marco:
"Automatic analysis of the intonation of a tone language. applying the Momel algorithm to spontaneous standard Chinese (Beijing)",
1788-1791.
Ng, Raymond W. M. / Leung, Cheung-Chi / Hautamäki, Ville / Lee, Tan / Ma, Bin / Li, Haizhou:
"Towards long-range prosodic attribute modeling for language recognition",
1792-1795.
Schubert, Robert / Jokisch, Oliver / Hirschfeld, Diane:
"A modified parameterization of the Fujisaki model",
1796-1799.
ASR: Language Modeling and Speech Understanding I
Momtazi, Saeedeh / Faubel, Friedrich / Klakow, Dietrich:
"Within and across sentence boundary language model",
1800-1803.
Sarikaya, Ruhi / Chen, Stanley F. / Sethy, Abhinav / Ramabhadran, Bhuvana:
"Impact of word classing on shrinkage-based language models",
1804-1807.
Oger, Stanislas / Popescu, Vladimir / Linarès, Georges:
"Combination of probabilistic and possibilistic language models",
1808-1811.
Ballinger, Brandon / Allauzen, Cyril / Gruenstein, Alexander / Schalkwyk, Johan:
"On-demand language model interpolation for mobile speech input",
1812-1815.
Schlippe, Tim / Zhu, Chenfei / Gebhardt, Jan / Schultz, Tanja:
"Text normalization based on statistical machine translation and internet user support",
1816-1819.
Alumäe, Tanel / Kurimo, Mikko:
"Efficient estimation of maximum entropy language models with n-gram features: an SRILM extension",
1820-1823.
Gillot, Christian / Cerisara, Christophe / Langlois, David / Haton, Jean-Paul:
"Similar n-gram language model",
1824-1827.
Jongtaveesataporn, Markpong / Furui, Sadaoki:
"Topic and style-adapted language modeling for Thai broadcast news ASR",
1828-1831.
Emami, Ahmad / Kuo, Hong-Kwang J. / Zitouni, Imed / Mangu, Lidia:
"Augmented context features for Arabic speech recognition",
1832-1835.
Ortega, Lucía / Galiano, Isabel / Hurtado, Lluís-F. / Sanchis, Emilio / Segarra, Encarna:
"A statistical segment-based approach for spoken language understanding",
1836-1839.
Lecouteux, Benjamin / Rubino, Raphaël / Linarès, Georges:
"Improving back-off models with bag of words and hollow-grams",
2418-2421.
Chelba, Ciprian / Brants, Thorsten / Neveitt, Will / Xu, Peng:
"Study on interaction between entropy pruning and Kneser-Ney smoothing",
2422-2425.
Yamamoto, Hitoshi / Hanazawa, Ken / Miki, Kiyokazu / Shinoda, Koichi:
"Dynamic language model adaptation using keyword category classification",
2426-2429.
Naptali, Welly / Tsuchiya, Masatoshi / Nakagawa, Seiichi:
"Integration of cache-based model and topic dependent class model with soft clustering and soft voting",
2430-2433.
Duvert, Fréderic / De Mori, Renato:
"Conditional models for detecting lambda-functions in a spoken language understanding system",
2434-2437.
Haidar, Md. Akmal / O'Shaughnessy, Douglas:
"Novel weighting scheme for unsupervised language model adaptation using latent Dirichlet allocation",
2438-2441.
Tan, Qun Feng / Audhkhasi, Kartik / Georgiou, Panayiotis G. / Ettelaie, Emil / Narayanan, Shrikanth S.:
"Automatic speech recognition system channel modeling",
2442-2445.
Oba, Takanobu / Hori, Takaaki / Nakamura, Atsushi:
"Round-robin discrimination model for reranking ASR hypotheses",
2446-2449.
Sak, Haşim / Saraçlar, Murat / Güngör, Tunga:
"On-the-fly lattice rescoring for real-time automatic speech recognition",
2450-2453.
First and Second Language Acquisition
Cooper, Angela / Wang, Yue:
"Cantonese tone word learning by tone and non-tone language speakers",
1840-1843.
Cutler, Anne / Shanley, Janise:
"Validation of a training method for L2 continuous-speech segmentation",
1844-1847.
Yuan, Jiahong:
"Linguistic rhythm in foreign accent",
1848-1849.
Sonu, Mee / Tajima, Keiichi / Kato, Hiroaki / Sagisaka, Yoshinori:
"The effect of a word embedded in a sentence and speaking rate variation on the perceptual training of geminate and singleton consonant distinction",
1850-1853.
Tsurutani, Chiharu:
"Foreign accent matters most when timing is wrong",
1854-1857.
Hong, Hyejin / Kim, Jina / Chung, Minhwa:
"Effects of Korean learners' consonant cluster reduction strategies on English speech recognition performance",
1858-1861.
Levitt, June S. / Katz, William F.:
"The effects of EMA-based augmented visual feedback on the English speakers' acquisition of the Japanese flap: a perceptual study",
1862-1865.
Masuda, Hinako / Arai, Takayuki:
"Perception of voiceless fricatives by Japanese listeners of advanced and intermediate level English proficiency",
1866-1869.
Meister, Lya / Meister, Einar:
"Perception of Estonian vowel categories by native and non-native speakers",
1870-1873.
Shi, Qin / Li, Kun / Zhang, ShiLei / Chu, Stephen M. / Xiao, Ji / Ou, ZhiJian:
"Spoken English assessment system for non-native speakers using acoustic and prosodic features",
1874-1877.
Lyakso, Elena E. / Frolova, Olga V. / Kurazhova, Anna V. / Gaikova, Julia S.:
"Russian infants and children's sounds and speech corpuses for language acquisition studies",
1878-1881.
Monnin, Julia / Lœvenbruck, Hélène:
"Language-specific influence on phoneme development: French and Drehu data",
1882-1885.
Holliday, Jeffrey J. / Beckman, Mary E. / Mays, Chanelle:
"Did you say susi or shushi? measuring the emergence of robust fricative contrasts in English- and Japanese-acquiring children",
1886-1889.
Spoken Language Resources, Systems and Evaluation I, II
Novak, Josef R. / Dixon, Paul R. / Furui, Sadaoki:
"An empirical comparison of the T3, Juicer, HDecode and Sphinx3 decoders",
1890-1893.
Garner, Philip N. / Dines, John:
"Tracter: a lightweight dataflow framework",
1894-1897.
Davel, Marelie H. / Wet, Febe de:
"Verifying pronunciation dictionaries using conflict analysis",
1898-1901.
Roy, Brandon C. / Vosoughi, Soroush / Roy, Deb:
"Automatic estimation of transcription accuracy and difficulty",
1902-1905.
Lambert, Benjamin / Singh, Rita / Raj, Bhiksha:
"Creating a linguistic plausibility dataset with non-expert annotators",
1906-1909.
Hu, Xinhui / Isotani, Ryosuke / Kawai, Hisashi / Nakamura, Satoshi:
"Construction and evaluations of an annotated Chinese conversational corpus in travel domain for the language model of speech recognition",
1910-1913.
Hughes, Thad / Nakajima, Kaisuke / Ha, Linne / Vasu, Atul / Moreno, Pedro J. / LeBeau, Mike:
"Building transcribed speech corpora quickly and cheaply for many languages",
1914-1917.
Christensen, Heidi / Barker, Jon / Ma, Ning / Green, Phil D.:
"The CHiME corpus: a resource and a challenge for computational hearing in multisource environments",
1918-1921.
Cao, Wen / Wang, Dongning / Zhang, Jinsong / Xiong, Ziyu:
"Developing a Chinese L2 speech database of Japanese learners with narrow-phonetic labels for computer assisted pronunciation training",
1922-1925.
Ishikawa, Shogo / Kiriyama, Shinya / Takebayashi, Yoichi / Kitazawa, Shigeyoshi:
"How children acquire situation understanding skills?: a developmental analysis utilizing multimodal speech behavior corpus",
1926-1929.
Wechsung, Ina / Schaffer, Stefan / Schleicher, Robert / Naumann, Anja / Möller, Sebastian:
"The influence of expertise and efficiency on modality selection strategies and perceived mental effort",
1930-1933.
Kühnel, Christine / Weiss, Benjamin / Möller, Sebastian:
"Parameters describing multimodal interaction - definitions and three usage scenarios",
1934-1937.
Zgorzelski, Alexander / Schmitt, Alexander / Heinroth, Tobias / Minker, Wolfgang:
"Repair strategies on trial: which error recovery do users like best?",
1938-1941.
Kamvar, Maryam / Beeferman, Doug:
"Say what? why users choose to speak their web queries",
1966-1969.
Teutenberg, Jonathan / Watson, Catherine I.:
"The effect of audience familiarity on the perception of modified accent",
1970-1973.
Richmond, Korin / Clark, Robert A. J. / Fitt, Sue:
"On generating Combilex pronunciations via morphological analysis",
1974-1977.
Gödde, Florian / Möller, Sebastian:
"Say it as you mean it - analyzing free user comments in the VOICE awards corpus",
1978-1981.
Rozgić, Viktor / Xiao, Bo / Katsamanis, Athanasios / Baucom, Brian R. / Georgiou, Panayiotis G. / Narayanan, Shrikanth S.:
"A new multichannel multi modal dyadic interaction database",
1982-1985.
Lyu, Dau-Cheng / Tan, Tien-Ping / Chng, Eng Siong / Li, Haizhou:
"SEAME: a Mandarin-English code-switching speech corpus in South-East Asia",
1986-1989.
Speech Production: Analysis
Felps, Daniel / Geng, Christian / Berger, Michael / Richmond, Korin / Gutierrez-Osuna, Ricardo:
"Relying on critical articulators to estimate vocal tract spectra in an articulatory-acoustic database",
1990-1993.
Ramanarayanan, Vikram / Byrd, Dani / Goldstein, Louis / Narayanan, Shrikanth S.:
"Investigating articulatory setting - pauses, ready position, and rest - using real-time MRI",
1994-1997.
Qin, Chao / Carreira-Perpiñán, Miguel Á.:
"Articulatory inversion of American English /ɹ/ by conditional density modes",
1998-2001.
Youssef, Atef Ben / Badin, Pierre / Bailly, Gérard:
"Can tongue be recovered from face? the answer of data-driven statistical models",
2002-2005.
Torreira, Francisco / Ernestus, Mirjam:
"Phrase-medial vowel devoicing in spontaneous French",
2006-2009.
Cheng, Chierh / Xu, Yi / Gubian, Michele:
"Exploring the mechanism of tonal contraction in Taiwan Mandarin",
2010-2013.
Paralanguage & Cognition
Weiss, Benjamin / Burkhardt, Felix:
"Voice attributes affecting likability perception",
2014-2017.
Jokinen, Kristiina / Harada, Kazuaki / Nishida, Masafumi / Yamamoto, Seiichi:
"Turn-alignment using eye-gaze and speech in conversational interaction",
2018-2021.
Yap, Tet Fei / Epps, Julien / Ambikairajah, Eliathamby / Choi, Eric H. C.:
"An investigation of formant frequencies for cognitive load classification",
2022-2025.
Goudbeek, Martijn / Broersma, Mirjam:
"Language specific effects of emotion on phoneme duration",
2026-2029.
Black, Matthew / Katsamanis, Athanasios / Lee, Chi-Chun / Lammert, Adam C. / Baucom, Brian R. / Christensen, Andrew / Georgiou, Panayiotis G. / Narayanan, Shrikanth S.:
"Automatic classification of married couples' behavior using audio features",
2030-2033.
Kowadlo, Gideon / Ye, Patrick / Zukerman, Ingrid:
"Influence of gestural salience on the interpretation of spoken requests",
2034-2037.
Robust ASR Against Noise
Mitra, Vikramjit / Nam, Hosung / Espy-Wilson, Carol / Saltzman, Elliot / Goldstein, Louis:
"Robust word recognition using articulatory trajectories and gestures",
2038-2041.
Yamada, Takeshi / Nakajima, Tomohiro / Kitawaki, Nobuhiko / Makino, Shoji:
"Performance estimation of noisy speech recognition considering recognition task complexity",
2042-2045.
Faubel, Friedrich / Klakow, Dietrich:
"Estimating noise from noisy speech features with a Monte Carlo variant of the expectation maximization algorithm",
2046-2049.
Tamura, Satoshi / Hishikawa, Eriko / Taguchi, Wataru / Hayamizu, Satoru:
"Template-based spectral estimation using microphone array for speech recognition",
2050-2053.
Mushtaq, Aleem / Tsao, Yu / Lee, Chin-Hui:
"A particle filter feature compensation approach to robust speech recognition",
2054-2057.
Kim, Chanwoo / Stern, Richard M.:
"Nonlinear enhancement of onset for robust speech recognition",
2058-2061.
Badiezadegan, Shirin / Rose, Richard C.:
"Mask estimation in non-stationary noise environments for missing feature based robust speech recognition",
2062-2065.
Kim, Lae-Hoon / Kim, Kyung-Tae / Hasegawa-Johnson, Mark:
"Robust automatic speech recognition with decoder oriented ideal binary mask estimation",
2066-2069.
Ince, Gökhan / Nakadai, Kazuhiro / Rodemann, Tobias / Tsujino, Hiroshi / Imura, Jun-ichi:
"A robust speech recognition system against the ego noise of a robot",
2070-2073.
Wu, Kuo-Hao / Chen, Chia-Ping:
"Empirical mode decomposition for noise-robust automatic speech recognition",
2074-2077.
Kim, Wooil / Suh, Jun-Won / Hansen, John H. L.:
"An effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generation",
2078-2081.
Gemmeke, Jort F. / Virtanen, Tuomas:
"Artificial and online acquired noise dictionaries for noise robust ASR",
2082-2085.
Saito, Akira / Nankaku, Yoshihiko / Lee, Akinobu / Tokuda, Keiichi:
"Voice activity detection based on conditional random fields using multiple features",
2086-2089.
Zhao, Yong / Juang, Biing-Hwang:
"A comparative study of noise estimation algorithms for VTS-based robust speech recognition",
2090-2093.
Seide, Frank / Zhao, Pei:
"On using missing-feature theory with cepstral features - approximations to the multivariate integral",
2094-2097.
Sun, Yang / Gemmeke, Jort F. / Cranen, Bert / Bosch, Louis ten / Boves, Lou:
"Using a DBN to integrate sparse classification and GMM-based ASR",
2098-2101.
Voice Conversion and Speech Synthesis
Röbel, Axel:
"Shape-invariant speech transformation with the phase vocoder",
2146-2149.
Yanagisawa, Kayoko / Huckvale, Mark:
"A phonetic alternative to cross-language voice conversion in a text-dependent context: evaluation of speaker identity",
2150-2153.
Klabbers, Esther / Kain, Alexander / Santen, Jan P. H. van:
"Evaluation of speaker mimic technology for personalizing SGD voices",
2154-2157.
Ohta, Kumi / Toda, Tomoki / Ohtani, Yamato / Saruwatari, Hiroshi / Shikano, Kiyohiro:
"Adaptive voice-quality control based on one-to-many eigenvoice conversion",
2158-2161.
Villavicencio, Fernando / Bonada, Jordi:
"Applying voice conversion to concatenative singing-voice synthesis",
2162-2165.
Wang, Miaomiao / Wen, Miaomiao / Hirose, Keikichi / Minematsu, Nobuaki:
"Improved generation of fundamental frequency in HMM-based speech synthesis using generation process model",
2166-2169.
Lei, Ming / Wu, Yijian / Soong, Frank K. / Ling, Zhen-Hua / Dai, Lirong:
"A hierarchical F0 modeling method for HMM-based speech synthesis",
2170-2173.
Latorre, Javier / Gales, Mark J. F. / Zen, Heiga:
"Training a parametric-based logF0 model with the minimum generation error criterion",
2174-2177.
Wen, Miaomiao / Wang, Miaomiao / Hirose, Keikichi / Minematsu, Nobuaki:
"Improving Mandarin segmental duration prediction with automatically extracted syntax features",
2178-2181.
Niekerk, Daniel R. van / Barnard, Etienne:
"An intonation model for TTS in Sepedi",
2182-2185.
Pucher, Michael / Schabus, Dietmar / Yamagishi, Junichi:
"Synthesis of fast speech with interpolation of adapted HSMMs and its evaluation by blind and sighted listeners",
2186-2189.
Webster, Gabriel / Krstulović, Sacha / Knill, Kate:
"A comparison of pronunciation modeling approaches for HMM-TTS",
2190-2193.
Ling, Zhen-Hua / Richmond, Korin / Yamagishi, Junichi:
"HMM-based text-to-articulatory-movement prediction and analysis of critical articulators",
2194-2197.
Detection, Classification, and Segmentation
Ye, Jiaxing / Kobayashi, Takumi / Higuchi, Tetsuya:
"Audio-based sports highlight detection by Fourier local auto-correlations",
2198-2201.
Bořil, Hynek / Sangwan, Abhijeet / Hasan, Taufiq / Hansen, John H. L.:
"Automatic excitement-level detection for sports highlights generation",
2202-2205.
Bach, Jörg-Hendrik / Anemüller, Jörn:
"Detecting novel objects in acoustic scenes through classifier incongruence",
2206-2209.
Ntalampiras, Stavros / Potamitis, Ilyas / Fakotakis, Nikos:
"A multidomain approach for automatic home environmental sound classification",
2210-2213.
Cardinal, Patrick / Gupta, Vishwa / Boulianne, Gilles:
"Content-based advertisement detection",
2214-2217.
Ntalampiras, Stavros / Potamitis, Ilyas / Fakotakis, Nikos:
"Identification of abnormal audio events based on probabilistic novelty detection",
2218-2221.
Braunschweiler, Norbert / Gales, Mark J. F. / Buchholz, Sabine:
"Lightly supervised recognition for automatic alignment of large coherent speech recordings",
2222-2225.
Ben-Harush, Oshry / Lapidot, Itshak / Guterman, Hugo:
"Incremental diarization of telephone conversations",
2226-2229.
Cherla, Srikanth / Ramasubramanian, V.:
"Audio analytics by template modeling and 1-pass DP based decoding",
2230-2233.
Ziółko, Mariusz / Gałka, Jakub / Ziółko, Bartosz / Drwięga, Tomasz:
"Perceptual wavelet decomposition for speech segmentation",
2234-2237.
Keri, Venkatesh / Prahallad, Kishore:
"A comparative study of constrained and unconstrained approaches for segmentation of speech signal",
2238-2241.
Sonderegger, Morgan / Keshet, Joseph:
"Automatic discriminative measurement of voice onset time",
2242-2245.
Leng, Yi Ren / Tran, Huy Dat / Kitaoka, Norihide / Li, Haizhou:
"Selective gammatone filterbank feature for robust sound event recognition",
2246-2249.
Compressive Sensing for Speech and Language Processing (Special Session)
Yang, Allen Y. / Zhou, Zihan / Ma, Yi / Sastry, S. Shankar:
"Towards a robust face recognition system using compressive sensing",
2250-2253.
Sainath, Tara N. / Ramabhadran, Bhuvana / Nahamoo, David / Kanevsky, Dimitri / Sethy, Abhinav:
"Sparse representation features for speech recognition",
2254-2257.
Sethy, Abhinav / Sainath, Tara N. / Ramabhadran, Bhuvana / Kanevsky, Dimitri:
"Data selection for language modeling using sparse representations",
2258-2261.
Gemmeke, Jort F. / Remes, Ulpu / Palomäki, Kalle J.:
"Observation uncertainty measures for sparse imputation",
2262-2265.
Sainath, Tara N. / Maskey, Sameer R. / Kanevsky, Dimitri / Ramabhadran, Bhuvana / Nahamoo, David / Hirschberg, Julia:
"Sparse representations for text categorization",
2266-2269.
Sivaram, Garimella S. V. S. / Ganapathy, Sriram / Hermansky, Hynek:
"Sparse auto-associative neural networks: theory and application to speech recognition",
2270-2273.
ASR: Lexical and Pronunciation Modeling
Hu, Chi / Zhuang, Xiaodan / Hasegawa-Johnson, Mark:
"FSM-based pronunciation modeling using articulatory phonological code",
2274-2277.
Jouvet, Denis / Fohr, Dominique / Illina, Irina:
"Detailed pronunciation variant modeling for speech transcription",
2278-2281.
Adde, Line / Réveil, Bert / Martens, Jean-Pierre / Svendsen, Torbjørn:
"A minimum classification error approach to pronunciation variation modeling of non-native proper names",
2282-2285.
Laurent, Antoine / Meignier, Sylvain / Merlin, Teva / Deléglise, Paul:
"Acoustics-based phonetic transcription method for proper nouns",
2286-2289.
Schlippe, Tim / Ochs, Sebastian / Schultz, Tanja:
"Wiktionary as a source for automatic pronunciation extraction",
2290-2293.
Badr, Ibrahim / McGraw, Ian / Glass, James:
"Learning new word pronunciations from spoken examples",
2294-2297.
Speaker Recognition and Diarization
Chen, I-Fan / Cheng, Shih-Sian / Wang, Hsin-Min:
"Phonetic subspace mixture model for speaker diarization",
2298-2301.
Zelenák, Martin / Segura, Carlos / Hernando, Javier:
"Overlap detection for speaker diarization by fusing spectral and spatial features",
2302-2305.
Dielmann, Alfred / Garau, Giulia / Bourlard, Hervé:
"Floor holder detection and end of speaker turn prediction in meetings",
2306-2309.
Vaquero, Carlos / Ortega, Alfonso / Villalba, Jesús / Miguel, Antonio / Lleida, Eduardo:
"Confidence measures for speaker segmentation and their relation to speaker verification",
2310-2313.
Larcher, Anthony / Lévy, Christophe / Matrouf, Driss / Bonastre, Jean-François:
"Decoupling session variability modelling and speaker characterisation",
2314-2317.
Leung, Cheung-Chi / Zhu, Donglai / Lee, Kong Aik / Ma, Bin / Li, Haizhou:
"Incorporating MAP estimation and covariance transform for SVM based speaker recognition",
2318-2321.
Speech and Audio Classification
Rossignol, Stéphane / Pietquin, Olivier:
"Single-speaker/multi-speaker co-channel speech classification",
2322-2325.
Vinyals, Oriol / Friedland, Gerald / Morgan, Nelson:
"Discriminative training for hierarchical clustering in speaker diarization",
2326-2329.
Geiger, Jürgen / Wallhoff, Frank / Rigoll, Gerhard:
"GMM-UBM based open-set online speaker diarization",
2330-2333.
Golipour, Ladan / O'Shaughnessy, Douglas:
"A segment-based non-parametric approach for monophone recognition",
2334-2337.
Butko, Taras / Nadeu, Climent:
"A fast one-pass-training feature selection technique for GMM-based acoustic event detection with audio-visual data",
2338-2341.
Yamakawa, Nobuhide / Kitahara, Tetsuro / Takahashi, Toru / Komatani, Kazunori / Ogata, Tetsuya / Okuno, Hiroshi G.:
"Effects of modelling within- and between-frame temporal variations in power spectra on non-verbal sound recognition",
2342-2345.
Emotion Recognition
He, Ling / Lech, Margaret / Allen, Nicholas:
"On the importance of glottal flow spectral energy for the recognition of emotions in speech",
2346-2349.
Devillers, Laurence / Vaudable, Christophe / Chastagnol, Clément:
"Real-life emotion-related states detection in call centers: a cross-corpora study",
2350-2353.
Hassan, Ali / Damper, Robert I.:
"Multi-class and hierarchical SVMs for emotion recognition",
2354-2357.
Hübner, David / Vlasenko, Bogdan / Grosser, Tobias / Wendemuth, Andreas:
"Determining optimal features for emotion recognition from speech by applying an evolutionary algorithm",
2358-2361.
Wöllmer, Martin / Metallinou, Angeliki / Eyben, Florian / Schuller, Björn / Narayanan, Shrikanth S.:
"Context-sensitive multimodal emotion recognition from speech and facial expression using bidirectional LSTM modeling",
2362-2365.
Audhkhasi, Kartik / Narayanan, Shrikanth S.:
"Data-dependent evaluator modeling and its application to emotional valence classification from speech",
2366-2369.
Speech Coding, Modeling, and Transmission
Ma, Zhanyu / Leijon, Arne:
"Modelling speech line spectral frequencies with Dirichlet mixture models",
2370-2373.
Ma, Zhanyu / Leijon, Arne:
"PDF-optimized LSF vector quantization based on beta mixture models",
2374-2377.
Garcia, Jose Enrique / Ortega, Alfonso / Miguel, Antonio / Lleida, Eduardo:
"Non-linear predictive vector quantization of feature vectors for distributed speech recognition",
2378-2381.
Laaksonen, Lasse / Tammi, Mikko / Malenovsky, Vladimir / Vaillancourt, Tommy / Lee, Mi Suk / Yamanashi, Tomofumi / Oshikiri, Masahiro / Lamblin, Claude / Kovesi, Balazs / Miao, Lei / Zhang, Deming / Gibbs, Jon / Francois, Holly:
"Superwideband extension of G.718 and G.729.1 speech codecs",
2382-2385.
Carmona, José L. / Gómez, Angel M. / Peinado, Antonio M. / Pérez-Córdoba, José L. / González, José A.:
"A multipulse FEC scheme based on amplitude estimation for CELP codecs over packet networks",
2386-2389.
Rämö, Anssi / Toukomaa, Henri:
"Voice quality evaluation of recent open source codecs",
2390-2393.
Borgström, Bengt J. / Borgström, Per H. / Alwan, Abeer:
"Efficient HMM-based estimation of missing features, with applications to packet loss concealment",
2394-2397.
Xiao, Xiaoqiang / Nickel, Robert M.:
"Speech inventory based discriminative training for joint speech enhancement and low-rate speech coding",
2398-2401.
Gong, Qipeng / Kabal, Peter:
"Quality-based playout buffering with FEC for conversational VoIP",
2402-2405.
Tamura, Masatsune / Kagoshima, Takehiko / Akamine, Masami:
"Sub-band basis spectrum model for pitch-synchronous log-spectrum and phase based on approximation of sparse coding",
2406-2409.
Harshavardhan, Sundar / Seelamantula, Chandra Sekhar / Sreenivas, Thippur V.:
"A multimodal density function estimation approach to formant tracking",
2410-2413.
Rasilo, Heikki / Laine, Unto K. / Räsänen, Okko Johannes:
"Estimation studies of vocal tract shape trajectory using a variable length and lossy Kelly-Lochbaum model",
2414-2417.
Speech Perception: Processing and Intelligibility
Haque, Serajul / Togneri, Roberto:
"A feature extraction method for automatic speech recognition based on the cochlear nucleus",
2454-2457.
Thomas, Samuel / Patil, Kailash / Ganapathy, Sriram / Mesgarani, Nima / Hermansky, Hynek:
"A phoneme recognition framework based on auditory spectro-temporal receptive fields",
2458-2461.
Beeston, Amy V. / Brown, Guy J.:
"Perceptual compensation for effects of reverberation in speech identification: a computer model based on auditory efferent processing",
2462-2465.
Schuppler, Barbara / Ernestus, Mirjam / Dommelen, Wim van / Koreman, Jacques:
"Predicting human perception and ASR classification of word-final [t] by its acoustic sub-segmental properties",
2466-2469.
Robertson, Matthew / Brown, Guy J. / Lecluyse, Wendy / Panda, Manasa / Tan, Christine M.:
"A speech-in-noise test based on spoken digits: comparison of normal and impaired listeners using a computer model",
2470-2473.
Kagomiya, Takayuki / Nakagawa, Seiji:
"Evaluation of bone-conducted ultrasonic hearing-aid regarding transmission of paralinguistic information: a comparison with cochlear implant simulator",
2474-2477.
Jürgens, Tim / Fredelake, Stefan / Meyer, Ralf M. / Kollmeier, Birger / Brand, Thomas:
"Challenging the speech intelligibility index: macroscopic vs. microscopic prediction of sentence recognition in normal and hearing-impaired listeners",
2478-2481.
Uslar, Verena N. / Brand, Thomas / Hanke, Mirko / Carroll, Rebecca / Ruigendijk, Esther / Hamann, Cornelia / Kollmeier, Birger:
"Does sentence complexity interfere with intelligibility in noise? evaluation of the Oldenburg linguistically and audiologically controlled sentence test (OLACS)",
2482-2485.
Ramirez, Juan-Pablo / Ketabdar, Hamed / Raake, Alexander:
"Intelligibility predictions for speech against fluctuating masker",
2486-2489.
Ito, Masashi / Ohara, Keiji / Ito, Akinori / Yano, Masafumi:
"An effect of formant amplitude in vowel perception",
2490-2493.
Petkov, Christopher I. / Wilson, Benjamin:
"Functional imaging of brain regions sensitive to communication sounds in primates",
2494-2497.
Spoken Language Understanding and Spoken Language Translation I, II
Wang, Ye-Yi:
"Strategies for statistical spoken language understanding with small amount of data - an empirical study",
2498-2501.
Jabaian, Bassam / Besacier, Laurent / Lefèvre, Fabrice:
"Investigating multiple approaches for SLU portability to a new language",
2502-2505.
Austermann, Anja / Yamada, Seiji / Funakoshi, Kotaro / Nakano, Mikio:
"Learning naturally spoken commands for a robot",
2506-2509.
Albalate, Amparo / Suchindranath, Aparna / Suendermann, David / Minker, Wolfgang:
"A semi-supervised cluster-and-label approach for utterance classification",
2510-2513.
Quarteroni, Silvia / Riccardi, Giuseppe:
"Classifying dialog acts in human-human and human-machine spoken conversations",
2514-2517.
Liu, Fei / Liu, Yang:
"Exploring speaker characteristics for meeting summarization",
2518-2521.
Xie, Shasha / Lin, Hui / Liu, Yang:
"Semi-supervised extractive speech summarization via co-training algorithm",
2522-2525.
Celikyilmaz, Asli / Hakkani-Tür, Dilek:
"Extractive summarization using a latent variable model",
2526-2529.
Ettelaie, Emil / Georgiou, Panayiotis G. / Narayanan, Shrikanth S.:
"Hierarchical classification for speech-to-speech translation",
2530-2533.
Paulik, Matthias / Waibel, Alex:
"Rapid development of speech translation using consecutive interpretation",
2534-2537.
Maskey, Sameer R. / Rennie, Steven J. / Zhou, Bowen:
"Combining many alignments for speech to speech translation",
2538-2541.
Gotab, Pierre / Damnati, Geraldine / Bechet, Frederic / Delphin-Poulat, Lionel:
"Online SLU model adaptation with a partial oracle",
2862-2865.
Deshmukh, Om D. / Doddala, Harish / Verma, Ashish / Visweswariah, Karthik:
"Role of language models in spoken fluency evaluation",
2866-2869.
Yaman, Sibel / Hakkani-Tür, Dilek / Tur, Gokhan:
"Social role discovery from spoken language using dynamic Bayesian networks",
2870-2873.
Sanchez, Michelle Hewlett / Tur, Gokhan / Ferrer, Luciana / Hakkani-Tür, Dilek:
"Domain adaptation and compensation for emotion detection",
2874-2877.
Ananthakrishnan, Sankaranarayanan / Prasad, Rohit / Natarajan, Prem:
"Phrase alignment confidence for statistical machine translation",
2878-2881.
Lane, Ian R. / Waibel, Alex:
"Named-entity projection and data-driven morphological decomposition for field maintainable speech-to-speech translation systems",
2882-2885.
Social Signals in Speech (Special Session)
Brunet, Paul M. / Charfuelan, Marcela / Cowie, Roderick / Schröder, Marc / Donnan, Hastings / Douglas-Cowie, Ellen:
"Detecting politeness and efficiency in a cooperative social interaction",
2542-2545.
Campbell, Nick / Scherer, Stefan:
"Comparing measures of synchrony and alignment in dialogue speech timing with respect to turn-taking activity",
2546-2549.
Kurtić, Emina / Brown, Guy J. / Wells, Bill:
"Resources for turn competition in overlap in multi-party conversations: speech rate, pausing and duration",
2550-2553.
Truong, Khiet P. / Heylen, Dirk:
"Disambiguating the functions of conversational sounds with prosody: the case of yeah",
2554-2557.
Charfuelan, Marcela / Schröder, Marc / Steiner, Ingmar:
"Prosody and voice quality of vocal social signals: the case of dominance in scenario meetings",
2558-2561.
Neiberg, D. / Gustafson, J.:
"The prosody of Swedish conversational grunts",
2562-2565.
Physiology and Pathology of Spoken Language
Mertens, Christophe / Grenez, Francis / Crevier-Buchman, Lise / Schoentgen, Jean:
"Reliable tracking based on speech sample salience of vocal cycle length perturbations",
2566-2569.
Kasuya, Hideki / Yoshida, Hajime / Ebihara, Satoshi / Mori, Hiroki:
"Longitudinal changes of selected voice source parameters",
2570-2573.
Alpan, Ali / Schoentgen, Jean / Maryn, Youri / Grenez, Francis:
"Automatic perceptual categorization of disordered connected speech",
2574-2577.
Kim, Heejin / Rong, Panying / Loucks, Torrey M. / Hasegawa-Johnson, Mark:
"Kinematic analysis of tongue movement control in spastic dysarthria",
2578-2581.
Jacobi, Irene / Molen, Lisette van der / Rossum, Maya van / Hilgers, Frans:
"Pre- and short-term posttreatment vocal functioning in patients with advanced head and neck cancer treated with concomitant chemoradiotherapy",
2582-2585.
Ma, Joan K. Y. / Hoffmann, Rüdiger:
"Acoustic analysis of intonation in Parkinson's disease",
2586-2589.
Speaker Diarization
Vaquero, Carlos / Vinyals, Oriol / Friedland, Gerald:
"A hybrid approach to online speaker diarization",
2638-2641.
Bozonnet, Simon / Evans, Nicholas / Anguera, Xavier / Vinyals, Oriol / Friedland, Gerald / Fredouille, Corinne:
"System output combination for improved speaker diarization",
2642-2645.
Bozonnet, Simon / Evans, Nicholas / Fredouille, Corinne / Wang, Dong / Troncy, Raphaël:
"An integrated top-down/bottom-up approach to speaker diarization",
2646-2649.
Vijayasenan, Deepu / Valente, Fabio / Bourlard, Hervé:
"Advances in fast multistream diarization based on the information bottleneck framework",
2650-2653.
Garau, Giulia / Dielmann, Alfred / Bourlard, Hervé:
"Audio-visual synchronisation for speaker diarisation",
2654-2657.
Han, Kyu J. / Narayanan, Shrikanth S.:
"An improved cluster model selection method for agglomerative hierarchical speaker clustering using incremental Gaussian mixture models",
2658-2661.
Ward, Nigel G. / Fuentes, Olac / Vega, Alejandro:
"Dialog prediction for a general model of turn-taking",
2662-2665.
Herbig, Tobias / Gerl, Franz / Minker, Wolfgang:
"Speaker tracking in an unsupervised speech controlled system",
2666-2669.
Lopez-Otero, Paula / Docio-Fernandez, Laura / Garcia-Mateo, Carmen:
"MultiBIC: an improved speaker segmentation technique for TV shows",
2670-2673.
Multi-Modal ASR, Including Audio-Visual ASR
Hosom, John-Paul / Jakobs, Tom / Baker, Allen / Fager, Susan:
"Automatic speech recognition for assistive writing in speech supplemented word prediction",
2674-2677.
Karpov, Alexey / Ronzhin, Andrey / Markov, Konstantin / Železný, Miloš:
"Viseme-dependent weight optimization for CHMM-based audio-visual speech recognition",
2678-2681.
Terry, Louis H. / Livescu, Karen / Pierrehumbert, Janet B. / Katsaggelos, Aggelos K.:
"Audio-visual anticipatory coarticulation modeling by human and machine",
2682-2685.
Janke, Matthias / Wand, Michael / Schultz, Tanja:
"Impact of lack of acoustic feedback in EMG-based silent speech recognition",
2686-2689.
Ni, Chong-Jia / Liu, Wenju / Xu, Bo:
"Using prosody to improve Mandarin automatic speech recognition",
2690-2693.
Tamura, Satoshi / Ishikawa, Masato / Hashiba, Takashi / Takeuchi, Shin'ichi / Hayamizu, Satoru:
"A robust audio-visual speech recognition using audio-visual voice activity detection",
2694-2697.
Kolossa, Dorothea / Chong, Jike / Zeiler, Steffen / Keutzer, Kurt:
"Efficient manycore CHMM speech recognition for audiovisual and multistream data",
2698-2701.
Yoshida, Takami / Nakadai, Kazuhiro:
"Two-layered audio-visual integration in voice activity detection and automatic speech recognition for robots",
2702-2705.
Heracleous, Panikos / Hagita, Norihiro:
"Non-audible murmur recognition based on fusion of audio and visual streams",
2706-2709.
Speaker and Language Recognition
BenZeghiba, Mohamed Faouzi / Gauvain, Jean-Luc / Lamel, Lori:
"Improved n-gram phonotactic models for language recognition",
2710-2713.
Boonsuk, Sirinoot / Zhu, Donglai / Ma, Bin / Suchato, Atiwong / Punyabukkana, Proadpran / Thatphithakkul, Nattanun / Wutiwiwatchai, Chai:
"A study of term weighting in phonotactic approach to spoken language recognition",
2714-2717.
Siniscalchi, Sabato Marco / Reed, Jeremy / Svendsen, Torbjørn / Lee, Chin-Hui:
"Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognition",
2718-2721.
Imseng, David / Doss, Mathew Magimai / Bourlard, Hervé:
"Hierarchical multilayer perceptron based language identification",
2722-2725.
Martin, Alvin F. / Greenberg, Craig S.:
"The NIST 2010 speaker recognition evaluation",
2726-2729.
Cheng, Shih-Sian / Chen, I-Fan / Wang, Hsin-Min:
"Bayesian speaker recognition using Gaussian mixture model and Laplace approximation",
2730-2733.
Kinnunen, Tomi / Saeidi, Rahim / Sandberg, Johan / Hansson-Sandsten, Maria:
"What else is new than the Hamming window? robust MFCCs for speaker recognition via multitapering",
2734-2737.
Sarkar, Achintya Kumar / Umesh, S.:
"Fast computation of speaker characterization vector using MLLR and sufficient statistics in anchor model framework",
2738-2741.
Karam, Zahi N. / Campbell, William M.:
"Graph-embedding for speaker recognition",
2742-2745.
You, Chang Huai / Li, Haizhou / Lee, Kong Aik:
"A hybrid modeling strategy for GMM-SVM speaker recognition with adaptive relevance factor",
2746-2749.
Harshavardhan, Sundar / Sreenivas, Thippur V.:
"Robust mixture modeling using t-distribution: application to speaker ID",
2750-2753.
Jung, Chi-Sang / Han, Kyu J. / Seo, Hyunson / Narayanan, Shrikanth S. / Kang, Hong-Goo:
"A variable frame length and rate algorithm based on the spectral kurtosis measure for speaker verification",
2754-2757.
Source Localization and Separation
Hayashida, Kohei / Morise, Masanori / Nishiura, Takanobu:
"Near field sound source localization based on cross-power spectrum phase analysis with multiple microphones",
2758-2761.
Choi, Jinho / Yoo, Chang D.:
"A maximum a posteriori sound source localization in reverberant and noisy conditions",
2762-2765.
Nakatani, Tomohiro / Araki, Shoko / Yoshioka, Takuya / Fujimoto, Masakiyo:
"Multichannel source separation based on source location cue with log-spectral shaping by hidden Markov source model",
2766-2769.
Chau, Duc Thanh / Li, Junfeng / Akagi, Masato:
"A DOA estimation algorithm based on equalization-cancellation theory",
2770-2773.
Habib, Tania / Romsdorfer, Harald:
"Concurrent speaker localization using multi-band position-pitch (M-PoPi) algorithm with spectro-temporal pre-processing",
2774-2777.
Song, Ji-Hyun / Lee, Kyu-Ho / Park, Yun-Sik / Kang, Sang-Ick / Chang, Joon-Hyuk:
"On using Gaussian mixture model for double-talk detection in acoustic echo suppression",
2778-2781.
Demir, Cemil / Cemgil, A. Taylan / Saraçlar, Murat:
"Catalog-based single-channel speech-music separation",
2782-2785.
Hu, Ke / Wang, DeLiang:
"Unvoiced speech segregation based on CASA and spectral subtraction",
2786-2789.
Hu, Ke / Wang, DeLiang:
"Unsupervised sequential organization for cochannel speech separation",
2790-2793.
INTERSPEECH 2010 Paralinguistic Challenge (Special Session)
Schuller, Björn / Steidl, Stefan / Batliner, Anton / Burkhardt, Felix / Devillers, Laurence / Müller, Christian / Narayanan, Shrikanth S.:
"The INTERSPEECH 2010 paralinguistic challenge",
2794-2797.
Lingenfelser, Florian / Wagner, Johannes / Vogt, Thurid / Kim, Jonghwa / André, Elisabeth:
"Age and gender classification from speech using decision level fusion and ensemble based techniques",
2798-2801.
Jeon, Je Hun / Xia, Rui / Liu, Yang:
"Level of interest sensing in spoken dialog using multi-level fusion of acoustic and lexical evidence",
2802-2805.
Nguyen, Phuoc / Le, Trung / Tran, Dat / Huang, Xu / Sharma, Dharmendra:
"Fuzzy support vector machines for age and gender classification",
2806-2809.
Gajšek, Rok / Žibert, Janez / Justin, Tadej / Štruc, Vitomir / Vesnicer, Boštjan / Mihelič, France:
"Gender and affect recognition based on GMM and GMM-UBM modeling with relevance MAP estimation",
2810-2813.
Porat, Royi / Lange, Dan / Zigel, Yaniv:
"Age recognition based on speech signals using weights supervector",
2814-2817.
Meinedo, Hugo / Trancoso, Isabel:
"Age and gender classification using fusion of acoustic and prosodic features",
2818-2821.
Kockmann, Marcel / Burget, Lukáš / Černocký, Jan:
"Brno University of Technology system for INTERSPEECH 2010 paralinguistic challenge",
2822-2825.
Li, Ming / Jung, Chi-Sang / Han, Kyu J.:
"Combining five acoustic level modeling methods for automatic speaker age and gender recognition",
2826-2829.
Bocklet, Tobias / Stemmer, Georg / Zeissler, Viktor / Nöth, Elmar:
"Age and gender recognition based on multiple systems - early vs. late fusion",
2830-2833.
Feld, Michael / Burkhardt, Felix / Müller, Christian:
"Automatic speaker age and gender recognition in the car for tailoring dialog and mobile services",
2834-2837.
Signal Processing for Music and Song
Aikawa, Kiyoaki / Uenuma, Junko / Akitake, Tomoko:
"Acoustic correlates of voice quality improvement by voice training",
2886-2889.
Dong, Minghui / Chan, Paul / Cen, Ling / Li, Haizhou / Teo, Jason / Kua, Ping Jen:
"Phonetic segmentation of singing voice using MIDI and parallel speech",
2890-2893.
Saino, Keijiro / Tachibana, Makoto / Kenmochi, Hideki:
"A singing style modeling system for singing voice synthesizers",
2894-2897.
Yang, Jingzhou / Liu, Jia / Zhang, Wei-Qiang:
"A fast query by humming system based on notes",
2898-2901.
Jo, Seokhwan / Joo, Sihyun / Yoo, Chang D.:
"Melody pitch estimation based on range estimation and candidate extraction using harmonic structure model",
2902-2905.
Park, Jihoon / Kim, Kwangki / Seo, Jeongil / Hahn, Minsoo:
"Modified spatial audio object coding scheme with harmonic extraction and elimination structure for interactive audio service",
2906-2909.
Modeling First Language Acquisition
Bergmann, Christina / Gubian, Michele / Boves, Lou:
"Modelling the effect of speaker familiarity and noise on infant word recognition",
2910-2913.
Miyazawa, Kouki / Kikuchi, Hideaki / Mazuka, Reiko:
"Unsupervised learning of vowels from continuous speech based on self-organized phoneme acquisition model",
2914-2917.
Plummer, Andrew R. / Beckman, Mary E. / Belkin, Mikhail / Fosler-Lussier, Eric / Munson, Benjamin:
"Learning speaker normalization using semisupervised manifold alignment",
2918-2921.
Räsänen, Okko Johannes:
"Fully unsupervised word learning from continuous speech using transitional probabilities of atomic acoustic events",
2922-2925.
Bosch, Louis ten / Boves, Lou:
"Language acquisition and cross-modal associations: computational simulation of the result of infant studies",
2926-2929.
Versteegh, Maarten / Bosch, Louis ten / Boves, Lou:
"Active word learning under uncertain input conditions",
2930-2933.
Discourse and Dialogue
Lavalley, Rémi / Clavel, Chloé / Bellot, Patrice / El-Bèze, Marc:
"Combining text categorization and dialog modeling for speaker role identification on call center conversations",
3062-3065.
Nakamura, Akira / Hayamizu, Satoru:
"Topic-dependent n-gram models based on optimization of context lengths in LDA",
3066-3069.
Obin, Nicolas / Dellwo, Volker / Lacheret, Anne / Rodet, Xavier:
"Expectations for discourse genre identification: a prosodic study",
3070-3073.
Granell, Ramon / Pulman, Stephen / Martínez-Hinarejos, Carlos-D. / Benedí, José Miguel:
"Dialogue act tagging and segmentation with a single perceptron",
3074-3077.
Fujii, Yasuhisa / Yamamoto, Kazumasa / Nakagawa, Seiichi:
"Improving the readability of class lecture ASR results using a confusion network",
3078-3081.
Voice Activity and Turn Detection
Kim, Sang-Kyun / Choi, Jae-Hun / Kang, Sang-Ick / Song, Ji-Hyun / Chang, Joon-Hyuk:
"Toward detecting voice activity employing soft decision in second-order conditional MAP",
3082-3085.
Lu, Xugang / Unoki, Masashi / Isotani, Ryosuke / Kawai, Hisashi / Nakamura, Satoshi:
"Voice activity detection in a regularized reproducing kernel Hilbert space",
3086-3089.
Wu, Ji / Zhang, Xiao-lei / Li, Wei:
"A new VAD framework using statistical model and human knowledge based empirical rule",
3090-3093.
Huggins, Mark / Smolenski, Brett / Lawson, Aaron:
"Adaptive high accuracy approaches to speech activity detection in noisy and hostile audio environments",
3094-3097.
Ghosh, Prasanta Kumar / Tsiartas, Andreas / Georgiou, Panayiotis G. / Narayanan, Shrikanth S.:
"Robust voice activity detection in stereo recording with crosstalk",
3098-3101.
Fujimoto, Masakiyo / Watanabe, Shinji / Nakatani, Tomohiro:
"Voice activity detection using frame-wise model re-estimation method based on Gaussian pruning with weight normalization",
3102-3105.
Lee, Bowon / Mukherjee, Debargha:
"Spectral entropy-based voice activity detector for videoconferencing systems",
3106-3109.
Dean, David / Sridharan, Sridha / Vogt, Robert / Mason, Michael:
"The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms",
3110-3113.
Yu, Tao / Hansen, John H. L.:
"A Bayesian approach to voice activity detection using multiple statistical models and discriminative training",
3114-3117.
Ghaemmaghami, Houman / Baker, Brendan / Vogt, Robert / Sridharan, Sridha:
"Noise robust voice activity detection using features extracted from the time-domain autocorrelation function",
3118-3121.
Oonishi, Tasuku / Iwano, Koji / Furui, Sadaoki:
"VAD-measure-embedded decoder with online model adaptation",
3122-3125.
Deng, Shiwen / Han, Jiqing:
"Robust statistical voice activity detection using a likelihood ratio sign test",
3126-3129.
Ivanov, Alexei V. / Riccardi, Giuseppe:
"Automatic turn segmentation in spoken conversations",
3130-3133.
Kawaguchi, Yohei / Togami, Masahito / Obuchi, Yasunari:
"Turn taking-based conversation detection by using DOA estimation",
3134-3137.