Table of Contents and Access to Abstracts
Keynotes
Hermansky, Hynek:
"My adventures with speech".
Munson, Benjamin:
"On the interaction of social and linguistic factors in phonetic variation in typical and atypical speakers".
Giraud, Anne-Lise:
"Are cortical oscillations a useful ingredient of speech perception?".
Clerc, Maureen:
"Verbal communication through brain computer interfaces".
Systems for Search/Retrieval of Speech Documents
Anguera, Xavier:
"Information retrieval-based dynamic time warping",
1-5.
Can, Doğan / Narayanan, Shrikanth:
"On the computation of document frequency statistics from spoken corpora using factor automata",
6-10.
Katsurada, Kouichi / Miura, Seiichi / Seng, Kheang / Iribe, Yurie / Nitta, Tsuneo:
"Acceleration of spoken term detection using a suffix array by assigning optimal threshold values to sub-keywords",
11-14.
Mandal, Arindam / Hout, Julien van / Tam, Yik-Cheung / Mitra, Vikramjit / Lei, Yun / Zheng, Jing / Vergyri, Dimitra / Ferrer, Luciana / Graciarena, Martin / Kathol, Andreas / Franco, Horacio:
"Strategies for high accuracy keyword detection in noisy channels",
15-19.
Abad, Alberto / Rodríguez-Fuentes, Luis Javier / Penagarikano, Mikel / Varona, Amparo / Bordel, Germán:
"On the calibration and fusion of heterogeneous spoken term detection systems",
20-24.
Narumi, Shiro / Konno, Kazuma / Nakano, Takuya / Itoh, Yoshiaki / Kojima, Kazunori / Ishigame, Masaaki / Tanaka, Kazuyo / Lee, Shi-wook:
"Intensive acoustic models constructed by integrating low-occurrence models for spoken term detection",
25-28.
Speech Analysis I-IV
Kane, John / Yanushevskaya, Irena / Dalton, John / Gobl, Christer / Ní Chasaide, Ailbhe:
"Using phonetic feature extraction to determine optimal speech regions for maximising the effectiveness of glottal source analysis",
29-33.
Kawahara, Hideki / Morise, Masanori / Toda, Tomoki / Nisimura, Ryuichi / Irino, Toshio:
"Beyond bandlimited sampling of speech spectral envelope imposed by the harmonic structure of voiced sounds",
34-38.
Lee, JeeSok / Soong, Frank K. / Kang, Hong-Goo:
"A source-filter based adaptive harmonic model and its application to speech prosody modification",
39-43.
Ramesh, K. / Prasanna, S. R. M. / Govind, D.:
"Detection of glottal opening instants using Hilbert envelope",
44-48.
Gowda, Dhananjaya / Pohjalainen, Jouni / Kurimo, Mikko / Alku, Paavo:
"Robust formant detection using group delay function and stabilized weighted linear prediction",
49-53.
Hézard, Thomas / Hélie, Thomas / Doval, Boris:
"A source-filter separation algorithm for voiced sounds based on an exact anticausal/causal pole decomposition for the class of periodic signals",
54-58.
Godoy, Elizabeth / Koutsogiannaki, M. / Stylianou, Yannis:
"Assessing the intelligibility impact of vowel space expansion via clear speech-inspired frequency warping",
1169-1173.
Jensen, Jesper / Taal, Cees H.:
"Prediction of intelligibility of noisy and time-frequency weighted speech based on mutual information between amplitude envelopes",
1174-1178.
Jokinen, Emma / Takanen, Marko / Alku, Paavo:
"Frequency-adaptive post-filtering for intelligibility enhancement of narrowband telephone speech",
1179-1183.
Li, Junfeng / Chen, Fei / Akagi, Masato / Yan, Yonghong:
"Comparative investigation of objective speech intelligibility prediction measures for noise-reduced signals in Mandarin and Japanese",
1184-1187.
Hines, Andrew / Skoglund, Jan / Kokaram, Anil / Harte, Naomi:
"Monitoring the effects of temporal clipping on VoIP speech quality",
1188-1192.
Yuan, Jiahong:
"The spectral dynamics of vowels in Mandarin Chinese",
1193-1197.
Slaney, Malcolm / Shriberg, Elizabeth / Huang, Jui-Ting:
"Pitch-gesture modeling using subband autocorrelation change detection",
1911-1915.
Gangamohan, P. / Kadiri, Sudarsana Reddy / Yegnanarayana, B.:
"Analysis of emotional speech at subsegmental level",
1916-1920.
Morise, Masanori / Kawahara, Hideki / Ozawa, Kenji:
"Periodicity extraction for voiced sounds with multiple periodicity",
1921-1925.
Taylor, John H. / Milner, Ben:
"Modelling and estimation of the fundamental frequency of speech using a hidden Markov model",
1926-1930.
Pohjalainen, Jouni / Alku, Paavo:
"Extended weighted linear prediction using the autocorrelation snapshot — a robust speech analysis method and its application to recognition of vocal emotions",
1931-1935.
Asgari, Meysam / Shafran, Izhak:
"Improving the accuracy and the robustness of harmonic model for pitch estimation",
1936-1940.
Kane, John / Scherer, Stefan / Morency, Louis-Philippe / Gobl, Christer:
"A comparative study of glottal open quotient estimation techniques",
1658-1662.
Kasess, Christian H. / Kreuzer, Wolfgang:
"Estimation of multiple-branch vocal tract models: the influence of prior assumptions",
1663-1667.
Geiger, Jürgen T. / Eyben, Florian / Schuller, Björn / Rigoll, Gerhard:
"Detecting overlapping speech with long short-term memory recurrent neural networks",
1668-1672.
Sasou, Akira:
"Evaluation of fundamental validity in applying AR-HMM with automatic topology generation to pathology voice analysis",
1673-1676.
Adiga, Nagaraj / Prasanna, S. R. M.:
"Significance of instants of significant excitation for source modeling",
1677-1681.
Arya, Devanshu / Raj, Anant / Hegde, Rajesh M.:
"Significance of variable height-bandwidth group delay filters in the spectral reconstruction of speech",
1682-1686.
Patil, Hemant A. / Patel, Tanvina B.:
"Nonlinear prediction of speech signal using Volterra-Wiener series",
1687-1691.
Satt, Aharon / Sorin, Alexander / Toledo-Ronen, Orith / Barkan, Oren / Kompatsiaris, Ioannis / Kokonozi, Athina / Tsolaki, Magda:
"Evaluation of speech-based protocol for detection of early-stage dementia",
1692-1696.
Azarov, Elias / Vashkevich, Maxim / Petrovsky, Alexander:
"Instantaneous harmonic representation of speech using multicomponent sinusoidal excitation",
1697-1701.
Babacan, Onur / Drugman, Thomas / d'Alessandro, Nicolas / Henrich, Nathalie / Dutoit, Thierry:
"A quantitative comparison of glottal closure instant estimation algorithms on a large variety of singing sounds",
1702-1706.
Gómez-García, J. A. / Godino-Llorente, Juan Ignacio / Castellanos-Domínguez, G.:
"Automatic gender recognition in normal and pathological speech",
1707-1711.
Cai, Shanqing / Bunnell, H. Timothy / Patel, Rupal:
"Unsupervised vocal-tract length estimation through model-based acoustic-to-articulatory inversion",
1712-1716.
Mirzaei, Sayeh / Van hamme, Hugo / Norouzi, Yaser:
"Model order estimation using Bayesian NMF for discovering phone patterns in spoken utterances",
1717-1721.
Language and Dialect Recognition
Liu, Weiwei / Zhang, Wei-Qiang / Li, Zhiyi / Liu, Jia:
"Parallel absolute-relative feature based phonotactic language recognition",
59-63.
Diez, Mireia / Varona, Amparo / Penagarikano, Mikel / Rodríguez-Fuentes, Luis Javier / Bordel, Germán:
"Dimensionality reduction of phone log-likelihood ratio features for spoken language recognition",
64-68.
Ma, Jeff / Zhang, Bing / Matsoukas, Spyros / Mallidi, Sri Harish / Li, Feipeng / Hermansky, Hynek:
"Improvements in language identification on the RATS noisy speech corpus",
69-73.
Soufifar, Mehdi / Burget, Lukáš / Plchot, Oldřich / Cumani, Sandro / Černocký, Jan:
"Regularized subspace n-gram model for phonotactic ivector extraction",
74-78.
Behravan, Hamid / Hautamäki, Ville / Kinnunen, Tomi:
"Foreign accent detection from spoken Finnish using i-vectors",
79-83.
McLaren, Mitchell / Lawson, Aaron / Lei, Yun / Scheffer, Nicolas:
"Adaptive Gaussian backend for robust language identification",
84-88.
ASR — Neural Networks
Paulik, Matthias:
"Lattice-based training of bottleneck feature extraction neural networks",
89-93.
Gehring, Jonas / Lee, Wonkyum / Kilgour, Kevin / Lane, Ian / Miao, Yajie / Waibel, Alex:
"Modular combination of deep neural networks for acoustic modeling",
94-98.
Chang, Shuo-Yiin / Morgan, Nelson:
"Informative spectro-temporal bottleneck features for noise-robust speech recognition",
99-103.
Yan, Zhi-Jie / Huo, Qiang / Xu, Jian:
"A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR",
104-108.
Rath, Shakti P. / Povey, Daniel / Veselý, Karel / Černocký, Jan:
"Improved feature processing for deep neural networks",
109-113.
Vinyals, Oriol / Morgan, Nelson:
"Deep vs. wide: depth on a budget for robust speech recognition",
114-118.
Speech Acoustics
Braun, Angelika:
"An early case of “VOT”",
119-122.
Fox, Robert Allen / Jacewicz, Ewa / Hart, Jessica:
"Pitch pattern variations in three regional varieties of American English",
123-127.
Liénard, Jean-Sylvain / Barras, Claude:
"Fine-grain voice strength estimation from vowel spectral cues",
128-132.
Godoy, Elizabeth / Mayo, Catherine / Stylianou, Yannis:
"Linking loudness increases in normal and Lombard speech to decreasing vowel formant separation",
133-137.
Motoki, Kunitoshi:
"Three-dimensional rectangular vocal-tract model with asymmetric wall impedances",
138-142.
Airaksinen, Manu / Story, Brad / Alku, Paavo:
"Quasi closed phase analysis for glottal inverse filtering",
143-147.
Paralinguistic Challenge (Special Session)
Schuller, Björn / Steidl, Stefan / Batliner, Anton / Vinciarelli, Alessandro / Scherer, Klaus / Ringeval, Fabien / Chetouani, Mohamed / Weninger, Felix / Eyben, Florian / Marchi, Erik / Mortillaro, Marcello / Salamin, Hugues / Polychroniou, Anna / Valente, Fabio / Kim, Samuel:
"The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism",
148-152.
Janicki, Artur:
"Non-linguistic vocalisation recognition based on hybrid GMM-SVM approach",
153-157.
Oh, Jieun / Cho, Eunjoon / Slaney, Malcolm:
"Characteristic contours of syllabic-level units in laughter",
158-162.
Krikke, Teun F. / Truong, Khiet P.:
"Detection of nonverbal vocalizations using Gaussian mixture models: looking for fillers and laughter in conversational speech",
163-167.
Wagner, Johannes / Lingenfelser, Florian / André, Elisabeth:
"Using phonetic patterns for detecting social cues in natural conversations",
168-172.
Gupta, Rahul / Audhkhasi, Kartik / Lee, Sungbok / Narayanan, Shrikanth:
"Paralinguistic event detection from speech using probabilistic time-series smoothing and masking",
173-177.
An, Guozhen / Brizan, David Guy / Rosenberg, Andrew:
"Detecting laughter and filled pauses using syllable-based features",
178-181.
Bone, Daniel / Chaspari, Theodora / Audhkhasi, Kartik / Gibson, James / Tsiartas, Andreas / Segbroeck, Maarten Van / Li, Ming / Lee, Sungbok / Narayanan, Shrikanth:
"Classifying language-related developmental disorders from speech cues: the promise and the potential confounds",
182-186.
Kirchhoff, Katrin / Liu, Yuzong / Bilmes, Jeff:
"Classification of developmental disorders from speech signals using submodular feature selection",
187-190.
Asgari, Meysam / Bayestehtashk, Alireza / Shafran, Izhak:
"Robust and accurate features for detecting and diagnosing autism spectrum disorders",
191-194.
Martínez, David / Ribas, Dayana / Lleida, Eduardo / Ortega, Alfonso / Miguel, Antonio:
"Suprasegmental information modelling for autism disorder spectrum and specific language impairment classification",
195-199.
Grèzes, Félix / Richards, Justin / Rosenberg, Andrew:
"Let me finish: automatic conflict detection using speaker overlap",
200-204.
Sethu, Vidhyasaharan / Epps, Julien / Ambikairajah, Eliathamby / Li, Haizhou:
"GMM based speaker variability compensated system for INTERSPEECH 2013 ComParE emotion challenge",
205-209.
Räsänen, Okko / Pohjalainen, Jouni:
"Random subset feature selection in automatic recognition of developmental disorders, affective states, and level of conflict from speech",
210-214.
Lee, Hung-yi / Hu, Ting-yao / Jing, How / Chang, Yun-Fan / Tsao, Yu / Kao, Yu-Cheng / Pao, Tsang-Long:
"Ensemble of machine learning and acoustic segment model techniques for speech emotion and autism spectrum disorders recognition",
215-219.
Gosztolya, Gábor / Busa-Fekete, Róbert / Tóth, László:
"Detecting autism, emotions and social signals using AdaBoost",
220-224.
Perception of Prosody
Niebuhr, Oliver:
"Resistance is futile — the intonation between continuation rise and calling contour in German",
225-229.
Mixdorff, Hansjörg / Niebuhr, Oliver:
"The influence of F0 contour continuity on prominence perception",
230-234.
Smith, Caroline L. / Edmunds, Paul:
"Native English listeners' perceptions of prosody in L1 and L2 reading",
235-238.
Tsurutani, Chiharu / Luo, Dean:
"Naturalness judgement of L2 Mandarin Chinese — does timing matter?",
239-242.
Aalto, Daniel / Šimko, Juraj / Vainio, Martti:
"Language background affects the strength of the pitch bias in a duration discrimination task",
243-247.
Zellers, Margaret:
"Pitch and lengthening as cues to turn transition in Swedish",
248-252.
Bissiri, Maria Paola / Zellers, Margaret:
"Perception of glottalization in varying pitch contexts across languages",
253-257.
Walsh, Michael / Schweitzer, Katrin / Schauffler, Nadja:
"Exemplar-based pitch accent categorisation using the generalized context model",
258-262.
Braun, Bettina / Asano, Yuki:
"Double contrast is signalled by prenuclear and nuclear accent types alone, not by f0-plateaux",
263-266.
Correia, Susana / Frota, Sónia / Butler, Joseph / Vigário, Marina:
"Word stress perception in European Portuguese",
267-271.
Arnold, Denis / Wagner, Petra / Baayen, R. Harald:
"Using generalized additive models and random forests to model prosodic prominence in German",
272-276.
Pfitzinger, Hartmut R. / Mixdorff, Hansjörg:
"Perceiving speech rate differences between natural and time-scale modified utterances",
277-281.
Prosody, Phonetics of Language Varieties
Barbosa, Plínio A. / Eriksson, Anders / Åkesson, Joel:
"On the robustness of some acoustic parameters for signalling word stress across styles in Brazilian Portuguese",
282-286.
Lyu, Shao-ren / Pan, Ho-hsien:
"Reexamine the sandhi rules and the merging tones in Hakka language",
287-290.
Tabain, Marija / Beare, Richard / Butcher, Andrew:
"A preliminary spectral analysis of palatal and velar stop bursts in Pitjantjatjara",
291-295.
Mahanta, Shakuntala / Twaha, A. I.:
"Presentational focus realisation in Nalbaria variety of Assamese",
296-299.
Cruz, Marisa / Frota, Sónia:
"On the relation between intonational phrasing and pitch accent distribution. Evidence from European Portuguese varieties",
300-304.
Nemoto, Rena / Adda-Decker, Martine:
"How are word-final schwas different in the north and south of France?",
305-309.
Ashby, Simone / Barbosa, Sílvia / Silva, Catarina / Fumo, Paulino / Ferreira, José Pedro:
"Modeling postcolonial language varieties: challenges and lessons learned from Mozambican Portuguese",
310-314.
Sahkai, Heete / Kalvik, Mari-Liis / Mihkla, Meelis:
"Prosody of contrastive focus in Estonian",
315-319.
Kisler, Thomas / Reichel, Uwe D.:
"Exploring the connection of acoustic and distinctive features",
320-324.
Cunha, Conceição / Harrington, Jonathan / Hoole, Phil:
"A physiological analysis of the tense/lax vowel contrast in two varieties of German",
325-329.
Meister, Einar / Meister, Lya:
"Production of Estonian quantity contrasts by native speakers of Finnish",
330-334.
Meynadier, Yohann / Gaydina, Yulia:
"Aerodynamic and durational cues of phonological voicing in whisper",
335-339.
Reichel, Uwe D.:
"Information theoretic syllable structure and its relation to the c-center effect",
340-344.
Andreeva, Bistra / Barry, William / Koreman, Jacques:
"The Bulgarian stressed and unstressed vowel system. A corpus study",
345-348.
Speech Synthesis I, II
Prom-on, Santitham / Birkholz, Peter / Xu, Yi:
"Training an articulatory synthesizer with continuous acoustic data",
349-353.
Kiss, Géza / Santen, Jan P. H. van:
"Estimating speaker-specific intonation patterns using the linear alignment model",
354-358.
Sung, June Sig / Hong, Doo Hwa / Koo, Hyun Woo / Kim, Nam Soo:
"Factored maximum likelihood kernelized regression for HMM-based singing voice synthesis",
359-363.
Takamichi, Shinnosuke / Toda, Tomoki / Shiga, Yoshinori / Sakti, Sakriani / Neubig, Graham / Nakamura, Satoshi:
"Improvements to HMM-based speech synthesis based on parameter generation with rich context models",
364-368.
Nakashika, Toru / Takashima, Ryoichi / Takiguchi, Tetsuya / Ariki, Yasuo:
"Voice conversion in high-order eigen space using deep belief nets",
369-372.
Silén, Hanna / Nurminen, Jani / Helander, Elina / Gabbouj, Moncef:
"Voice conversion for non-parallel datasets using dynamic kernel partial least squares regression",
373-377.
Nose, Takashi / Kanemoto, Misa / Koriyama, Tomoki / Kobayashi, Takao:
"A style control technique for singing voice synthesis based on multiple-regression HSMM",
378-382.
Hinterleitner, Florian / Norrenbrock, Christoph R. / Möller, Sebastian / Heute, Ulrich:
"Predicting the quality of text-to-speech systems from a large-scale feature set",
383-387.
Nurminen, Jani / Silén, Hanna / Gabbouj, Moncef:
"Speaker-specific retraining for enhanced compression of unit selection text-to-speech databases",
388-391.
Huckvale, Mark / Leff, Julian / Williams, Geoff:
"Avatar therapy: an audio-visual dialogue system for treating auditory hallucinations",
392-396.
Muthukumar, Prasanna Kumar / Black, Alan W. / Bunnell, H. Timothy:
"Optimizations and fitting procedures for the Liljencrants-Fant model for statistical parametric speech synthesis",
397-401.
Hovy, Dirk / Anumanchipalli, Gopala Krishna / Parlikar, Alok / Vaughn, Caroline / Lammert, Adam / Hovy, Eduard / Black, Alan W.:
"Analysis and modeling of “focus” in context",
402-406.
Ishihara, Tatsuma / Kameoka, Hirokazu / Yoshizato, Kota / Saito, Daisuke / Sagayama, Shigeki:
"Probabilistic speech F0 contour model incorporating statistical vocabulary model of phrase-accent command sequence",
1017-1021.
McLoughlin, Ian Vince / Li, Jingjie / Song, Yan:
"Reconstruction of continuous voiced speech from whispers",
1022-1026.
Niekerk, Daniel R. van / Barnard, Etienne:
"Generating fundamental frequency contours for speech synthesis in Yorùbá",
1027-1031.
Azarov, Elias / Vashkevich, Maxim / Likhachov, Denis / Petrovsky, Alexander:
"Real-time voice conversion using artificial neural networks with rectified linear units",
1032-1036.
Krityakien, Oraphan / Hirose, Keikichi / Minematsu, Nobuaki:
"Generation of fundamental frequency contours for Thai speech synthesis using tone nucleus model",
1037-1041.
Chen, Langzhou / Braunschweiler, Norbert:
"Unsupervised speaker and expression factorization for multi-speaker expressive synthesis of ebooks",
1042-1046.
Nakajima, Hideharu / Mizuno, Hideyuki / Yoshioka, Osamu / Takahashi, Satoshi:
"Which resemblance is useful to predict phrase boundary rise labels for Japanese expressive text-to-speech synthesis, numerically-expressed stylistic or distribution-based semantic?",
1047-1051.
Ni, Jinfu / Shiga, Yoshinori / Hori, Chiori / Kidawara, Yutaka:
"A targets-based superpositional model of fundamental frequency contours applied to HMM-based speech synthesis",
1052-1056.
Kobayashi, Kazuhiro / Doi, Hironori / Toda, Tomoki / Nakano, Tomoyasu / Goto, Masataka / Neubig, Graham / Sakti, Sakriani / Nakamura, Satoshi:
"An investigation of acoustic features for singing voice conversion based on perceptual age",
1057-1061.
Bollepalli, Bajibabu / Raitio, Tuomo / Alku, Paavo:
"Effect of MPEG audio compression on HMM-based speech synthesis",
1062-1066.
Doi, Hironori / Toda, Tomoki / Nakano, Tomoyasu / Goto, Masataka / Nakamura, Satoshi:
"Evaluation of a singing voice conversion method based on many-to-many eigenvoice conversion",
1067-1071.
Koriyama, Tomoki / Nose, Takashi / Kobayashi, Takao:
"Statistical nonparametric speech synthesis using sparse Gaussian processes",
1072-1076.
Mohammadi, Amir / Demiroglu, Cenk:
"Hybrid nearest-neighbor/cluster adaptive training for rapid speaker adaptation in statistical speech synthesis systems",
1077-1081.
Cabral, João P.:
"Uniform concatenative excitation model for synthesising speech without voiced/unvoiced classification",
1082-1086.
Perception, Dialectal Differences
Tran, Thi Anh Xuan / Nguyen, Viet Son / Castelli, Eric / Carré, René:
"Production and perception of pseudo-V1CV2 outside the vowel triangle: speech illusion effects",
407-411.
Candea, Maria / Adda-Decker, Martine / Lamel, Lori:
"Recent evolution of non-standard consonantal variants in French broadcast news",
412-416.
Zimmerer, Frank / Yasuda, Rei / Reetz, Henning:
"Architekt or archtekt? Perception of devoiced vowels produced by Japanese speakers of German",
417-420.
Plummer, Andrew R. / Ménard, Lucie / Munson, Benjamin / Beckman, Mary E.:
"Comparing vowel category response surfaces over age-varying maximal vowel spaces within and across language communities",
421-425.
Babel, Molly / McGuire, Grant:
"Perceived vocal attractiveness across dialects is similar but not uniform",
426-430.
Wang, Hongyan / Heuven, Vincent J. van:
"Mutual intelligibility of American, Chinese and Dutch-accented speakers of English tested by SUS and SPIN sentences",
431-435.
Speech Enhancement — Single Channel
Lu, Xugang / Tsao, Yu / Matsuda, Shigeki / Hori, Chiori:
"Speech enhancement based on deep denoising autoencoder",
436-440.
Saruwatari, Hiroshi / Kanehara, Suzumi / Miyazaki, Ryoichi / Shikano, Kiyohiro / Kondo, Kazunobu:
"Musical noise analysis for Bayesian minimum mean-square error speech amplitude estimators based on higher-order statistics",
441-445.
Lyubimov, Nikolay / Kotov, Mikhail:
"Non-negative matrix factorization with linear constraints for single-channel speech enhancement",
446-450.
Tseng, Hung-Wei / Vishnubhotla, Srikanth / Hong, Mingyi / Wang, Xiangfeng / Xiao, Jinjun / Luo, Zhi-Quan / Zhang, Tao:
"A single channel speech enhancement approach by combining statistical criterion and multi-frame sparse dictionary learning",
451-455.
Mirbagheri, Majid / Xu, Yanbo / Akram, Sahar / Shamma, Shihab:
"Speech enhancement using convolutive nonnegative matrix factorization with cosparsity regularization",
456-459.
McCallum, Matthew / Guillemin, Bernard:
"Joint stochastic-deterministic Wiener filtering with recursive Bayesian estimation of deterministic speech",
460-464.
Dialog Modeling
Knuuttila, Juho / Räsänen, Okko / Laine, Unto K.:
"Automatic self-supervised learning of associations between speech and text",
465-469.
Daubigney, Lucie / Geist, Matthieu / Pietquin, Olivier:
"Particle swarm optimisation of spoken dialogue system strategies",
470-474.
Lison, Pierre:
"Model-based Bayesian reinforcement learning for dialogue management",
475-479.
Ghigi, Fabrizio / Torres, María Inés / Justo, Raquel / Benedí, José-Miguel:
"Evaluating spoken dialogue models under the interactive pattern recognition framework",
480-484.
Chen, Yun-Nung / Metze, Florian:
"Multi-layer mutually reinforced random walk with hidden parameters for improved multi-party meeting summarization",
485-489.
Su, Pei-hao / Wang, Yow-Bang / Wen, Tsung-Hsien / Yu, Tien-han / Lee, Lin-shan:
"A recursive dialogue game framework with optimal policy offering personalized computer-assisted language learning",
490-494.
ASR — Lexical, Prosodic and Cross/Multi-Lingual
Hahn, Stefan / Lehnen, Patrick / Wiesler, Simon / Schlüter, Ralf / Ney, Hermann:
"Improving LVCSR with hidden conditional random fields for grapheme-to-phoneme conversion",
495-499.
Do, Van Hai / Xiao, Xiong / Chng, Eng Siong / Li, Haizhou:
"Context-dependent phone mapping for LVCSR of under-resourced languages",
500-504.
Rasipuram, Ramya / Magimai-Doss, Mathew:
"Improving grapheme-based ASR by probabilistic lexical modeling approach",
505-509.
Motlicek, Petr / Imseng, David / Garner, Philip N.:
"Crosslingual tandem-SGMM: exploiting out-of-language data for acoustic model and feature level adaptation",
510-514.
Vu, Ngoc Thang / Schultz, Tanja:
"Multilingual multilayer perceptron for rapid language adaptation between and across language families",
515-519.
Rosenberg, Andrew:
"Modeling prosodic sequences with k-means and Dirichlet process GMMs",
520-524.
Phonetic Convergence
Schweitzer, Antje / Lewandowski, Natalie:
"Convergence of articulation rate in spontaneous speech",
525-529.
Pardo, Jennifer S.:
"Phonetic convergence in shadowed speech: a comparison of perceptual and acoustic measures",
530-534.
Włodarczak, Marcin / Šimko, Juraj / Wagner, Petra:
"Pitch and duration as a basis for entrainment of overlapped speech onsets",
535-538.
Bonin, Francesca / Looze, Céline De / Ghosh, Sucheta / Gilmartin, Emer / Vogel, Carl / Polychroniou, Anna / Salamin, Hugues / Vinciarelli, Alessandro / Campbell, Nick:
"Investigating fine temporal dynamics of prosodic and lexical accommodation",
539-543.
Kim, Jeesun / Demirdjian, Ruben / Davis, Chris:
"Spontaneous and explicit speech imitation",
544-547.
Podlipský, Václav Jonáš / Šimáčková, Šárka / Chládková, Kateřina:
"Imitation interacts with one's second-language phonology but it does not operate cross-linguistically",
548-552.
Speech Production, Acquisition and Development I, II
Hsieh, Po-jen:
"Prosodic markings of semantic predictability in Taiwan Mandarin",
553-557.
Hoffmann, Rüdiger / Mehnert, Dieter / Dietzel, Rolf:
"How did it work? Historic phonetic devices explained by coeval photographs",
558-562.
Kohtz, Lea S. / Niebuhr, Oliver:
"Eliciting speech with sentence lists — a critical evaluation with special emphasis on segmental anchoring",
563-567.
Wang, Yuguang / Dang, Jianwu / Chen, Xi / Wei, Jianguo / Wang, Hongcui / Honda, Kiyoshi:
"An MRI-based acoustic study of Mandarin vowels",
568-571.
Hirst, Daniel:
"Melody metrics for prosodic typology: comparing English, French and Chinese",
572-576.
Proctor, Michael / Goldstein, Louis / Lammert, Adam / Byrd, Dani / Toutios, Asterios / Narayanan, Shrikanth:
"Velic coordination in French nasals: a real-time magnetic resonance imaging study",
577-581.
Huckvale, Mark / Sharma, Amrita:
"Learning to imitate adult speech with the KLAIR virtual infant",
582-586.
Lucero, Jorge C. / Schoentgen, Jean / Behlau, Mara:
"Physics-based synthesis of disordered voices",
587-591.
d'Apolito, Sonia / Fivela, Barbara Gili:
"Place assimilation and articulatory strategies: the case of sibilant sequences in French as L1 and L2",
592-596.
Samlowski, Barbara / Wagner, Petra / Möbius, Bernd:
"Effects of lexical class and lemma frequency on German homographs",
597-601.
Lancia, Leonardo / Avelino, Heriberto / Voigt, Daniel:
"Measuring laryngealization in running speech: interaction with contrastive tones in Yalálag Zapotec",
602-606.
Rusaw, Erin:
"A neural oscillator model of speech timing and rhythm",
607-611.
Wong, Nicole / Fu, Maojing / Liang, Zhi-Pei / Shosted, Ryan K. / Sutton, Bradley P.:
"Observations of perseverative coarticulation in lateral approximants using MRI",
612-616.
Fujimoto, Masako / Kitamura, Tatsuya / Hatano, Hiroaki / Fujimoto, Ichiro:
"Timing differences in articulation between voiced and voiceless stop consonants: an analysis of cine-MRI data",
955-958.
Lammert, Adam / Ramanarayanan, Vikram / Proctor, Michael / Narayanan, Shrikanth:
"Vocal tract cross-distance estimation from real-time MRI using region-of-interest analysis",
959-962.
Arrabothu, Apoorv Reddy / Chennupati, Nivedita / Yegnanarayana, B.:
"Syllable nuclei detection using perceptually significant features",
963-967.
Hsieh, Fang-Ying / Goldstein, Louis / Byrd, Dani / Narayanan, Shrikanth:
"Truncation of pharyngeal gesture in English diphthong [aɪ]",
968-972.
Yang, Zhaojun / Ramanarayanan, Vikram / Byrd, Dani / Narayanan, Shrikanth:
"The effect of word frequency and lexical class on articulatory-acoustic coupling",
973-977.
Yamakawa, Kimiko / Amano, Shigeaki:
"Discrimination between fricative and affricate in Japanese using time and spectral domain variables",
978-981.
Drozdova, Polina / Cucchiarini, Catia / Strik, Helmer:
"L2 syntax acquisition: the effect of oral and written computer assisted practice",
982-986.
Signorello, Rosario / Demolin, Didier:
"The physiological use of the charismatic voice in political speech",
987-991.
Rose, Ralph L.:
"Crosslinguistic corpus of hesitation phenomena: a corpus for investigating first and second language speech performance",
992-996.
Preuß, Simon / Neuschaefer-Rube, Christiane / Birkholz, Peter:
"Real-time control of a 2D animation model of the vocal tract using optopalatography",
997-1001.
Siddins, Jessica / Harrington, Jonathan / Kleber, Felicitas / Reubold, Ulrich:
"The influence of accentuation and polysyllabicity on compensatory shortening in German",
1002-1006.
Ding, Hongwei / Hoffmann, Rüdiger:
"An investigation of vowel epenthesis in Chinese learners' production of German consonants",
1007-1011.
Richmond, Korin / Ling, Zhen-Hua / Yamagishi, Junichi / Uría, Benigno:
"On the evaluation of inversion mapping performance in the acoustic domain",
1012-1016.
General Topics in ASR
Gupta, Vishwa / Boulianne, Gilles:
"Comparing computation in Gaussian mixture and neural network based large-vocabulary speech recognition",
617-621.
Stein, Daniel / Schwenninger, Jochen / Stadtschnitzer, Michael:
"Simultaneous perturbation stochastic approximation for automatic speech recognition",
622-626.
Sheffield, David / Anderson, Michael / Lee, Yunsup / Keutzer, Kurt:
"Hardware/software codesign for mobile speech recognition",
627-631.
Shi, Yangyang / Larson, Martha / Wiggers, Pascal / Jonker, Catholijn M.:
"Exploiting the succeeding words in recurrent neural network language models",
632-636.
Torbati, Amir Hossein Harati Nejad / Picone, Joseph / Sobel, Marc:
"Speech acoustic unit segmentation using hierarchical Dirichlet processes",
637-641.
Georges, Munir / Kanthak, Stephan / Klakow, Dietrich:
"Transducer-based speech recognition with dynamic language models",
642-646.
Kubo, Yotaro / Hori, Takaaki / Nakamura, Atsushi:
"A method for structure estimation of weighted finite-state transducers and its application to grapheme-to-phoneme conversion",
647-651.
Jouvet, Denis / Fohr, Dominique:
"Combining forward-based and backward-based decoders for improved speech recognition performance",
652-656.
Siohan, Olivier / Bacchiani, Michiel:
"ivector-based acoustic data selection",
657-661.
Lei, Xin / Senior, Andrew / Gruenstein, Alexander / Sorensen, Jeffrey:
"Accurate and compact large vocabulary speech recognition on mobile devices",
662-665.
Allauzen, Cyril / Riley, Michael:
"Pre-initialized composition for large-vocabulary speech recognition",
666-670.
Kurniawati, Evelyn / George, Sapna:
"Speaker dependent activation keyword detector based on GMM-UBM",
671-674.
Sak, Haşim / Sung, Yun-hsuan / Beaufays, Françoise / Allauzen, Cyril:
"Written-domain language modeling for automatic speech recognition",
675-679.
Voice Activity Detection and Speech Segmentation
Versteegh, Maarten / Bosch, Louis ten:
"Detecting words in speech using linear separability in a bag-of-events vector space",
680-684.
Burlick, Matt / Dimitriadis, Dimitrios / Zavesky, Eric:
"On the improvement of multimodal voice activity detection",
685-689.
Geiger, Jürgen T. / Eyben, Florian / Evans, Nicholas / Schuller, Björn / Rigoll, Gerhard:
"Using linguistic information to detect overlapping speech",
690-694.
Ye, Jiaxing / Kobayashi, Takumi / Murakawa, Masahiro / Higuchi, Tetsuya:
"Incremental acoustic subspace learning for voice activity detection using harmonicity-based features",
695-699.
Chung, Hoon / Lee, SungJoo / Lee, YunKeun:
"Endpoint detection using weighted finite state transducer",
700-703.
Segbroeck, Maarten Van / Tsiartas, Andreas / Narayanan, Shrikanth:
"A robust frontend for VAD: exploiting contextual, discriminative and spectral cues of human voice",
704-708.
Graciarena, Martin / Alwan, Abeer / Ellis, Dan / Franco, Horacio / Ferrer, Luciana / Hansen, John H. L. / Janin, Adam / Lee, Byung-Suk / Lei, Yun / Mitra, Vikramjit / Morgan, Nelson / Sadjadi, Seyed Omid / Tsai, T. J. / Scheffer, Nicolas / Tan, Lee Ngee / Williams, Benjamin:
"All for one: feature combination for highly channel-degraded speech activity detection",
709-713.
Coz, Maxime Le / Pinquier, Julien / André-Obrecht, Régine:
"Superposed speech localisation using frequency tracking",
714-717.
Tsiartas, Andreas / Chaspari, Theodora / Katsamanis, Nassos / Ghosh, Prasanta Kumar / Li, Ming / Segbroeck, Maarten Van / Potamianos, Alexandros / Narayanan, Shrikanth:
"Multi-band long-term signal variability features for robust voice activity detection",
718-722.
Lezzoum, Narimene / Gagnon, Ghyslain / Voix, Jérémie:
"A low-complexity voice activity detector for smart hearing protection of hyperacusic persons",
723-727.
Ryant, Neville / Liberman, Mark / Yuan, Jiahong:
"Speech activity detection on YouTube using deep neural networks",
728-731.
Germain, François G. / Sun, Dennis L. / Mysore, Gautham J.:
"Speaker and noise independent voice activity detection",
732-736.
Tsai, T. J. / Janin, Adam:
"Confidence-based scoring: a useful diagnostic tool for detection tasks",
737-741.
Kanai, Yasuaki / Morita, Shota / Unoki, Masashi:
"Concurrent processing of voice activity detection and noise reduction using empirical mode decomposition and modulation spectrum analysis",
742-746.
Show and Tell Sessions 1-3
Al Moubayed, Samer / Beskow, Jonas / Skantze, Gabriel:
"The Furhat social companion talking head",
747-749.
Gelin, Rodolphe / Barbieri, Gabriele:
"Audition: the most important sense for humanoid robots?",
750-751.
Hueber, Thomas:
"Ultraspeech-player: intuitive visualization of ultrasound articulatory data for speech therapy and pronunciation training",
752-753.
Oh, Jieun / Wang, Ge:
"Laughter modulation: from speech to speech-laugh",
754-755.
Bikel, Daniel M. / Hall, Keith B.:
"ReFr: an open-source reranker framework",
756-758.
Sosi, Alessandro / Brugnara, Fabio / Cristoforetti, Luca / Matassoni, Marco / Ravanelli, Mirco / Omologo, Maurizio:
"Embedding speech recognition to control lights",
759-760.
Meltzner, Geoffrey S. / Heaton, James T. / Deng, Yunbin:
"The MUTE silent speech recognition system",
761-763.
Scobbie, James M. / Turk, Alice / Geng, Christian / King, Simon / Lickley, Robin / Richmond, Korin:
"The Edinburgh speech production facility doubletalk corpus",
764-766.
Sityaev, Dmitry / Hotz, Jonathan / Snitkovsky, Vadim:
"Lexee: a cloud-based platform for building and deploying voice-enabled mobile applications",
767-769.
Ouni, Slim:
"Visualizing articulatory data with VisArtico",
770-772.
Soury, Mariette / Gossart, Clément / Adda-Decker, Martine / Devillers, Laurence:
"A tool to elicit and collect multicultural and multimodal laughter",
773-774.
Schleicher, Robert / Westermann, Tilo / Li, Jinjin / Lawitschka, Moritz / Mateev, Benjamin / Reichmuth, Ralf / Möller, Sebastian:
"Design of a mobile app for Interspeech conferences: towards an open tool for the spoken language community",
775-777.
Metze, Florian / Fosler-Lussier, Eric / Bates, Rebecca:
"The speech recognition virtual kitchen",
1858-1860.
Chen, John / Wen, Shufei / Sridhar, Vivek Kumar Rangarajan / Bangalore, Srinivas:
"Multilingual web conferencing using speech-to-speech translation",
1861-1863.
Ferragne, Emmanuel / Flavier, Sébastien / Fressard, Christian:
"ROCme! software for the recording and management of speech corpora",
1864-1865.
Burkhardt, Felix:
"Voice search in mobile applications with the rootvole framework",
1866-1868.
Novak III, John S. / Archer, Jason / Shafiro, Valeriy / Kenyon, Robert / Leigh, Jason:
"On-line audio dilation for human interaction",
1869-1871.
Mowlaee, Pejman / Watanabe, Mario Kaoru / Saeidi, R.:
"Phase-aware single-channel speech enhancement",
1872-1874.
Hirano, Hiroko / Nakamura, Ibuki / Minematsu, Nobuaki / Suzuki, Masayuki / Nakagawa, Chieko / Nakamura, Noriko / Tagawa, Yukinori / Hirose, Keikichi / Hashimoto, Hiroya:
"A free online accent and intonation dictionary for teachers and learners of Japanese",
1875-1876.
Astrinaki, Maria / Yamagishi, Junichi / King, Simon / d'Alessandro, Nicolas / Dutoit, Thierry:
"Reactive accent interpolation through an interactive map application",
1877-1878.
Berkling, Kay:
"A non-experts user interface for obtaining automatic diagnostic spelling evaluations for learners of the German writing system",
1879-1881.
Clark, Robert A. J.:
"Simple4All",
2654-2656.
Pointeau, Grégoire / Petit, Maxime / Hinaut, Xavier / Gibert, Guillaume / Dominey, Peter Ford:
"On-line learning of lexical items and grammatical constructions via speech, gaze and action-based human-robot interaction",
2657-2659.
Miyakoda, Haruko:
"Development of a pronunciation training system based on auditory-visual elements",
2660-2661.
Azarov, Elias / Vashkevich, Maxim / Likhachov, Denis / Petrovsky, Alexander:
"Real-time and non-real-time voice conversion systems with web interfaces",
2662-2663.
Csala, E. / Németh, G. / Zainkó, Cs.:
"Application of the NAO humanoid robot in the treatment of bone marrow-transplanted children (demo)",
2664-2666.
Wan, Vincent / Anderson, Robert / Blokland, Art / Braunschweiler, Norbert / Chen, Langzhou / Kolluru, BalaKrishna / Latorre, Javier / Maia, Ranniery / Stenger, Björn / Yanagisawa, Kayoko / Stylianou, Yannis / Akamine, Masami / Gales, M. J. F. / Cipolla, Roberto:
"Photo-realistic expressive text to talking head synthesis",
2667-2669.
Maddieson, Ian / Flavier, Sébastien / Marsico, Egidio / Pellegrino, François:
"Demonstration of LAPSyD: Lyon-Albuquerque phonological systems database",
2670-2671.
Boyce, Suzanne / Speights, Marisha / Ishikawa, Keiko / MacAuslan, Joel:
"Speechmark acoustic landmark tool: application to voice pathology",
2672-2674.
Catanese, Laurence / Souviraà-Labastie, Nathan / Qu, Bingqing / Campion, Sebastien / Gravier, Guillaume / Vincent, Emmanuel / Bimbot, Frédéric:
"MODIS: an audio motif discovery software",
2675-2677.
Discourse, Intonation, Prosody
Eriksson, Anders / Barbosa, Plínio A. / Åkesson, Joel:
"The acoustics of word stress in Swedish: a function of stress level, speaking style and word accent",
778-782.
Michelas, Amandine / Portes, Cristel / Champagne-Lavau, Maud:
"Intonational contrasts encode speaker's certainty in neutral vs. incredulity declarative questions in French",
783-787.
Ishimoto, Yuichi / Enomoto, Mika / Iida, Hitoshi:
"Prosodic changes pre-announcing a syntactic completion point in Japanese utterance",
788-792.
Simard, Candide:
"Prosodic encoding of declarative, interrogative and imperative sentences in Jaminjung, a language of Australia",
793-797.
Vullinghs, Anne / Goudbeek, Martijn / Krahmer, Emiel:
"Crosslinguistic priming in interactive reference: evidence for conceptual alignment in speech production",
798-802.
Kousidis, Spyros / Schlangen, David / Skopeteas, Stavros:
"A cross-linguistic study on turn-taking and temporal alignment in verbal interaction",
803-807.
Source Separation
Grais, Emad M. / Erdogan, Hakan:
"Discriminative nonnegative dictionary learning using cross-coherence penalties for single channel source separation",
808-812.
Kim, Han-Gyu / Jang, Gil-Jin / Park, Jeong-Sik / Oh, Yung-Hwan:
"Monaural speech segregation based on pitch track correction using an ensemble Kalman filter",
813-816.
Tran, Thuy N. / Cowley, William / Pollok, André:
"Voice activity classification for automatic bi-speaker adaptive beamforming in speech separation",
817-821.
Kinoshita, Keisuke / Souden, Mehrez / Nakatani, Tomohiro:
"Blind source separation using spatially distributed microphones based on microphone-location dependent source activities",
822-826.
Barker, Tom / Virtanen, Tuomas:
"Non-negative tensor factorisation of modulation spectrograms for monaural sound source separation",
827-831.
Watanabe, Mario Kaoru / Mowlaee, Pejman:
"Iterative sinusoidal-based partial phase reconstruction in single-channel source separation",
832-836.
Paralinguistic Information I, II
Yao, Xiao / Jitsuhiro, Takatoshi / Miyajima, Chiyomi / Kitaoka, Norihide / Takeda, Kazuya:
"Classification of speech under stress by modeling the aerodynamics of the laryngeal ventricle",
837-841.
Rakov, Rachel / Rosenberg, Andrew:
"“Sure, I did the right thing”: a system for sarcasm detection in speech",
842-846.
Scherer, Stefan / Stratou, Giota / Gratch, Jonathan / Morency, Louis-Philippe:
"Investigating voice quality as a speaker-independent indicator of depression and PTSD",
847-851.
Pellegrini, Thomas / Hämäläinen, Annika / Mareüil, Philippe Boula de / Tjalve, Michael / Trancoso, Isabel / Candeias, Sara / Dias, Miguel Sales / Braga, Daniela:
"A corpus-based study of elderly and young speakers of European Portuguese: acoustic correlates and their impact on speech recognition performance",
852-856.
Cummins, Nicholas / Epps, Julien / Sethu, Vidhyasaharan / Breakspear, Michael / Goecke, Roland:
"Modeling spectral variability for the classification of depressed speech",
857-861.
Pérez-Rosas, Verónica / Mihalcea, Rada:
"Sentiment analysis of online spoken reviews",
862-866.
Shepstone, Sven Ewan / Tan, Zheng-Hua / Jensen, Søren Holdt:
"Demographic recommendation by means of group profile elicitation using speaker age and gender recognition",
2827-2831.
Malandrakis, Nikolaos / Sundaram, Shiva / Potamianos, Alexandros:
"Affective classification of generic audio clips using regression models",
2832-2836.
Jeon, Je Hun / Le, Duc / Xia, Rui / Liu, Yang:
"A preliminary study of cross-lingual emotion recognition from speech: automatic classification versus human perception",
2837-2840.
Han, Wenjing / Li, Haifeng / Ruan, Huabin / Ma, Lin / Sun, Jiayin / Schuller, Björn:
"Active learning for dimensional speech emotion recognition",
2841-2845.
Kelly, Finnian / Harte, Naomi:
"Auditory detectability of vocal ageing and its effect on forensic automatic speaker recognition",
2846-2850.
Alam, Firoj / Riccardi, G.:
"Comparative study of speaker personality traits recognition in conversational and broadcast news speech",
2851-2855.
Zhang, Zixing / Deng, Jun / Marchi, Erik / Schuller, Björn:
"Active learning by label uncertainty for acoustic emotion recognition",
2856-2860.
Xiao, Bo / Georgiou, Panayiotis G. / Imel, Zac E. / Atkins, David C. / Narayanan, Shrikanth:
"Modeling therapist empathy and vocal entrainment in drug addiction counseling",
2861-2865.
Miyazaki, Chiaki / Higashinaka, Ryuichiro / Makino, Toshiro / Matsuo, Yoshihiro:
"Estimating callers' levels of knowledge in call center dialogues",
2866-2870.
Arias, Juan Pablo / Busso, Carlos / Yoma, Néstor Becerra:
"Energy and F0 contour modeling with functional data analysis for emotional speech detection",
2871-2875.
Mishra, Taniya / Dimitriadis, Dimitrios:
"Incremental emotion recognition",
2876-2880.
Hanilçi, Cemal / Kinnunen, Tomi / Rajan, Padmanabhan / Pohjalainen, Jouni / Alku, Paavo / Ertaş, Figen:
"Comparison of spectrum estimators in speaker verification: mismatch conditions induced by vocal effort",
2881-2885.
Xia, Rui / Liu, Yang:
"Using denoising autoencoder for emotion recognition",
2886-2889.
ASR — Robustness Against Noise I-III
Abdelaziz, Ahmed Hussen / Zeiler, Steffen / Kolossa, Dorothea:
"Using twin-HMM-based audio-visual speech enhancement as a front-end for robust audio-visual speech recognition",
867-871.
Gibson, James / Segbroeck, Maarten Van / Ortega, Antonio / Georgiou, Panayiotis G. / Narayanan, Shrikanth:
"Spectro-temporal directional derivative features for automatic speech recognition",
872-875.
Xiao, Xiong / Chng, Eng Siong / Li, Haizhou:
"Attribute-based histogram equalization (HEQ) and its adaptation for robust speech recognition",
876-880.
Joshi, Vikas / Prasad, N. Vishnu / Umesh, S.:
"Modified cepstral mean normalization — transforming to utterance specific non-zero mean",
881-885.
Mitra, Vikramjit / Franco, Horacio / Graciarena, Martin:
"Damped oscillator cepstral coefficients for robust speech recognition",
886-890.
Alam, Md. Jahangir / Kenny, Patrick / O'Shaughnessy, Douglas:
"Regularized MVDR spectrum estimation-based robust feature extractors for speech recognition",
891-895.
Lu, Liang / Ghoshal, Arnab / Renals, Steve:
"Noise adaptive training for subspace Gaussian mixture models",
3492-3496.
Saon, George / Thomas, Samuel / Soltau, Hagen / Ganapathy, Sriram / Kingsbury, Brian:
"The IBM speech activity detection system for the DARPA RATS program",
3497-3501.
Sehr, Armin / Yoshioka, Takuya / Delcroix, Marc / Kinoshita, Keisuke / Nakatani, Tomohiro / Maas, Roland / Kellermann, Walter:
"Conditional emission densities for combining speech enhancement and recognition systems",
3502-3506.
Wolf, Martin / Nadeu, Climent:
"Channel selection using n-best hypothesis for multi-microphone ASR",
3507-3511.
Ishii, Takaaki / Komiyama, Hiroki / Shinozaki, Takahiro / Horiuchi, Yasuo / Kuroiwa, Shingo:
"Reverberant speech recognition based on denoising autoencoder",
3512-3516.
Maymon, Shay / Dognin, Pierre / Cui, Xiaodong / Goel, Vaibhava:
"Adaptive stereo-based stochastic mapping",
3517-3521.
Kao, Yu-Chen / Chen, Berlin:
"Distribution-based feature normalization for robust speech recognition leveraging context and dynamics cues",
2958-2962.
Liu, Shilin / Sim, Khe Chai:
"An investigation of temporally varying weight regression for noise robust speech recognition",
2963-2967.
Li, Yang / Liu, Xunying / Wang, Lan:
"Feature space generalized variable parameter HMMs for noise robust recognition",
2968-2972.
Brakel, Philémon / Stroobandt, Dirk / Schrauwen, Benjamin:
"Bidirectional truncated recurrent neural networks for efficient speech denoising",
2973-2977.
Variani, Ehsan / Li, Feipeng / Hermansky, Hynek:
"Multi-stream recognition of noisy speech with performance monitoring",
2978-2981.
Fujimoto, Masakiyo / Nakatani, Tomohiro:
"Model-based noise suppression using unsupervised estimation of hidden Markov model for non-stationary noise",
2982-2986.
Nathwani, Karan / Hegde, Rajesh M.:
"Joint noise cancellation and dereverberation using multi-channel linearly constrained minimum variance filter",
2987-2991.
Delcroix, Marc / Kubo, Yotaro / Nakatani, Tomohiro / Nakamura, Atsushi:
"Is speech enhancement pre-processing still relevant when using deep neural networks for acoustic modeling?",
2992-2996.
Hsieh, Hsin-Ju / Chen, Berlin / Hung, Jeih-weih:
"Histogram equalization of real and imaginary modulation spectra for noise-robust speech recognition",
2997-3001.
Li, Bo / Tsao, Yu / Sim, Khe Chai:
"An investigation of spectral restoration algorithms for deep neural networks based noise robust speech recognition",
3002-3006.
Remes, Ulpu:
"Bounded conditional mean imputation with an approximate posterior",
3007-3011.
Cui, Xiaodong / Goel, Vaibhava / Kingsbury, Brian:
"Mixtures of Bayesian joint factor analyzers for noise robust automatic speech recognition",
3012-3016.
Liu, Gang / Dimitriadis, Dimitrios / Bocchieri, Enrico:
"Robust speech enhancement techniques for ASR in non-stationary noise and dynamic environments",
3017-3021.
Neural Basis of Speech Perception
Poblete, Víctor / Yoma, Néstor Becerra / Stern, Richard M.:
"Optimization of sigmoidal rate-level function based on acoustic features",
896-900.
Sadakata, Makiko / Spyrou, Loukianos / Shingai, Mizuki / Sekiyama, Kaoru:
"Composing auditory ERPs: cross-linguistic comparison of auditory change complex for Japanese fricative consonants",
901-905.
Bedoin, Nathalie / Krzonowski, Jennifer / Ferragne, Emmanuel:
"How voicing, place and manner of articulation differently modulate event-related potentials associated with response inhibition",
906-910.
Bellier, Ludovic / Mazzuca, Michel / Thai-Van, Hung / Caclin, Anne / Laboissière, Rafael:
"Categorization of speech in early auditory evoked responses",
911-915.
Manca, Anna Dora / Grimaldi, Mirko:
"Perception and production of Italian vowels: an ERP study",
916-920.
Grohe, Ann-Kathrin / Braun, Bettina:
"Implicit learning leads to familiarity effects for intonation but not for voice",
921-924.
Spoofing and Countermeasures for Automatic Speaker Verification (Special Session)
Evans, Nicholas / Kinnunen, Tomi / Yamagishi, Junichi:
"Spoofing and countermeasures for automatic speaker verification",
925-929.
Hautamäki, Rosa González / Kinnunen, Tomi / Hautamäki, Ville / Leino, Timo / Laukkanen, Anne-Maria:
"I-vectors meet imitators: on vulnerability of speaker verification systems against voice mimicry",
930-934.
Gomez-Barrero, Marta / Gonzalez-Dominguez, Javier / Galbally, Javier / Gonzalez-Rodriguez, Joaquin:
"Security evaluation of i-vector based speaker verification systems against hill-climbing attacks",
935-939.
Alegre, Federico / Vipperla, Ravichander / Amehraye, Asmaa / Evans, Nicholas:
"A new speaker verification spoofing countermeasure based on local binary patterns",
940-944.
Kons, Zvi / Aronowitz, Hagai:
"Voice transformation-based spoofing of text-dependent speaker verification systems",
945-949.
Wu, Zhizheng / Larcher, Anthony / Lee, Kong Aik / Chng, Eng Siong / Kinnunen, Tomi / Li, Haizhou:
"Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints",
950-954.
Metadata, Evaluation and Resources I, II
Sperber, Matthias / Neubig, Graham / Fügen, Christian / Nakamura, Satoshi / Waibel, Alex:
"Efficient speech transcription through respeaking",
1087-1091.
Kim, Samuel / Georgiou, Panayiotis G. / Narayanan, Shrikanth:
"Annotation and classification of political advertisements",
1092-1096.
Higashinaka, Ryuichiro / Dohsaka, Kohji / Isozaki, Hideki:
"Using role play for collecting question-answer pairs for dialogue agents",
1097-1100.
Arimoto, Yoshiko / Okanoya, Kazuo:
"Individual differences of emotional expression in speaker's behavioral and autonomic responses",
1101-1105.
Wechsung, Ina / Weiss, Benjamin / Kühnel, Christine / Ehrenbrink, Patrick / Möller, Sebastian:
"Development and validation of the conversational agents scale (CAS)",
1106-1110.
Riccardi, G. / Ghosh, A. / Chowdhury, S. A. / Bayer, Ali Orkan:
"Motivational feedback in crowdsourcing: a case study in speech transcription",
1111-1115.
Fox, Charles / Liu, Yulan / Zwyssig, Erich / Hain, Thomas:
"The Sheffield wargames corpus",
1116-1120.
Kumar, Anuj / Metze, Florian / Wang, Wenyi / Kam, Matthew:
"Formalizing expert knowledge for developing accurate speech recognizers",
1121-1125.
Al Moubayed, Samer / Edlund, Jens / Gustafson, Joakim:
"Analysis of gaze and speech patterns in three-party quiz game interaction",
1126-1130.
Galibert, Olivier:
"Methodologies for the evaluation of speaker diarization and automatic speech recognition in the presence of overlapping speech",
1131-1134.
Sangwan, Abhijeet / Kaushik, Lakshmish / Yu, Chengzhu / Hansen, John H. L. / Oard, Douglas W.:
"'Houston, we have a solution': using NASA Apollo program to advance speech and language processing technology",
1135-1139.
Matoušek, Jindřich / Tihelka, Daniel:
"Annotation errors detection in TTS corpora",
1511-1515.
Ahmed, Imran / Kopparapu, Sunil Kumar:
"Technique for automatic sentence level alignment of long speech and transcripts",
1516-1519.
Hoffmann, Sarah / Pfister, Beat:
"Text-to-speech alignment of long recordings using universal phone models",
1520-1524.
Stan, Adriana / Bell, Peter / Yamagishi, Junichi / King, Simon:
"Lightly supervised discriminative training of grapheme models for improved sentence-level alignment of speech and text data",
1525-1529.
Sapru, Ashtosh / Bourlard, Hervé:
"Automatic social role recognition in professional meetings using conditional random fields",
1530-1534.
Draxler, Christoph / Feiser, Hanna S.:
"Same same but different — an acoustical comparison of the automatic segmentation of high quality and mobile telephone speech",
1535-1539.
Speech Technology for Speech and Hearing Disorders I, II
Hofe, Robin / Bai, Jie / Cheah, Lam A. / Ell, Stephen R. / Gilbert, James M. / Moore, Roger K. / Green, Phil D.:
"Performance of the MVOCA silent speech interface across multiple speakers",
1140-1143.
Andrade-Miranda, Gustavo / Godino-Llorente, Juan Ignacio:
"Automatic glottal tracking from high-speed digital images using a continuous normalized cross correlation",
1144-1148.
Bocklet, Tobias / Steidl, Stefan / Nöth, Elmar / Skodda, Sabine:
"Automatic evaluation of Parkinson's speech — acoustic, prosodic and voice related cues",
1149-1153.
Orosanu, Luiza / Jouvet, Denis:
"Comparison of approaches for an efficient phonetic decoding",
1154-1158.
Christensen, H. / Green, Phil D. / Hain, Thomas:
"Learning speaker-specific pronunciations of disordered speech",
1159-1163.
López-Ludeña, V. / San-Segundo, R. / González-Morcillo, C. / López, J. C. / Ferreiro, E.:
"Adapting a speech into sign language translation system to a new domain",
1164-1168.
Vaerenberg, Bart / Bosch, Louis ten / Kowalczyk, Wojtek / Coene, Martine / Smet, Herwig De / Govaerts, Paul J.:
"Language-universal speech audiometry with automated scoring",
3608-3612.
Hammer, Annemiek / Vaerenberg, Bart / Kowalczyk, Wojtek / Bosch, Louis ten / Coene, Martine / Govaerts, Paul J.:
"Balancing word lists in speech audiometry through large spoken language corpora",
3613-3616.
López-Ludeña, V. / San-Segundo, R. / Ferreiros, J. / Pardo, J. M. / Ferreiro, E.:
"Developing an information system for deaf",
3617-3621.
Kim, Myung Jong / Yoo, Joohong / Kim, Hoirin:
"Dysarthric speech recognition using dysarthria-severity-dependent and speaker-adaptive models",
3622-3626.
Muhammad, Ghulam / Melhem, Moutasem:
"Voice pathology detection and classification using MPEG-7 audio low-level features",
3627-3631.
Kacha, Abdellah / Grenez, Francis / Schoentgen, Jean:
"Empirical mode decomposition-based spectral acoustic cues for disordered voices analysis",
3632-3636.
Aihara, Ryo / Takashima, Ryoichi / Takiguchi, Tetsuya / Ariki, Yasuo:
"Exemplar-based individuality-preserving voice conversion for articulation disorders in noisy environments",
3637-3641.
Christensen, H. / Aniol, M. B. / Bell, Peter / Green, Phil D. / Hain, Thomas / King, Simon / Swietojanski, Pawel:
"Combining in-domain and out-of-domain speech data for automatic recognition of disordered speech",
3642-3645.
Mai, Guangting / Minett, James W. / Wang, William S. -Y.:
"Effects of envelope filter cutoff frequency on the intelligibility of Mandarin noise-vocoded speech in babble noise: implications for cochlear implants",
3646-3650.
Discriminative Training Methods for Language Modeling
Schwenk, Holger:
"CSLM — a modular open-source continuous space language modeling toolkit",
1198-1202.
Shi, Yangyang / Hwang, Mei-Yuh / Yao, Kaisheng / Larson, Martha:
"Speed up of recurrent neural network language models with sentence independent subsampling stochastic gradient descent",
1203-1207.
Chang, Shuangyu / Levit, Michael / Parthasarathy, Partha / Dumoulin, Benoit:
"Improving unsupervised language model adaptation with discriminative data filtering",
1208-1212.
Kobayashi, Akio / Oku, Takahiro / Fujita, Yuya / Sato, Shoei:
"Lightly supervised training for risk-based discriminative language models",
1213-1217.
Dikici, Erinç / Prud'hommeaux, Emily / Roark, Brian / Saraçlar, Murat:
"Investigation of MT-based ASR confusion models for semi-supervised discriminative language modeling",
1218-1222.
Oba, Takanobu / Ogawa, Atsunori / Hori, Takaaki / Masataki, Hirokazu / Nakamura, Atsushi:
"Unsupervised discriminative language modeling using error rate estimator",
1223-1227.
ASR — Adaptive Training
Rath, Shakti P. / Burget, Lukáš / Karafiát, Martin / Glembek, Ondřej / Černocký, Jan:
"A region-specific feature-space transformation for speaker adaptation and singularity analysis of Jacobian matrix",
1228-1232.
Wang, Y.-Q. / Gales, M. J. F.:
"An explicit independence constraint for factorised adaptation in speech recognition",
1233-1237.
Saz, Oscar / Hain, Thomas:
"Asynchronous factorisation of speaker and background with feature transforms in speech recognition",
1238-1242.
Yu, Kai / Xu, Hainan:
"Cluster adaptive training with factorized decision trees for speech recognition",
1243-1247.
Abdel-Hamid, Ossama / Jiang, Hui:
"Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition",
1248-1252.
Kintzley, Keith / Jansen, Aren / Hermansky, Hynek:
"Text-to-speech inspired duration modeling for improved whole-word acoustic models",
1253-1257.
Speech Acquisition and Development
Gregory, Adele / Tabain, Marija / Robb, Michael:
"Duration of early vocalisations",
1258-1262.
Yang, Jing / Fox, Robert Allen:
"Acoustic development of vowel production in American English children",
1263-1267.
Moulin-Frier, Clément / Oudeyer, Pierre-Yves:
"The role of intrinsic motivations in learning sensorimotor vocal mappings: a developmental robotics study",
1268-1272.
Hazan, Valerie / Pettinato, Michèle:
"Children's timing and repair strategies for communication in adverse listening conditions",
1273-1277.
Barbier, Guillaume / Perrier, Pascal / Ménard, Lucie / Payan, Yohan / Tiede, Mark K. / Perkell, Joseph S.:
"Speech planning as an index of speech motor control maturity",
1278-1282.
Kinsman, Melissa / Li, Fangfang:
"The relationship between gender-differentiated productions of /s/ and gender role behaviour in young children",
1283-1286.
Articulatory Data Acquisition and Processing (Special Session)
Berry, Jeffrey / Fadiga, Luciano:
"Data-driven design of a sentence list for an articulatory speech corpus",
1287-1291.
Zhu, Yinghua / Toutios, Asterios / Narayanan, Shrikanth / Nayak, Krishna:
"Faster 3D vocal tract real-time MRI using constrained reconstruction",
1292-1296.
Canevari, Claudia / Badino, Leonardo / Fadiga, Luciano / Metta, Giorgio:
"Relevance-weighted-reconstruction of articulatory features in deep-neural-network-based acoustic-to-articulatory mapping",
1297-1301.
Tomaschek, Fabian / Wieling, Martijn / Arnold, Denis / Baayen, R. Harald:
"Word frequency, vowel length and vowel quality in speech production: an EMA study of the importance of experience",
1302-1306.
Silva, Samuel / Teixeira, António / Oliveira, Catarina / Martins, Paula:
"Towards a systematic and quantitative analysis of vocal tract data",
1307-1311.
Vaz, Colin / Ramanarayanan, Vikram / Narayanan, Shrikanth:
"A two-step technique for MRI audio enhancement using dictionary learning and wavelet packet analysis",
1312-1315.
Stella, Massimo / Stella, Antonio / Sigona, Francesco / Bernardini, Paolo / Grimaldi, Mirko / Fivela, Barbara Gili:
"Electromagnetic articulography with AG500 and AG501",
1316-1320.
Badin, Pierre / Vargas, Julián Andrés Valdés / Koncki, Arielle / Lamalle, Laurent / Savariaux, Christophe:
"Development and implementation of fiducial markers for vocal tract MRI imaging and speech articulatory modelling",
1321-1325.
Schötz, Susanne / Frid, Johan / Gustafsson, Lars / Löfqvist, Anders:
"Functional data analysis of tongue articulation in palatal vowels: Gothenburg and Malmöhus Swedish /iː, yː, ʉ̟ː/",
1326-1330.
Green, Jordan R. / Wang, Jun / Wilson, David L.:
"SMASH: a tool for articulatory data processing and analysis",
1331-1335.
Topics in Speech Perception and Emotion
Lin, Jen-Chun / Wu, Chung-Hsien / Wei, Wen-Li:
"Emotion recognition of conversational affective speech using temporal course modeling",
1336-1340.
Altrov, Rene / Pajupuu, Hille / Pajupuu, Jaan:
"The role of empathy in the recognition of vocal emotions",
1341-1344.
Brunellière, Angèle / Dufour, Sophie:
"Electrophysiological evidence for benefits of imitation during the processing of spoken words embedded in sentential contexts",
1345-1349.
Ogane, Rintaro / Honda, Masaaki:
"Compensatory speech response to time-scale altered auditory feedback",
1350-1354.
Nwe, Tin Lay / Nguyen, Trung Hieu / Limbu, Dilip Kumar:
"Bhattacharyya distance based emotional dissimilarity measure in multi-dimensional space for emotion classification",
1355-1359.
Prego, Thiago de M. / Lima, Amaro A. de / Netto, Sergio L.:
"On the enhancement of dereverberation algorithms based on a perceptual evaluation criterion",
1360-1364.
Gussenhoven, Carlos / Zhou, Wencui:
"Revisiting pitch slope and height effects on perceived duration",
1365-1369.
Guiraud, Hélène / Ferragne, Emmanuel / Bedoin, Nathalie / Boulenger, Véronique:
"Adaptation to natural fast speech and time-compressed speech in children",
1370-1374.
Windmann, Andreas / Šimko, Juraj / Wrede, Britta / Wagner, Petra:
"Modeling durational incompressibility",
1375-1379.
Émond, Caroline / Ménard, Lucie / Laforest, Marty:
"Perceived prosodic correlates of smiled speech in spontaneous data",
1380-1383.
Raake, Alexander / Schoenenberg, Katrin / Skowronek, Janto / Egger, Sebastian:
"Predicting speech quality based on interactivity and delay",
1384-1388.
Kouklia, Charlotte / Audibert, Nicolas:
"Perceptual, acoustic and electroglottographic correlates of 3 aggressive attitudes in French: a pilot study",
1389-1393.
Discourse and Machine Learning, Paralinguistic and Nonlinguistic Cues
Morchid, Mohamed / Linarès, Georges / El-Beze, Marc / Mori, Renato De:
"Theme identification in telephone service conversations using quaternions of speech features",
1394-1398.
Rao, Hrishikesh / Kim, Jonathan C. / Rozga, Agata / Clements, Mark A.:
"Detection of laughter in children's speech using spectral and prosodic acoustic features",
1399-1403.
Truong, Khiet P.:
"Classification of cooperative and competitive overlaps in speech using cues from the context, overlapper, and overlappee",
1404-1408.
Kim, Samuel / Valente, Fabio / Vinciarelli, Alessandro:
"Annotation and detection of conflict escalation in political debates",
1409-1413.
Schiel, Florian / Stevens, Mary / Reichel, Uwe D. / Cutugno, Francesco:
"Machine learning of probabilistic phonological pronunciation rules from the Italian CLIPS corpus",
1414-1418.
Baumeister, Barbara / Schiel, Florian:
"Human perception of alcoholic intoxication in speech",
1419-1423.
Hou, Luying / Jia, Yuan / Li, Aijun:
"Phonetic manifestation and influence of zero anaphora in Chinese reading texts",
1424-1428.
Harrat, S. / Abbas, M. / Meftouh, K. / Smaili, K.:
"Diacritics restoration for Arabic dialect texts",
1429-1433.
Włodarczak, Marcin / Wagner, Petra:
"Effects of talk-spurt silence boundary thresholds on distribution of gaps and overlaps",
1434-1437.
Kachkovskaia, Tatiana / Volskaya, Nina / Skrelin, Pavel:
"Final lengthening in Russian: a corpus-based study",
1438-1442.
Reichel, Uwe D.:
"From segmentation bootstrapping to transcription-to-word conversion",
1443-1447.
Caelen-Haumont, Geneviève / Bartkova, Katarina:
"Manual and automatic tone annotation: the case of an endangered language from North Vietnam “Mo Piu”",
1448-1452.
Leonarduzzi, Laetitia / Herment, Sophie:
"Non-canonical syntactic structures in discourse: tonality, tonicity and tones in English (semi-)spontaneous speech",
1453-1457.
Nouri, Elnaz / Park, Sunghyun / Scherer, Stefan / Gratch, Jonathan / Carnevale, Peter / Morency, Louis-Philippe / Traum, David:
"Prediction of strategy and outcome as negotiation unfolds by using basic verbal and behavioral features",
1458-1461.
Language Identification, Speaker Diarization
Poignant, Johann / Besacier, Laurent / Le, Viet Bac / Rosset, Sophie / Quénot, Georges:
"Unsupervised naming of speakers in broadcast TV: using written names, pronounced names or both?",
1462-1466.
Bredin, Hervé / Poignant, Johann:
"Integer linear programming for speaker diarization and cross-modal identification in TV broadcast",
1467-1471.
DeMarco, Andrea / Cox, Stephen J.:
"Native accent classification via i-vectors and speaker compensation fusion",
1472-1476.
Rouvier, Mickael / Dupuy, Grégor / Gay, Paul / Khoury, Elie / Merlin, Teva / Meignier, Sylvain:
"An open-source state-of-the-art toolbox for broadcast news diarization",
1477-1481.
Kons, Zvi / Toledo-Ronen, Orith:
"Audio event classification using deep neural networks",
1482-1486.
Liang, Wei-Bin / Wu, Chung-Hsien / Hsu, Chun-Shan:
"Code-switching event detection based on delta-BIC using phonetic eigenvoice models",
1487-1491.
Hirayama, Naoki / Yoshino, Koichiro / Itoyama, Katsutoshi / Mori, Shinsuke / Okuno, Hiroshi G.:
"Automatic estimation of dialect mixing ratio for dialect speech recognition",
1492-1496.
Rodríguez-Fuentes, Luis Javier / Brümmer, Niko / Penagarikano, Mikel / Varona, Amparo / Bordel, Germán / Diez, Mireia:
"The Albayzin 2012 language recognition evaluation",
1497-1501.
Han, Kyu J. / Ganapathy, Sriram / Li, Ming / Omar, Mohamed K. / Narayanan, Shrikanth:
"TRAP language identification system for RATS phase II evaluation",
1502-1506.
Lawson, Aaron / McLaren, Mitchell / Lei, Yun / Mitra, Vikramjit / Scheffer, Nicolas / Ferrer, Luciana / Graciarena, Martin:
"Improving language identification robustness to highly channel-degraded speech through multiple system fusion",
1507-1510.
Speech Synthesis — Prosody and Emotion
Kang, Yongguo / Li, Jian / Deng, Yan / Wang, Miaomiao:
"Multi-centroidal duration generation algorithm for HMM-based TTS",
1540-1543.
Raitio, Tuomo / Suni, Antti / Pohjalainen, Jouni / Airaksinen, Manu / Vainio, Martti / Alku, Paavo:
"Analysis and synthesis of shouted speech",
1544-1548.
Nagata, Tomohiro / Mori, Hiroki / Nose, Takashi:
"Robust estimation of multiple-regression HMM parameters for dimension-based expressive dialogue speech synthesis",
1549-1553.
Brognaux, Sandrine / Picart, Benjamin / Drugman, Thomas:
"A new prosody annotation protocol for live sports commentaries",
1554-1558.
Mehrabani, Mahnoosh / Mishra, Taniya / Conkie, Alistair:
"Unsupervised prominence prediction for speech synthesis",
1559-1563.
Charfuelan, Marcela / Steiner, Ingmar:
"Expressive speech synthesis in MARY TTS using audiobook data and EmotionML",
1564-1568.
Spoken Language Information Retrieval
Ward, Nigel G. / Werner, Steven D.:
"Using dialog-activity similarity for spoken information retrieval",
1569-1573.
Chen, I-Fan / Lee, Chin-Hui:
"A hybrid HMM/DNN approach to keyword spotting of short words",
1574-1578.
Wintrode, Jonathan:
"Leveraging locality for topic identification of conversational speech",
1579-1583.
Senay, Grégory / Bigot, Benjamin / Dufour, Richard / Linarès, Georges / Fredouille, Corinne:
"Person name spotting by combining acoustic matching and LDA topic models",
1584-1588.
Szaszák, György / Beke, András:
"Using phonological phrase segmentation to improve automatic keyword spotting for the highly agglutinating Hungarian language",
1589-1593.
Heck, Larry / Hakkani-Tür, Dilek / Tur, Gokhan:
"Leveraging knowledge graphs for web-scale unsupervised semantic parsing",
1594-1598.
Speaker Recognition I, II
Cumani, Sandro / Laface, Pietro:
"Fast and memory effective i-vector extraction using a factorized sub-space",
1599-1603.
Simonchik, Konstantin / Shulipa, Andrey / Pekhovsky, Timur:
"Effective estimation of a multi-session speaker model using information on signal parameters",
1604-1608.
Hautamäki, Ville / Lee, Kong Aik / Leeuwen, David A. van / Saeidi, R. / Larcher, Anthony / Kinnunen, Tomi / Hasan, Taufiq / Sadjadi, Seyed Omid / Liu, Gang / Bořil, Hynek / Hansen, John H. L. / Fauve, Benoit:
"Automatic regularization of cross-entropy cost for speaker recognition fusion",
1609-1613.
Li, Ming / Kim, Jangwon / Ghosh, Prasanta Kumar / Ramanarayanan, Vikram / Narayanan, Shrikanth:
"Speaker verification based on fusion of acoustic and articulatory information",
1614-1618.
Leeuwen, David A. van / Brümmer, Niko:
"The distribution of calibrated likelihood-ratios in speaker recognition",
1619-1623.
Kelly, Finnian / Brümmer, Niko / Harte, Naomi:
"Eigenageing compensation for speaker verification",
1624-1628.
Sarkar, A. K. / Barras, Claude:
"Anchor and UBM-based multi-class MLLR m-vector system for speaker verification",
2450-2454.
Perera, Leibny Paola Garcia / Raj, Bhiksha / Nolazco-Flores, Juan Arturo:
"Ensemble approach in speaker verification",
2455-2459.
Wang, Jun / Wang, Dong / Wu, Xiaojun / Zheng, Thomas Fang / Tejedor, Javier:
"Sequential model adaptation for speaker verification",
2460-2464.
Kanagasundaram, A. / Dean, D. / Gonzalez-Dominguez, Javier / Sridharan, S. / Ramos, D. / Gonzalez-Rodriguez, Joaquin:
"Improving short utterance based i-vector speaker recognition using source and utterance-duration normalization techniques",
2465-2469.
Aronowitz, Hagai / Barkan, Oren:
"On leveraging conversational data for building a text dependent speaker verification system",
2470-2473.
Zhang, Wei-Qiang / Li, Zhiyi / Liu, Weiwei / Liu, Jia:
"THU-EE system fusion for the NIST 2012 speaker recognition evaluation",
2474-2478.
Garcia-Romero, Daniel / McCree, Alan:
"Subspace-constrained supervector PLDA for speaker verification",
2479-2483.
Do, Cong-Thanh / Barras, Claude / Le, Viet Bac / Sarkar, A. K.:
"Augmenting short-term cepstral features with long-term discriminative features for speaker verification of telephone data",
2484-2488.
Rajan, Padmanabhan / Kinnunen, Tomi / Hanilçi, Cemal / Pohjalainen, Jouni / Alku, Paavo:
"Using group delay functions from all-pole models for speaker recognition",
2489-2493.
Portêlo, José / Abad, Alberto / Raj, Bhiksha / Trancoso, Isabel:
"Secure binary embeddings of front-end factor analysis for privacy preserving speaker verification",
2494-2498.
Taghia, Jalil / Ma, Zhanyu / Leijon, Arne:
"On von-Mises Fisher mixture model in text-independent speaker identification",
2499-2503.
Diez, Mireia / Varona, Amparo / Penagarikano, Mikel / Rodríguez-Fuentes, Luis Javier / Bordel, Germán:
"Using phone log-likelihood ratios as features for speaker recognition",
2504-2508.
Villalba, Jesús / Diez, Mireia / Varona, Amparo / Lleida, Eduardo:
"Handling recordings acquired simultaneously over multiple channels with PLDA",
2509-2513.
Fang, Xiao / Dehak, Najim / Glass, James:
"Bayesian distance metric learning on i-vector for speaker verification",
2514-2518.
Hautamäki, Rosa González / Hautamäki, Ville / Rajan, Padmanabhan / Kinnunen, Tomi:
"Merging human and automatic system decisions to improve speaker recognition performance",
2519-2523.
Multimodal Speech Perception
Erjavec, Grozdana / Legros, Denis:
"Effects of mouth-only and whole-face displays on audio-visual speech perception in noise: is the vision of a talker's full face truly the most efficient solution?",
1629-1633.
Tiippana, Kaisa / Tiainen, Mikko / Vainio, Lari / Vainio, Martti:
"Acoustic and visual phonetic features in the McGurk effect — an audiovisual speech illusion",
1634-1638.
Davis, Chris / Kim, Jeesun:
"The effect of visual speech timing and form cues on the processing of speech and nonspeech",
1639-1642.
Chandrashekara, Ganesh Attigodu / Berthommier, Frédéric / Nahorna, Olha / Schwartz, Jean-Luc:
"Effect of context, rebinding and noise, on audiovisual speech fusion",
1643-1647.
Rilliard, Albert / Erickson, Donna / Shochi, Takaaki / Moraes, João Antônio de:
"Social face to face communication — American English attitudinal prosody",
1648-1652.
Bailly, Gérard / Rochet-Capellan, Amélie / Vilain, Coriandre:
"Adaptation of respiratory patterns in collaborative reading",
1653-1657.
ASR — Feature Extraction
Tóth, László:
"Convolutional deep rectifier neural nets for phone recognition",
1722-1726.
Hirsch, Hans-Günter:
"Pitch synchronous spectral analysis for a pitch dependent recognition of voiced phonemes — PISAR",
1727-1731.
Rodríguez, José Luis Oropeza:
"New parameters for automatic speech recognition based on the mammalian cochlea model using resonance analysis",
1732-1736.
Jaitly, Navdeep / Hinton, Geoffrey E.:
"Using an autoencoder with deformable templates to discover features for automated speech recognition",
1737-1740.
Yeh, Ching-Feng / Lee, Hung-yi / Lee, Lin-shan:
"Speaking rate normalization with lattice-based context-dependent phoneme duration modeling for personalized speech recognizers on mobile devices",
1741-1745.
Qi, Jun / Wang, Dong / Tejedor, Javier:
"Subspace models for bottleneck features",
1746-1750.
Qi, Jun / Wang, Dong / Xu, Ji / Tejedor, Javier:
"Bottleneck features based on gammatone frequency cepstral coefficients",
1751-1755.
Golik, Pavel / Doetsch, Patrick / Ney, Hermann:
"Cross-entropy vs. squared error training: a theoretical and experimental comparison",
1756-1760.
Patil, Vaishali / Rao, Preeti:
"Acoustic features for detection of phonemic aspiration in voiced plosives",
1761-1765.
Palaz, Dimitri / Collobert, Ronan / Magimai-Doss, Mathew:
"Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks",
1766-1770.
Olaso, Javier Mikel / Torres, María Inés:
"Hierarchical models based on a continuous acoustic space to identify phonological features",
1771-1775.
Tomar, Vikrant Singh / Rose, Richard C.:
"Locality sensitive hashing for fast computation of correlational manifold learning based feature space transformations",
1776-1780.
Schatz, Thomas / Peddinti, Vijayaditya / Bach, Francis / Jansen, Aren / Hermansky, Hynek / Dupoux, Emmanuel:
"Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipeline",
1781-1785.
ASR — Pronunciation, Prosodic and New Paradigms
Chiang, Chen-Yu / Siniscalchi, Sabato Marco / Chen, Sin-Horng / Lee, Chin-Hui:
"Knowledge integration for improving performance in LVCSR",
1786-1790.
Heckmann, Martin:
"Inter-speaker variability in audio-visual classification of word prominence",
1791-1795.
Liu, Shilin / Sim, Khe Chai:
"Parameter clustering for temporally varying weight regression for automatic speech recognition",
1796-1800.
Alumäe, Tanel / Nemoto, Rena:
"Phone duration modeling using clustering of rich contexts",
1801-1805.
Ahmadi, Farzaneh / Ahmadi, Mousa / McLoughlin, Ian Vince:
"Human mouth state detection using low frequency ultrasound",
1806-1810.
Li, Kun / Qian, Xiaojun / Kang, Shiyin / Meng, Helen:
"Lexical stress detection for L2 English speech using deep belief networks",
1811-1815.
Qian, Yanmin / Liu, Jia:
"MLP-HMM two-stage unsupervised training for low-resource languages on conversational telephone speech recognition",
1816-1820.
Novak, Josef R. / Minematsu, Nobuaki / Hirose, Keikichi:
"Failure transitions for joint n-gram models and G2P conversion",
1821-1825.
Kameoka, Hirokazu / Yoshizato, Kota / Ishihara, Tatsuma / Ohishi, Yasunori / Kashino, Kunio / Sagayama, Shigeki:
"Generative modeling of speech F0 contours",
1826-1830.
Davel, Marelie H. / Heerden, Charl van / Barnard, Etienne:
"G2P variant prediction techniques for ASR and STD",
1831-1835.
Jin, Jin / Tepperman, Joseph:
"Rhythm analysis of second-language speech through low-frequency auditory features",
1836-1839.
Liu, Yuzong / Kirchhoff, Katrin:
"Graph-based semi-supervised learning for phone and segment classification",
1840-1843.
Shen, Ao / Cooke, Neil / Russell, Martin:
"Selective use of gaze information to improve ASR performance in noisy environments by cache-based class language model adaptation",
1844-1848.
Abdel-Hamid, Ossama / Deng, Li / Yu, Dong / Jiang, Hui:
"Deep segmental neural networks for speech recognition",
1849-1853.
Coene, Martine / Hammer, Annemiek / Kowalczyk, Wojtek / Bosch, Louis ten / Vaerenberg, Bart / Govaerts, Paul J.:
"Quantifying cross-linguistic variation in grapheme-to-phoneme mapping",
1854-1857.
Dialog Systems
Kawahara, Tatsuya / Hayashi, Soichiro / Takanashi, Katsuya:
"Estimation of interest and comprehension level of audience through multi-modal behaviors in poster conversations",
1882-1885.
Hu, Wenping / Qian, Yao / Soong, Frank K.:
"A new DNN-based high quality pronunciation evaluation for computer-aided language learning (CALL)",
1886-1890.
Planells, Joaquin / Hurtado, Lluís-F. / Segarra, Encarna / Sanchis, Emilio:
"A multi-domain dialog system to integrate heterogeneous spoken dialog systems",
1891-1895.
Todo, Yuki / Nishimura, Ryota / Yamamoto, Kazumasa / Nakagawa, Seiichi:
"Development and evaluation of spoken dialog systems with one or two agents",
1896-1900.
Skantze, Gabriel / Oertel, Catharine / Hjalmarsson, Anna:
"User feedback in human-robot interaction: prosody, gaze and timing",
1901-1905.
Xi, Yongxin Taylor / Paulik, Matthias / Gadde, Venkata Ramana / Sankar, Ananth:
"KPCatcher — a keyphrase extraction system for enterprise videos",
1906-1910.
ASR — Pronunciation Variants and Modeling
Song, Meixu / Zhang, Qingqing / Pan, Jielin / Yan, Yonghong:
"Discriminative pronunciation modeling based on minimum phone error training",
1941-1945.
Kubo, Keigo / Sakti, Sakriani / Neubig, Graham / Toda, Tomoki / Nakamura, Satoshi:
"Grapheme-to-phoneme conversion based on adaptive regularization of weight vectors",
1946-1950.
Naghibi, Tofigh / Hoffmann, Sarah / Pfister, Beat:
"An efficient method to estimate pronunciation from multiple utterances",
1951-1955.
Basson, Willem D. / Davel, Marelie H.:
"Category-based phoneme-to-grapheme transliteration",
1956-1960.
Jyothi, Preethi / Fosler-Lussier, Eric / Livescu, Karen:
"Discriminative training of WFST factors with application to pronunciation modeling",
1961-1965.
Karanasou, Penny / Yvon, François / Lavergne, Thomas / Lamel, Lori:
"Discriminative training of a phoneme confusion model for a dynamic lexicon in ASR",
1966-1970.
Speaker Recognition Evaluation
Greenberg, Craig S. / Stanford, Vincent M. / Martin, Alvin F. / Yadagiri, Meghana / Doddington, George R. / Godfrey, John J. / Hernandez-Cordero, Jaime:
"The 2012 NIST speaker recognition evaluation",
1971-1975.
Brümmer, Niko / Doddington, George R.:
"Likelihood-ratio calibration using prior-weighted proper scoring rules",
1976-1980.
Ferrer, Luciana / McLaren, Mitchell / Scheffer, Nicolas / Lei, Yun / Graciarena, Martin / Mitra, Vikramjit:
"A noise-robust system for NIST 2012 speaker recognition evaluation",
1981-1985.
aizhou / Hansen, John H. L. / Bonastre, Jean-Francois / Marcel, S. / Mason, John S. D. / Ambikairajah, Eliathamby:
"I4U submission to NIST SRE 2012: a large-scale collaborative effort for noise-robust speaker verification",
1986-1990.
Sun, Hanwu / Ma, Bin:
"Improved unsupervised NAP training dataset design for speaker recognition",
1991-1995.
Colibro, Daniele / Vair, Claudio / Farrell, Kevin / Krause, Nir / Karvitsky, Gennady / Cumani, Sandro / Laface, Pietro:
"Nuance - Politecnico di Torino's 2012 NIST speaker recognition evaluation system",
1996-2000.
Physiology and Models of Speech Production
Chen, Gang / Garellek, Marc / Kreiman, Jody / Gerratt, Bruce R. / Alwan, Abeer:
"A perceptually and physiologically motivated voice source model",
2001-2005.
Smith, Caitlin / Proctor, Michael / Iskarous, Khalil / Goldstein, Louis / Narayanan, Shrikanth:
"Stable articulatory tasks and their variable formation: Tamil retroflex consonants",
2006-2009.
Ramanarayanan, Vikram / Lammert, Adam / Goldstein, Louis / Narayanan, Shrikanth:
"Articulatory settings facilitate mechanically advantageous motor control of vocal tract articulators",
2010-2013.
Rochet-Capellan, Amélie / Fuchs, Susanne:
"The interplay of linguistic structure and breathing in German spontaneous speech",
2014-2018.
Arai, Takayuki:
"Physical models of the vocal tract with a flapping tongue for flap and liquid sounds",
2019-2023.
Laprie, Yves / Loosvelt, Matthieu / Maeda, Shinji / Sock, Rudolph / Hirsch, Fabrice:
"Articulatory copy synthesis from cine X-ray films",
2024-2028.
Speech Science in End-User Applications
Bellegarda, Jerome R.:
"Large-scale personal assistant technology deployment: the Siri experience",
2029-2033.
Weiss, Benjamin / Willkomm, Simon / Möller, Sebastian:
"Evaluating an adaptive dialog system for the public",
2034-2038.
Gemmeke, Jort F. / Ons, Bart / Tessema, Netsanet / Van hamme, Hugo / Loo, Janneke van de / Pauw, Guy De / Daelemans, Walter / Huyghe, Jonathan / Derboven, Jan / Vuegen, Lode / Broeck, Bert Van Den / Karsmakers, Peter / Vanrumste, Bart:
"Self-taught assistive vocal interfaces: an overview of the ALADIN project",
2039-2043.
Eyben, Florian / Weninger, Felix / Schuller, Björn:
"Affect recognition in real-life acoustic conditions — a new perspective on feature selection",
2044-2048.
Principi, Emanuele / Squartini, Stefano / Piazza, Francesco / Fuselli, Danilo / Bonifazi, Maurizio:
"A distributed system for recognizing home automation commands and distress calls in the Italian language",
2049-2053.
Zinovieva, Nina / Zhuang, Xiaodan / Peterson, Pat / Alwan, Joe / Prasad, Rohit:
"Probabilistic trainable segmenter for call center audio using multiple features",
2054-2058.
Burkhardt, Felix / Nägeli, Hans Ulrich:
"Voice search in mobile applications and the use of linked open data",
2059-2061.
Vacher, Michel / Lecouteux, Benjamin / Istrate, Dan / Joubert, Thierry / Portet, François / Sehili, Mohamed / Chahuara, Pedro:
"Evaluation of a real-time voice order recognition system from multiple audio channels in a home",
2062-2064.
Aman, Frédéric / Vacher, Michel / Rossato, Solange / Portet, François:
"In-home detection of distress calls: the case of aged users",
2065-2067.
Liu, Ding / Cheung, Anthea / Margolis, Anna / Redmond, Patrick / Suh, Jun-won / Wang, Chao:
"Data driven methods for utterance semantic tagging",
2068-2070.
Gouvêa, E. / Moreno-Daniel, A. / Reddy, A. / Chengalvarayan, R. / Thomson, D. / Ljolje, A.:
"The AT&T speech API: a study on practical challenges for customized speech to text service",
2071-2073.
D'hoore, Bart / Wiesen, Alfred:
"In-vehicle destination entry by voice: practical aspects",
2074-2076.
Perception of Non Native Sounds
Gautreau, Aurore / Hoen, Michel / Meunier, Fanny:
"Intelligibility at a multilingual cocktail party: effect of concurrent language knowledge",
2077-2080.
Jacewicz, Ewa / Fox, Robert Allen:
"Regional accents affect speech intelligibility in a multitalker environment",
2081-2085.
Tokuma, Shinichi / Tokuma, Won:
"Perception of English minimal pairs in noise by Japanese listeners: does clear speech for L2 listeners help?",
2086-2090.
Sisinni, Bianca / Escudero, Paola / Grimaldi, Mirko:
"Salento Italian listeners' perception of American English vowels",
2091-2094.
Rauber, Andréia Schurt / Rato, Anabela / Kluge, Denise Cristina / Santos, Giane Rodrigues dos:
"TP 3.1 software: a tool for designing audio, visual, and audiovisual perceptual training tasks and perception tests",
2095-2098.
Chen, Fei / Li, Junfeng / Wong, Lena L. N. / Yan, Yonghong:
"Effect of linguistic masker on the intelligibility of Mandarin sentences",
2099-2102.
Moon, Kyuwon / Sumner, Meghan:
"The learning and generalization of contrasts consistent or inconsistent with native biases",
2103-2107.
Ying, Jia / Shaw, Jason A. / Best, Catherine T.:
"L2 English learners' recognition of words spoken in familiar versus unfamiliar English accents",
2108-2112.
Wong, Janice Wing Sze:
"The effects of perceptual and/or productive training on the perception and production of English vowels /ɪ/ and /iː/ by Cantonese ESL learners",
2113-2117.
Kartushina, Natalia / Frauenfelder, Ulrich Hans:
"On the role of L1 speech production in L2 perception: evidence from Spanish learners of French",
2118-2122.
Hallé, Pierre / Kartushina, Natalia / Segui, Juan / Frauenfelder, Ulrich Hans:
"Looking for lexical feedback effects in /tl/→/kl/ repairs",
2123-2127.
Best, Catherine T. / Shaw, Jason A. / Clancy, Elizabeth:
"Recognizing words across regional accents: the role of perceptual assimilation in lexical competition",
2128-2132.
Speech Disorders — Data and Methodology
Martínez, David / Green, Phil D. / Christensen, H.:
"Dysarthria intelligibility assessment in a factor analysis total variability space",
2133-2137.
Ghio, Alain / Gasquet-Cyrus, Médéric / Roquel, Juliette / Giovanni, Antoine:
"Perceptual interference between regional accent and voice/speech disorders",
2138-2142.
Balčiūnienė, Ingrida:
"Linguistic disfluency in narrative speech: evidence from story-telling in 6-year olds",
2143-2146.
Munson, Benjamin:
"Assessing the utility of judgments of children's speech production made by untrained listeners in uncontrolled listening environments",
2147-2151.
Antolík, Tanja Kocjančič / Fougeron, Cécile:
"Consonant distortions in dysarthria due to Parkinson's disease, amyotrophic lateral sclerosis and cerebellar ataxia",
2152-2156.
Verdurand, Marine / Rossato, Solange / Granjon, Lionel / Balbo, Daria / Zmarich, Claudio:
"Study of coarticulation and F2 transitions in French and Italian adult stutterers",
2157-2161.
Clapham, Renee P. / As-Brooks, Corina J. Van / Brekel, Michiel W. M. Van den / Hilgers, Frans J. M. / Son, Rob J. J. H. Van:
"Automatic tracheoesophageal voice typing using acoustic parameters",
2162-2166.
Mauclair, Julie / Koenig, Lionel / Robert, Marina / Gatignol, Peggy:
"Burst-based features for the classification of pathological voices",
2167-2171.
Helfer, Brian S. / Quatieri, Thomas F. / Williamson, James R. / Mehta, Daryush D. / Horwitz, Rachelle / Yu, Bea:
"Classification of depression state based on articulatory precision",
2172-2176.
Fraser, Kathleen C. / Rudzicz, Frank / Rochon, Elizabeth:
"Using text and acoustic features to diagnose progressive aphasia and its subtypes",
2177-2181.
Search and Computational Issues in LVCSR
Alumäe, Tanel:
"Multi-domain neural network language model",
2182-2186.
Long, Y. / Gales, M. J. F. / Lanchantin, P. / Liu, X. / Seigel, M. S. / Woodland, P. C.:
"Improving lightly supervised training for broadcast transcription",
2187-2191.
Cerisara, C. / Lorenzo, A. / Kral, P.:
"Weakly supervised parsing with rules",
2192-2196.
Nussbaum-Thom, Markus / Beck, Eugen / Alkhouli, Tamer / Schlüter, Ralf / Ney, Hermann:
"Relative error bounds for statistical classifiers based on the f-divergence",
2197-2201.
Premkumar, Melvin Jose Johnson / Vu, Ngoc Thang / Schultz, Tanja:
"Experiments towards a better LVCSR system for Tamil",
2202-2206.
Thangthai, Kwanchiva / Chotimongkol, Ananlada / Wutiwiwatchai, Chai:
"A hybrid language model for open-vocabulary Thai LVCSR",
2207-2211.
Chien, Jen-Tzung / Chang, Ying-Lan:
"Hierarchical Pitman-Yor and Dirichlet process for language model",
2212-2216.
Asami, Taichi / Kobashikawa, Satoshi / Masataki, Hirokazu / Yoshioka, Osamu / Takahashi, Satoshi:
"Unsupervised confidence calibration using examples of recognized words and their contexts",
2217-2221.
Tüske, Zoltán / Schlüter, Ralf / Ney, Hermann:
"Multilingual hierarchical MRASTA features for ASR",
2222-2226.
Chang, Harry M.:
"Heuristic selection of training sentences from historical TV guide for semi-supervised LM adaptation",
2227-2231.
Fohr, Dominique / Mella, Odile:
"Combination of random indexing based language model and n-gram language model for speech recognition",
2232-2236.
Miao, Yajie / Metze, Florian:
"Improving low-resource CD-DNN-HMM using dropout and multilingual DNN training",
2237-2241.
Qin, Long / Rudnicky, Alexander:
"Finding recurrent out-of-vocabulary words",
2242-2246.
Chiu, Justin / Rudnicky, Alexander:
"Using conversational word bursts in spoken term detection",
2247-2251.
Speech and Hearing Disorders
Acher, Audrey / Sato, Marc / Lamalle, Laurent / Vilain, Coriandre / Attye, Arnaud / Krainik, Alexandre / Bettega, Georges / Righini, Christian Adrien / Carlot, Brice / Brix, Muriel / Perrier, Pascal:
"Brain activations in speech recovery process after intra-oral surgery: an fMRI study",
2252-2256.
Mertens, Christophe / Schoentgen, Jean / Grenez, Francis / Skodda, Sabine:
"Acoustic and perceptual analysis of vocal tremor",
2257-2261.
Tantibundhit, C. / Onsuwan, C. / Klangpornkun, N. / Phienphanich, P. / Saimai, T. / Saimai, N. / Pitathawatchai, P. / Wutiwiwatchai, Chai:
"Lexical tone perception in Thai normal-hearing adults and those using hearing aids: a case study",
2262-2266.
Kagomiya, Takayuki / Nakagawa, Seiji:
"Evaluation of a bone-conducted ultrasonic hearing aid in vocal emotion transmission",
2267-2271.
Garrapa, Luigia / Bottari, Davide / Grimaldi, Mirko / Pavani, Francesco / Calabrese, Andrea / Benedetto, Michele De / Vitale, Silvano:
"Processing of /i/ and /u/ in Italian cochlear-implant children: a behavioral and neurophysiologic study",
2272-2276.
Cosentino, Stefano / Falk, Tiago H. / McAlpine, David:
"Predicting the bilateral advantage in cochlear implantees using a non-intrusive speech intelligibility measure",
2277-2281.
Speech and Audio Segmentation
Huang, Zhen / Cheng, You-Chi / Li, Kehuang / Hautamäki, Ville / Lee, Chin-Hui:
"A blind segmentation approach to acoustic event detection based on i-vector",
2282-2286.
Vuuren, Van Zyl van / Bosch, Louis ten / Niesler, Thomas:
"A dynamic programming framework for neural network-based automatic speech segmentation",
2287-2291.
Prasad, RaviShankar / Yegnanarayana, B.:
"Acoustic segmentation of speech using zero time liftering (ZTL)",
2292-2296.
Wang, Haipeng / Lee, Tan / Leung, Cheung-Chi / Ma, Bin / Li, Haizhou:
"Unsupervised mining of acoustic subword units with segment-level Gaussian posteriorgrams",
2297-2301.
Kalinli, Ozlem:
"Combination of auditory attention features with phone posteriors for better automatic phoneme segmentation",
2302-2305.
Yuan, Jiahong / Ryant, Neville / Liberman, Mark / Stolcke, Andreas / Mitra, Vikramjit / Wang, Wen:
"Automatic phonetic segmentation using boundary models",
2306-2310.
Speech Synthesis — Various Topics
Nguyen, Thi Thu Trang / D'Alessandro, Christophe / Rilliard, Albert / Tran, Do Dat:
"HMM-based TTS for Hanoi Vietnamese: issues in design and evaluation",
2311-2315.
Raitio, Tuomo / Kane, John / Drugman, Thomas / Gobl, Christer:
"HMM-based synthesis of creaky voice",
2316-2320.
Wang, Xiaoxuan / Sim, Khe Chai:
"Integrating conditional random fields and joint multi-gram model with syllabic features for grapheme-to-phone conversion",
2321-2325.
Lehnen, Patrick / Allauzen, Alexandre / Lavergne, Thomas / Yvon, François / Hahn, Stefan / Ney, Hermann:
"Structure learning in hidden conditional random fields for grapheme-to-phoneme conversion",
2326-2330.
Stan, Adriana / Watts, O. / Mamiya, Y. / Giurgiu, M. / Clark, Robert A. J. / Yamagishi, Junichi / King, Simon:
"TUNDRA: a multilingual corpus of found data for TTS research created with light supervision",
2331-2335.
Maia, Ranniery / Gales, M. J. F. / Stylianou, Yannis / Akamine, Masami:
"Minimum mean squared error based warped complex cepstrum analysis for statistical parametric speech synthesis",
2336-2340.
ASR — Discriminative Training
Hifny, Yasser:
"Augmented conditional random fields modeling based on discriminatively trained features",
2341-2344.
Veselý, Karel / Ghoshal, Arnab / Burget, Lukáš / Povey, Daniel:
"Sequence-discriminative training of deep neural networks",
2345-2349.
Zhang, Weibin / Fung, Pascale:
"Discriminatively trained sparse inverse covariance matrices for low resource acoustic modeling",
2350-2354.
Tachioka, Yuuki / Watanabe, Shinji:
"Discriminative training of acoustic models for system combination",
2355-2359.
Huang, Yan / Yu, Dong / Gong, Yifan / Liu, Chaojun:
"Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration",
2360-2364.
Xue, Jian / Li, Jinyu / Gong, Yifan:
"Restructuring of deep neural network acoustic models with singular value decomposition",
2365-2369.
L2 Acquisition, Multilingualism
Chen, Nancy F. / Shivakumar, Vivaek / Harikumar, Mahesh / Ma, Bin / Li, Haizhou:
"Large-scale characterization of Mandarin pronunciation errors made by native speakers of European languages",
2370-2374.
Delvaux, Véronique / Huet, Kathy / Piccaluga, Myriam / Harmegnies, Bernard:
"Production training in second language acquisition: a comparison between objective measures and subjective judgments",
2375-2379.
Netelenbos, Nicole / Li, Fangfang:
"The production and perception of voice onset time in English-speaking children enrolled in a French immersion program",
2380-2384.
Burgos, Pepi / Cucchiarini, Catia / Hout, Roeland van / Strik, Helmer:
"Pronunciation errors by Spanish learners of Dutch: a data-driven study for ASR-based pronunciation training",
2385-2389.
Graham, Calbert / Post, Brechtje:
"Realisation of tonal alignment in the English of Japanese-English late bilinguals",
2390-2394.
Benoist-Lucy, Agathe / Pillot-Loiseau, Claire:
"The influence of language and speech task upon creaky voice use among six young American women learning French",
2395-2399.
Child Computer Interaction (Special Session)
Bone, Daniel / Lee, Chi-Chun / Chaspari, Theodora / Black, Matthew P. / Williams, Marian E. / Lee, Sungbok / Levitt, Pat / Narayanan, Shrikanth:
"Acoustic-prosodic, turn-taking, and language cues in child-psychologist interactions for varying social demand",
2400-2404.
Bořil, Hynek / Zhang, Qian / Angkititrakul, Pongtep / Hansen, John H. L. / Xu, Dongxin / Gilkerson, Jill / Richards, Jeffrey A.:
"A preliminary study of child vocalization on a parallel corpus of US and Shanghainese toddlers",
2405-2409.
Claus, Felix / Rosales, Hamurabi Gamboa / Petrick, Rico / Hain, Horst-Udo / Hoffmann, Rüdiger:
"A survey about databases of children's speech",
2410-2414.
Kouloumenta, Vassiliki / Perakakis, Manolis / Potamianos, Alexandros:
"Affective evaluation of multimodal dialogue games for preschoolers using physiological signals",
2415-2419.
Alam, Md. Jahangir / Attabi, Yazid / Dumouchel, Pierre / Kenny, Patrick / O'Shaughnessy, Douglas:
"Amplitude modulation features for emotion recognition from speech",
2420-2424.
Bone, Daniel / Lee, Chi-Chun / Ramanarayanan, Vikram / Narayanan, Shrikanth / Hoedemaker, Renske S. / Gordon, Peter C.:
"Analyzing eye-voice coordination in rapid automatized naming",
2425-2429.
Chaspari, Theodora / Provost, Emily Mower / Narayanan, Shrikanth:
"Analyzing the structure of parent-moderated narratives from children with ASD using an entity-based approach",
2430-2434.
Evanini, Keelan / Wang, Xinhao:
"Automated speech scoring for non-native middle school students with multiple task types",
2435-2439.
Safavi, Saeid / Jančovič, Peter / Russell, Martin / Carey, Michael:
"Identification of gender from children's speech by computers and humans",
2440-2444.
Arai, Takayuki:
"On why Japanese /r/ sounds are difficult for children to acquire",
2445-2449.
Dialog Systems and Applications I, II
Yao, Kaisheng / Zweig, Geoffrey / Hwang, Mei-Yuh / Shi, Yangyang / Yu, Dong:
"Recurrent neural networks for language understanding",
2524-2528.
Riedhammer, Korbinian / Do, Van Hai / Hieronymus, James:
"A study on LVCSR and keyword search for Tagalog",
2529-2533.
Alghowinem, Sharifa / Goecke, Roland / Wagner, Michael / Epps, Julien / Parker, Gordon / Breakspear, Michael:
"Characterising depressed speech for classification",
2534-2538.
Bigot, Benjamin / Senay, Grégory / Linarès, Georges / Fredouille, Corinne / Dufour, Richard:
"Combining acoustic name spotting and continuous context models to improve spoken person name recognition in speech",
2539-2543.
Chen, I-Fan / Lee, Chin-Hui:
"A resource-dependent approach to word modeling for keyword spotting",
2544-2548.
Womack, Kathryn / Alm, Cecilia Ovesdotter / Calvelli, Cara / Pelz, Jeff B. / Shi, Pengcheng / Haake, Anne:
"Markers of confidence and correctness in spoken medical narratives",
2549-2553.
Nakamura, Ibuki / Minematsu, Nobuaki / Suzuki, Masayuki / Hirano, Hiroko / Nakagawa, Chieko / Nakamura, Noriko / Tagawa, Yukinori / Hirose, Keikichi / Hashimoto, Hiroya:
"Development of a web framework for teaching and learning Japanese prosody: OJAD (online Japanese accent dictionary)",
2554-2558.
Shriberg, Elizabeth / Stolcke, Andreas / Ravuri, Suman:
"Addressee detection for dialog systems using temporal and spectral dimensions of speaking style",
2559-2563.
Hatano, Hiroaki / Kiso, Miyako / Ishi, Carlos T.:
"Analysis of factors involved in the choice of rising or non-rising intonation in question utterances appearing in conversational speech",
2564-2568.
Celikyilmaz, Asli / Tur, Gokhan / Hakkani-Tür, Dilek:
"IsNL? a discriminative approach to detect natural language like queries for conversational understanding",
2569-2573.
Cheng, Jian / Bojja, Nikhil / Chen, Xin:
"Automatic accent quantification of Indian speakers of English",
2574-2578.
Tur, Gokhan / Deoras, Anoop / Hakkani-Tür, Dilek:
"Semantic parsing using word confusion networks with conditional random fields",
2579-2583.
Strömbergsson, Sofia / Hjalmarsson, Anna / Edlund, Jens / House, David:
"Timing responses to questions in dialogue",
2584-2588.
Karafiát, Martin / Grézl, František / Hannemann, Mirko / Veselý, Karel / Černocký, Jan:
"BUT BABEL system for spontaneous Cantonese",
2589-2593.
Norouzian, Atta / Rose, Richard C. / Jansen, Aren:
"Semi-supervised manifold learning approaches for spoken term verification",
2594-2598.
Li, Ying / Fung, Pascale:
"Language modeling for mixed language speech recognition using weighted phrase extraction",
2599-2603.
Strömbergsson, Sofia / Tånnander, Christina:
"Correlates to intelligibility in deviant child speech — comparing clinical evaluations to audience response system-based evaluations by untrained listeners",
3717-3721.
Womack, Kathryn / Alm, Cecilia Ovesdotter / Calvelli, Cara / Pelz, Jeff B. / Shi, Pengcheng / Haake, Anne:
"Using linguistic analysis to characterize conceptual units of thought in spoken medical narratives",
3722-3726.
Cutugno, Francesco / Finzi, Alberto / Fiore, Michelangelo / Leone, Enrico / Rossi, Silvia:
"Interacting with robots via speech and gestures, an integrated architecture",
3727-3731.
Hatmi, Mohamed / Jacquin, Christine / Morin, Emmanuel / Meignier, Sylvain:
"Incorporating named entity recognition into the speech transcription process",
3732-3736.
Ohno, Teppei / Akiba, Tomoyosi:
"DTW-distance-ordered spoken term detection",
3737-3741.
Jung, Sangkeun / Na, Seung-Hoon:
"Refining sentence similarity with discourse information in dialog system",
3742-3746.
Nakatani, Ryohei / Takiguchi, Tetsuya / Ariki, Yasuo:
"Two-step correction of speech recognition errors based on n-gram and long contextual information",
3747-3750.
Negi, Sumit / Balasubramanyan, Ramnath / Chaudhury, Santanu:
"Inferring actor communities from videos",
3751-3755.
Bost, Xavier / El-Beze, Marc / Mori, Renato De:
"Multiple topic identification in telephone conversations",
3756-3760.
Chen, Wei / Ananthakrishnan, Sankaranarayanan / Prasad, Rohit / Natarajan, Prem:
"Variable-span out-of-vocabulary named entity detection",
3761-3765.
Kun, Andrew L. / Palinko, Oskar / Medenica, Zeljko / Heeman, Peter A.:
"On the feasibility of using pupil diameter to estimate cognitive load changes for in-vehicle spoken dialogues",
3766-3770.
Mesnil, Grégoire / He, Xiaodong / Deng, Li / Bengio, Yoshua:
"Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding",
3771-3775.
Liu, Xiaohu / Sarikaya, Ruhi / Brockett, Chris / Quirk, Chris / Dolan, William B.:
"Paraphrase features to improve natural language understanding",
3776-3779.
Hakkani-Tür, Dilek / Celikyilmaz, Asli / Heck, Larry / Tur, Gokhan:
"A weakly-supervised approach for discovering new user intents from search query logs",
3780-3784.
Xu, Puyang / Sarikaya, Ruhi:
"Exploiting shared information for multi-intent natural language sentence classification",
3785-3789.
Spoken Machine Translation and Speech Natural Language Processing I, II
Skowronek, Janto / Herlinghaus, Julian / Raake, Alexander:
"Quality assessment of asymmetric multiparty telephone conferences: a systematic method from technical degradations to perceived impairments",
2604-2608.
Imoto, Keisuke / Shimauchi, Suehiro / Uematsu, Hisashi / Ohmuro, Hitoshi:
"User activity estimation method based on probabilistic generative model of acoustic event sequence with user activity and its subordinate categories",
2609-2613.
Kano, Takatomo / Takamichi, Shinnosuke / Sakti, Sakriani / Neubig, Graham / Toda, Tomoki / Nakamura, Satoshi:
"Generalizing continuous-space translation of paralinguistic information",
2614-2618.
Ohgushi, Masaya / Neubig, Graham / Sakti, Sakriani / Toda, Tomoki / Nakamura, Satoshi:
"An empirical comparison of joint optimization techniques for speech translation",
2619-2623.
Ostendorf, Mari / Hahn, Sangyun:
"A sequential repetition model for improved disfluency detection",
2624-2628.
Medeiros, Henrique / Moniz, Helena / Batista, Fernando / Trancoso, Isabel / Nunes, Luis:
"Disfluency detection based on prosodic features for university lectures",
2629-2633.
Meyer, Bernd T.:
"What's the difference? comparing humans and machines on the Aurora 2 speech recognition task",
2634-2638.
Gubian, Michele / Boves, Lou / Versteegh, Maarten:
"Calibration of distance measures for unsupervised query-by-example",
2639-2643.
Castan, Diego / Akbacak, Murat:
"Indexing multimedia documents with acoustic concept recognition lattices",
2644-2648.
Kousidis, Spyros / Pfeiffer, Thies / Schlangen, David:
"MINT.tools: tools and adaptors supporting acquisition, annotation and analysis of multimodal corpora",
2649-2653.
Favre, Benoit / Cheung, Kyla / Kazemian, Siavash / Lee, Adam / Liu, Yang / Munteanu, Cosmin / Nenkova, Ani / Ochei, Dennis / Penn, Gerald / Tratz, Stephen / Voss, Clare / Zeller, Frauke:
"Automatic human utility evaluation of ASR systems: does WER really predict performance?",
3463-3467.
Sridhar, Vivek Kumar Rangarajan / Chen, John / Bangalore, Srinivas:
"Corpus analysis of simultaneous interpretation data for improving real time speech translation",
3468-3472.
Cho, Eunah / Fügen, Christian / Hermann, Teresa / Kilgour, Kevin / Mediani, Mohammed / Mohr, Christian / Niehues, Jan / Rottmann, Kay / Saam, Christian / Stüker, Sebastian / Waibel, Alex:
"A real-world system for simultaneous translation of German lectures",
3473-3477.
Wu, Dekai / Addanki, Karteek / Saers, Markus:
"Freestyle: a challenge-response system for hip hop lyrics via unsupervised induction of stochastic transduction grammars",
3478-3482.
Tsiartas, Andreas / Georgiou, Panayiotis G. / Narayanan, Shrikanth:
"Toward transfer of acoustic cues of emphasis across languages",
3483-3486.
Fujita, Tomoki / Neubig, Graham / Sakti, Sakriani / Toda, Tomoki / Nakamura, Satoshi:
"Simple, lexicalized choice of translation timing for simultaneous speech translation",
3487-3491.
Language Model Adaptation
Haidar, Md. Akmal / O'Shaughnessy, Douglas:
"Fitting long-range information using interpolated distanced n-grams and cache models into a latent Dirichlet language model for speech recognition",
2678-2682.
Chen, Yi-Wen / Hao, Bo-Han / Chen, Kuan-Yu / Chen, Berlin:
"Incorporating proximity information for relevance language modeling in speech recognition",
2683-2687.
Bayer, Ali Orkan / Riccardi, G.:
"Instance-based on-line language model adaptation",
2688-2692.
Mansikkaniemi, André / Kurimo, Mikko:
"Unsupervised topic adaptation for morph-based speech recognition",
2693-2697.
Schlippe, Tim / Gren, Lukasz / Vu, Ngoc Thang / Schultz, Tanja:
"Unsupervised language model adaptation for automatic speech recognition of broadcast news using Web 2.0",
2698-2702.
Wen, Tsung-Hsien / Heidel, Aaron / Lee, Hung-yi / Tsao, Yu / Lee, Lin-shan:
"Recurrent neural network based language model personalization by social network crowdsourcing",
2703-2707.
Spoken Language Summarization and Understanding
Ayadi, Moataz El / Afify, Mohamed:
"Language-independent call routing using the large margin estimation principle",
2708-2712.
Deoras, Anoop / Sarikaya, Ruhi:
"Deep belief network based semantic taggers for spoken language understanding",
2713-2717.
Jabaian, Bassam / Lefèvre, Fabrice:
"Error-corrective discriminative joint decoding of automatic spoken language transcription and understanding",
2718-2722.
Lai, Catherine / Carletta, Jean / Renals, Steve:
"Detecting summarization hot spots in meetings using group level involvement and turn-taking features",
2723-2727.
Shiang, Sz-Rung / Lee, Hung-yi / Lee, Lin-shan:
"Supervised spoken document summarization based on structured support vector machine with utterance clusters as hidden variables",
2728-2732.
Klasinas, Ioannis / Potamianos, Alexandros / Iosif, Elias / Georgiladakis, Spiros / Mameli, Gianluca:
"Web data harvesting for speech understanding grammar induction",
2733-2737.
Speech Synthesis — Multimodal and Articulatory Synthesis
Toutios, Asterios / Narayanan, Shrikanth:
"Articulatory synthesis of French connected speech from EMA data",
2738-2742.
Zhang, Xinjian / Wang, Lijuan / Li, Gang / Seide, Frank / Soong, Frank K.:
"A new language independent, photo-realistic talking head driven by voice only",
2743-2747.
Wang, Chaoyang / Wang, Lijuan / Matsushita, Yasuyuki / Huang, Bojun / Chen, Magnetro / Soong, Frank K.:
"Binocular photometric stereo acquisition and reconstruction for 3D talking head applications",
2748-2752.
Hueber, Thomas / Bailly, Gérard / Badin, Pierre / Elisei, Frédéric:
"Speaker adaptation of an acoustic-articulatory inversion model using cascaded Gaussian mixture regressions",
2753-2757.
Ben-Youssef, Atef / Shimodaira, Hiroshi / Braude, David Adam:
"Articulatory features for speech-driven head motion synthesis",
2758-2762.
Braude, David Adam / Shimodaira, Hiroshi / Ben-Youssef, Atef:
"Template-warping based speech driven head motion synthesis",
2763-2767.
Speaker Diarization and Recognition
Larcher, Anthony / Bonastre, Jean-Francois / Fauve, Benoit / Lee, Kong Aik / Lévy, Christophe / Li, Haizhou / Mason, John S. D. / Parfait, Jean-Yves:
"ALIZE 3.0 — open source toolkit for state-of-the-art speaker recognition",
2768-2772.
Senoussaoui, Mohammed / Kenny, Patrick / Dumouchel, Pierre / Dehak, Najim:
"New cosine similarity scorings to implement gender-independent speaker verification",
2773-2777.
Charlet, Delphine / Fredouille, Corinne / Damnati, Géraldine / Senay, Grégory:
"Improving speaker identification in TV-shows using person name detection in overlaid text and speech",
2778-2782.
Knox, Mary Tai / Mirghafori, Nikki / Friedland, Gerald:
"Exploring methods of improving speaker accuracy for speaker diarization",
2783-2787.
Price, Ryan / Biswas, Sangeeta / Shinoda, Koichi:
"Combining deep speaker specific representations with GMM-SVM for speaker verification",
2788-2792.
Schindler, Carola / Draxler, Christoph:
"Using spectral moments as a speaker specific feature in nasals and fricatives",
2793-2796.
Models of Speech Perception
Laurent, Raphaël / Schwartz, Jean-Luc / Bessière, Pierre / Diard, Julien:
"A computational model of perceptuo-motor processing in speech perception: learning to imitate and categorize synthetic CV syllables",
2797-2801.
Theodore, Rachel M.:
"Talker-specific perceptual processing: influences on internal category structure",
2802-2806.
Lecumberri, Maria Luisa García / Tóth, Attila Máté / Tang, Yan / Cooke, Martin:
"Elicitation and analysis of a corpus of robust noise-induced word misperceptions in Spanish",
2807-2811.
Cutler, Anne / Bruggeman, Laurence:
"Vocabulary structure and spoken-word recognition: evidence from French reveals the source of embedding asymmetry",
2812-2816.
Bagou, Odile / Frauenfelder, Ulrich Hans:
"How do multiple sublexical cues converge in lexical segmentation? an artificial language learning study",
2817-2821.
Bosch, Louis ten / Boves, Lou / Ernestus, Mirjam:
"Towards an end-to-end computational model of speech comprehension: simulating a lexical decision task",
2822-2826.
Speech and Audio Signal Processing
Athanasopoulos, Georgios / Verhelst, Werner:
"A phase-modified approach for TDE-based acoustic localization",
2890-2894.
Xue, Wei / Liang, Shan / Liu, Wenju:
"Interference robust DOA estimation of human speech by exploiting historical information and temporal correlation",
2895-2899.
Harte, Naomi / Murphy, Sadhbh / Kelly, David J. / Marples, Nicola M.:
"Identifying new bird species from differences in birdsong",
2900-2904.
Nishigaki, Yuri / Sakakibara, Ken-Ichi / Morise, Masanori / Nisimura, Ryuichi / Irino, Toshio / Kawahara, Hideki:
"Controlling “shout” expression in a Japanese POP singing performance: analysis and suppression study",
2905-2909.
Mehrabani, Mahnoosh / Hansen, John H. L.:
"Dimensionality analysis of singing speech based on locality preserving projections",
2910-2914.
Molla, Md. Khademul Islam / Hirose, Keikichi:
"Audio classification using dominant spatial patterns in time-frequency space",
2915-2919.
Lin, Tse-En / Hsu, Chung-Chien / Chen, Yi-Cheng / Chen, Jian-Hueng / Chi, Tai-Shih:
"Spectro-temporal modulation based singing detection combined with pitch-based grouping for singing voice separation",
2920-2923.
Ludeña-Choez, Jimmy / Gallardo-Antolín, Ascensión:
"NMF-based temporal feature integration for acoustic event classification",
2924-2928.
Rawat, Shourabh / Schulam, Peter F. / Burger, Susanne / Ding, Duo / Wang, Yipei / Metze, Florian:
"Robust audio-codebooks for large-scale event detection in consumer videos",
2929-2933.
Altaf, M. Umair Bin / Butko, Taras / Juang, Biing-Hwang:
"Person identification using biometric markers from footsteps sound",
2934-2938.
Młynarski, Wiktor:
"Learning binaural spectrogram features for azimuthal speaker localization",
2939-2942.
Oualil, Youssef / Faubel, Friedrich / Klakow, Dietrich:
"An unsupervised Bayesian classifier for multiple speaker detection and localization",
2943-2947.
Chakraborty, Rupayan / Nadeu, Climent:
"Joint recognition and direction-of-arrival estimation of simultaneous meeting-room acoustic events",
2948-2952.
Zhuang, Xiaodan / Wu, Shuang / Natarajan, Pradeep / Prasad, Rohit / Natarajan, Prem:
"Audio self organized units for high-level event detection",
2953-2957.
Linguistic Systems, Phonetics-Phonology Interface
Maddieson, Ian / Flavier, Sébastien / Marsico, Egidio / Coupé, Christophe / Pellegrino, François:
"LAPSyd: Lyon-Albuquerque phonological systems database",
3022-3026.
Barbosa, Plínio A.:
"The duration compensation issue revisited",
3027-3031.
Oh, Yoon Mi / Pellegrino, François / Coupé, Christophe / Marsico, Egidio:
"Cross-language comparison of functional load for vowels, consonants, and tones",
3032-3036.
Maekawa, Kikuo:
"Notes on so-called inter-speaker difference in spontaneous speech: the case of Japanese voiced obstruent",
3037-3041.
Carignan, Christopher / Shosted, Ryan K. / Fu, Maojing / Liang, Zhi-Pei / Sutton, Bradley P.:
"The role of the pharynx and tongue in enhancement of vowel nasalization: a real-time MRI investigation of French nasal vowels",
3042-3046.
Renwick, Margaret E. L. / Baghai-Ravary, Ladan / Temple, Rosalind / Coleman, John S.:
"Assimilation of word-final nasals to following word-initial place of articulation in UK English",
3047-3051.
Speech Synthesis — Voice Conversion
Chen, Ling-Hui / Ling, Zhen-Hua / Song, Yan / Dai, Li-Rong:
"Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion",
3052-3056.
Wu, Zhizheng / Virtanen, Tuomas / Kinnunen, Tomi / Chng, Eng Siong / Li, Haizhou:
"Exemplar-based unit selection for voice conversion utilizing temporal information",
3057-3061.
Hwang, Hsin-Te / Tsao, Yu / Wang, Hsin-Min / Wang, Yih-Ru / Chen, Sin-Horng:
"Alleviating the over-smoothing problem in GMM-based voice conversion with discriminative training",
3062-3066.
Tanaka, Kou / Toda, Tomoki / Neubig, Graham / Sakti, Sakriani / Nakamura, Satoshi:
"A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion",
3067-3071.
Moriguchi, Takuto / Toda, Tomoki / Sano, Motoaki / Sato, Hiroshi / Neubig, Graham / Sakti, Sakriani / Nakamura, Satoshi:
"A digital signal processor implementation of silent/electrolaryngeal speech enhancement based on real-time statistical voice conversion",
3072-3076.
Aryal, Sandesh / Felps, Daniel / Gutierrez-Osuna, Ricardo:
"Foreign accent conversion through voice morphing",
3077-3081.
Large Vocabulary Continuous Speech Recognition Systems
Audhkhasi, Kartik / Zavou, Andreas M. / Georgiou, Panayiotis G. / Narayanan, Shrikanth:
"Empirical link between hypothesis diversity and fusion performance in an ensemble of automatic speech recognition systems",
3082-3086.
Bell, Peter / Yamamoto, Hitoshi / Swietojanski, Pawel / Wu, Youzheng / McInnes, Fergus / Hori, Chiori / Renals, Steve:
"A lecture transcription system combining neural network acoustic and language models",
3087-3091.
Soltau, Hagen / Kuo, Hong-Kwang / Mangu, Lidia / Saon, George / Beran, Tomas:
"Neural network acoustic models for the DARPA RATS program",
3092-3096.
Ueffing, Nicola / Bisani, Maximilian / Vozila, Paul:
"Improved models for automatic punctuation prediction for spoken and written text",
3097-3101.
Roy, Anindya / Lamel, Lori / Fraga-Silva, Thiago / Gauvain, Jean-Luc / Oparin, Ilya:
"Some issues affecting the transcription of Hungarian broadcast audio",
3102-3106.
Golik, Pavel / Tüske, Zoltán / Schlüter, Ralf / Ney, Hermann:
"Development of the RWTH transcription system for Slovenian",
3107-3111.
Robust Speaker Recognition I, II
Kanda, Naoyuki / Takeda, Ryu / Obuchi, Yasunari:
"Noise robust speaker verification with delta cepstrum normalization",
3112-3116.
Vandyke, David / Wagner, Michael / Goecke, Roland:
"R-norm: improving inter-speaker variability modelling at the score level via regression score normalisation",
3117-3121.
Kinnunen, Tomi / Alam, Md. Jahangir / Matějka, Pavel / Kenny, Patrick / Černocký, Jan / O'Shaughnessy, Douglas:
"Frequency warping and robust speaker verification: a comparison of alternative mel-scale representations",
3122-3126.
Hasan, Taufiq / Hansen, John H. L.:
"Acoustic factor analysis based universal background model for robust speaker verification in noise",
3127-3131.
Villalba, Jesús / Lleida, Eduardo / Ortega, Alfonso / Miguel, Antonio:
"A new Bayesian network to assess the reliability of speaker verification decisions",
3132-3136.
Zhu, Weizhong / Yaman, Sibel / Pelecanos, Jason:
"The IBM RATS phase II speaker recognition system: overview and analysis",
3137-3141.
Lee, Kong Aik / Larcher, Anthony / You, Chang Huai / Ma, Bin / Li, Haizhou:
"Multi-session PLDA scoring of i-vector for partially open-set speaker detection",
3651-3655.
Godin, Keith W. / Sadjadi, Seyed Omid / Hansen, John H. L.:
"Impact of noise reduction and spectrum estimation on noise robust speaker identification",
3656-3660.
Yamada, Takanori / Wang, Longbiao / Kai, Atsuhiko:
"Improvement of distant-talking speaker identification using bottleneck features of DNN",
3661-3664.
Brutti, Alessio / Omologo, Maurizio:
"Geometric contamination for GMM/UBM speaker verification in reverberant environments",
3665-3669.
McClanahan, Richard D. / Leon, Phillip L. De:
"Towards a more efficient SVM supervector speaker verification system using Gaussian reduction and a tree-structured hash",
3670-3673.
Kanagasundaram, A. / Dean, D. / Gonzalez-Dominguez, Javier / Sridharan, S. / Ramos, D. / Gonzalez-Rodriguez, Joaquin:
"Improving the PLDA based speaker verification in limited microphone data conditions",
3674-3678.
Villalba, Jesús / Lleida, Eduardo / Ortega, Alfonso / Miguel, Antonio:
"The I3A speaker recognition system for NIST SRE12: post-evaluation analysis",
3679-3683.
Stafylakis, T. / Kenny, Patrick / Ouellet, P. / Perez, J. / Kockmann, M. / Dumouchel, Pierre:
"Text-dependent speaker recognition using PLDA with uncertainty propagation",
3684-3688.
Mallidi, Sri Harish / Ganapathy, Sriram / Hermansky, Hynek:
"Robust speaker recognition using spectro-temporal autoregressive models",
3689-3693.
Rajan, Padmanabhan / Kinnunen, Tomi / Hautamäki, Ville:
"Effect of multicondition training on i-vector PLDA configurations for speaker recognition",
3694-3697.
McLaren, Mitchell / Abrash, Victor / Graciarena, Martin / Lei, Yun / Pešán, Jan:
"Improving robustness to compressed speech in speaker recognition",
3698-3702.
Mitra, Vikramjit / McLaren, Mitchell / Franco, Horacio / Graciarena, Martin / Scheffer, Nicolas:
"Modulation features for noise robust speaker identification",
3703-3707.
Hautamäki, Ville / Cheng, You-Chi / Rajan, Padmanabhan / Lee, Chin-Hui:
"Minimax i-vector extractor for short duration speaker verification",
3708-3712.
Fowler, Mike / McCurry, Mark / Bramsen, Jonathan / Dunsin, Kehinde / Remus, Jeremiah:
"Standoff speaker recognition: effects of recording distance mismatch on speaker recognition system performance",
3713-3716.
Acoustic and Articulatory Cues in Speech Perception
Shaw, Jason A. / Tyler, Michael D. / Kasisopa, Benjawan / Ma, Yuan / Proctor, Michael / Han, Chong / Derrick, Donald / Burnham, Denis:
"Vowel identity conditions the time course of tone recognition",
3142-3146.
Scharenborg, Odette / Janse, Esther:
"Changes in the role of intensity as a cue for fricative categorisation",
3147-3151.
Yasu, Keiichi / Arai, Takayuki / Kobayashi, Kei / Shindo, Mitsuko:
"Weighting of acoustic cues shifts to frication duration in identification of fricatives/affricates when auditory properties are degraded due to aging",
3152-3156.
Gao, Jiayin / Hallé, Pierre:
"Duration as a secondary cue for perception of voicing and tone in Shanghai Chinese",
3157-3161.
Dekerle, Marie / Meunier, Fanny / N'Guyen, Marie-Ange / Gillet-Perret, Estelle / Lassus-Sangosse, Delphine / Donnadieu, Sophie:
"Development of central auditory processes and their links with language skills in typically developing children",
3162-3166.
Varnet, Léo / Knoblauch, Kenneth / Meunier, Fanny / Hoen, Michel:
"Show me what you listen to! auditory classification images can reveal the processing of fine acoustic cues during speech categorization",
3167-3171.
Speech Production — Data and Models
Brackhane, Fabian / Trouvain, Jürgen:
"The organ stop “vox humana” as a model for a vowel synthesiser",
3172-3176.
Ghosh, Prasanta Kumar / Narayanan, Shrikanth:
"Information theoretic acoustic feature selection for acoustic-to-articulatory inversion",
3177-3181.
Fejlová, Dita / Lukeš, David / Skarnitzl, Radek:
"Formant contours in Czech vowels: speaker-discriminating potential",
3182-3186.
Liu, Shen / Wei, Jianguo / Wang, Xin / Lu, Wenhuan / Fang, Qiang / Dang, Jianwu:
"An anisotropic diffusion filter based on multidirectional separability",
3187-3190.
Skarnitzl, Radek / Šturm, Pavel / Machač, Pavel:
"The phonological voicing contrast in Czech: an EPG study of phonated and whispered fricatives",
3191-3195.
Maeda, Shinji / Laprie, Yves:
"Vowel and prosodic factor dependent variations of vocal-tract length",
3196-3200.
Grootswagers, Tijl / Dijkstra, Karen / Bosch, Louis ten / Brandmeyer, Alex / Sadakata, Makiko:
"Word identification using phonetic features: towards a method to support multivariate fMRI speech decoding",
3201-3205.
Gowda, Dhananjaya / Kurimo, Mikko:
"Analysis of breathy, modal and pressed phonation based on low frequency spectral density",
3206-3210.
Tajima, Keiichi / Tanaka, Kuniyoshi / Martin, Andrew / Mazuka, Reiko:
"Is the vowel length contrast in Japanese exaggerated in infant-directed speech?",
3211-3215.
Chen, Gang / Samlan, Robin A. / Kreiman, Jody / Alwan, Abeer:
"Investigating the relationship between glottal area waveform shape and harmonic magnitudes through computational modeling and laryngeal high-speed videoendoscopy",
3216-3220.
Kim, Jonathan C. / Rao, Hrishikesh / Clements, Mark A.:
"Formant frequency tracking using Gaussian mixtures with maximum a posteriori adaptation",
3221-3225.
Yasuda, Rei / Zimmerer, Frank:
"Devoicing of vowels in German, a comparison of Japanese and German speakers",
3226-3229.
Smith, Caitlin / Lammert, Adam:
"Identifying consonantal tasks via measures of tongue shaping: a real-time MRI investigation of the production of vocalized syllabic /l/ in American English",
3230-3233.
Speech Enhancement
Deng, Feng / Bao, Chang-chun / Bao, Feng:
"A speech enhancement method by coupling speech detection and spectral amplitude estimation",
3234-3238.
Zheng, Chenxi / Chan, Wai-Yip:
"Late reverberation suppression using MMSE modulation spectral estimation",
3239-3243.
Turan, M. A. Tuğtekin / Erzin, Engin:
"A new statistical excitation mapping for enhancement of throat microphone recordings",
3244-3248.
Roman, Nicoleta / Mandel, Michael I.:
"Classification based binaural dereverberation",
3249-3253.
Kim, Seon Man / Kim, Hong Kook:
"Target-to-non-target directional ratio estimation based on dual-microphone phase differences for target-directional speech enhancement",
3254-3258.
Lu, Xugang / Matsuda, Shigeki / Hori, Chiori:
"Speech spectrum restoration based on conditional restricted Boltzmann machine",
3259-3263.
Khan, Faheem / Milner, Ben:
"Speaker separation using visual speech features and single-channel audio",
3264-3268.
Chuang, Wei-Lun / Cheong, Kah-Meng / Hsu, Chung-Chien / Chi, Tai-Shih:
"Spectral modulation sensitivity based perceptual acoustic echo cancellation",
3269-3273.
Abrol, Vinayak / Sharma, Pulkit / Sao, Anil Kumar:
"Speech enhancement using compressed sensing",
3274-3278.
Grais, Emad M. / Erdogan, Hakan:
"Spectro-temporal post-enhancement using MMSE estimation in NMF based single-channel source separation",
3279-3283.
Kaewtip, Kantapon / Tan, Lee Ngee / Alwan, Abeer:
"A pitch-based spectral enhancement technique for robust speech processing",
3284-3288.
McCallum, Matthew / Guillemin, Bernard:
"Stochastic-deterministic signal modelling for the tracking of pitch in noise and speech mixtures using factorial HMMs",
3289-3293.
Maymon, Shay / Marcheret, Etienne / Goel, Vaibhava:
"Restoration of clipped signals with application to speech recognition",
3294-3297.
Uezu, Yasufumi / Kinoshita, Keisuke / Souden, Mehrez / Nakatani, Tomohiro:
"On the robustness of distributed EM based BSS in asynchronous distributed microphone array scenarios",
3298-3302.
ASR — Acoustic Modeling
Yang, Jingzhou / Dalen, Rogier C. van / Gales, M. J. F.:
"Infinite support vector machines in speech recognition",
3303-3307.
Giuliani, Diego / Brugnara, Fabio:
"An on-line incremental speaker adaptation technique for audio stream transcription",
3308-3312.
Telaar, Dominic / Fuhs, Mark C.:
"Accent- and speaker-specific polyphone decision trees for non-native speech recognition",
3313-3316.
Wiesler, Simon / Li, Jinyu / Xue, Jian:
"Investigations on Hessian-free optimization for cross-entropy training of deep neural networks",
3317-3321.
Saiko, Masahiro / Matsuda, Shigeki / Hanazawa, Ken / Isotani, Ryosuke / Hori, Chiori:
"Cross-lingual acoustic model adaptation based on transfer vector field smoothing with MAP",
3322-3326.
Fujimura, Hiroshi / Shinohara, Yusuke / Masuko, Takashi:
"N-best rescoring by phoneme classifiers using subclass AdaBoost algorithm",
3327-3331.
Ogawa, Tetsuji / Li, Feipeng / Hermansky, Hynek:
"Stream selection and integration in multistream ASR using GMM-based performance monitoring",
3332-3336.
Yoma, Néstor Becerra / Garretón, Claudio / Huenupán, Fernando / Catalán, Ignacio / Wuth, Jorge:
"VTLN based on the linear interpolation of contiguous mel filter-bank energies",
3337-3341.
Triefenbach, Fabian / Jalalvand, Azarakhsh / Demuynck, Kris / Martens, Jean-Pierre:
"Context-dependent modeling and speaker normalization applied to reservoir-based phone recognition",
3342-3346.
Fraga-Silva, Thiago / Gauvain, Jean-Luc / Lamel, Lori:
"Interpolation of acoustic models for speech recognition",
3347-3351.
Tahir, M. / Huang, H. / Schlüter, Ralf / Ney, Hermann / Bosch, Louis ten / Cranen, Bert / Boves, Lou:
"Training log-linear acoustic models in higher-order polynomial feature space for speech recognition",
3352-3355.
Parinam, Venkata Neelima / Vootkuri, Chandra / Zahorian, Stephen A.:
"Comparison of spectral analysis methods for automatic speech recognition",
3356-3360.
Sanand, D. R. / Svendsen, T.:
"Synthetic speaker models using VTLN to improve the performance of children in mismatched speaker conditions for ASR",
3361-3365.
Abdel-Hamid, Ossama / Deng, Li / Yu, Dong:
"Exploring convolutional neural network structures and optimization techniques for speech recognition",
3366-3370.
Special Event: ESCA/ISCA Anniversary
Mariani, Joseph / Paroubek, Patrick / Francopoulo, Gil / Delaborde, Marine:
"Rediscovering 25 years of discoveries in spoken language processing: a preliminary ISCA archive analysis",
3371-3403.
Fujisaki, Hiroya:
"An inter- and cross-disciplinary perspective of spoken language processing",
4005.
Moore, Roger K.:
"Progress and prospects for speech technology: what ordinary people think",
4006.
Language Modeling for Conversational Speech
Shaik, M. Ali Basha / Mousa, Amr El-Desoky / Schlüter, Ralf / Ney, Hermann:
"Feature-rich sub-lexical language models using a maximum entropy approach for German LVCSR",
3404-3408.
Mousa, Amr El-Desoky / Shaik, M. Ali Basha / Schlüter, Ralf / Ney, Hermann:
"Morpheme level hierarchical Pitman-Yor class-based language models for LVCSR of morphologically rich languages",
3409-3413.
Lambert, Benjamin / Raj, Bhiksha / Singh, Rita:
"Discriminatively trained dependency language modeling for conversational speech recognition",
3414-3418.
Si, Yujing / Zhang, Qingqing / Li, Ta / Pan, Jielin / Yan, Yonghong:
"Prefix tree based n-best list re-scoring for recurrent neural network language model used in speech recognition system",
3419-3423.
Liu, X. / Gales, M. J. F. / Woodland, P. C.:
"Cross-domain paraphrasing for improving language modelling using out-of-domain data",
3424-3428.
Masumura, Ryo / Masataki, Hirokazu / Oba, Takanobu / Yoshioka, Osamu / Takahashi, Satoshi:
"Viterbi decoding for latent words language models using Gibbs sampling",
3429-3433.
Speech Enhancement and Coding
Bäckström, Tom:
"Computationally efficient objective function for algebraic codebook optimization in ACELP",
3434-3438.
Möller, Sebastian / Kelaidi, Emilia / Köster, Friedemann / Côté, Nicolas / Bauer, Patrick / Fingscheidt, Tim / Schlien, Thomas / Pulakka, Hannu / Alku, Paavo:
"Speech quality prediction for artificial bandwidth extension algorithms",
3439-3443.
Xia, Bing-yin / Bao, Chang-chun:
"Speech enhancement with weighted denoising auto-encoder",
3444-3448.
Cernak, Milos / Na, Xingyu / Garner, Philip N.:
"Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture",
3449-3452.
Duy, Nguyen Duc / Suzuki, Masayuki / Minematsu, Nobuaki / Hirose, Keikichi:
"Artificial bandwidth extension based on regularized piecewise linear mapping with discriminative region weighting and long-span features",
3453-3457.
Lee, Bong-Ki / Lim, Chungsoo / Park, Jihwan / Chang, Joon-Hyuk:
"Enhanced muting method in packet loss concealment of ITU-T G.722 employing optimized sigmoid function",
3458-3462.
Articulatory and Acoustic Cues of Speech Prosody
Nguyen, Thi-Lan / Michaud, Alexis / Tran, Do Dat / Mac, Dang-Khoa:
"The interplay of intonation and complex lexical tones: how speaker attitudes affect the realization of glottalization on Vietnamese sentence-final particles",
3522-3526.
Ní Chasaide, Ailbhe / Yanushevskaya, Irena / Kane, John / Gobl, Christer:
"The voice prominence hypothesis: the interplay of F0 and voice source features in accentuation",
3527-3531.
Lee, Albert / Xu, Yi / Prom-on, Santitham:
"Mora-based pre-low raising in Japanese pitch accent",
3532-3536.
Lœvenbruck, Hélène / Jannet, Mohamed Ameur Ben / D'Imperio, Mariapaola / Spini, Mathilde / Champagne-Lavau, Maud:
"Prosodic cues of sarcastic speech in French: slower, higher, wider",
3537-3541.
Ménard, Lucie / Leclerc, Annie / Tiede, Mark K. / Prémont, Amélie / Turgeon, Christine / Trudeau-Fisette, Paméla / Côté, Dominique:
"Correlates of contrastive focus in congenitally blind adults and sighted adults",
3542-3546.
Georgeton, Laurianne / Audibert, Nicolas:
"Is protrusion of French rounded vowels affected by prosodic positions?",
3547-3551.
Intelligibility-Enhancing Speech Modifications (Special Session)
Cooke, Martin / Mayo, Catherine / Valentini-Botinhao, Cassia:
"Intelligibility-enhancing speech modifications: the Hurricane Challenge",
3552-3556.
Erro, D. / Zorilă, T. C. / Stylianou, Yannis / Navas, E. / Hernaez, I.:
"Statistical synthesizer with embedded prosodic and spectral modifications to generate highly intelligible speech in noise",
3557-3561.
Suni, Antti / Karhila, Reima / Raitio, Tuomo / Kurimo, Mikko / Vainio, Martti / Alku, Paavo:
"Lombard modified text-to-speech synthesis for improved intelligibility: submission for the Hurricane Challenge 2013",
3562-3566.
Valentini-Botinhao, Cassia / Yamagishi, Junichi / King, Simon / Stylianou, Yannis:
"Combining perceptually-motivated spectral shaping with loudness and duration modification for intelligibility enhancement of HMM-based synthetic speech in noise",
3567-3571.
Godoy, Elizabeth / Stylianou, Yannis:
"Increasing speech intelligibility via spectral shaping with frequency warping and dynamic range compression plus transient enhancement",
3572-3576.
Schepker, Henning / Rennies, Jan / Doclo, Simon:
"Improving speech intelligibility in noise by SII-dependent preprocessing using frequency-dependent amplification and dynamic range compression",
3577-3581.
Taal, Cees H. / Jensen, Jesper:
"SII-based speech preprocessing for intelligibility improvement in noise",
3582-3586.
Zhang, Mengqiu / Petkov, Petko N. / Kleijn, W. Bastiaan:
"Rephrasing-based speech intelligibility enhancement",
3587-3591.
Aubanel, Vincent / Cooke, Martin:
"Information-preserving temporal reallocation of speech in the presence of fluctuating maskers",
3592-3596.
Petkov, Petko N. / Kleijn, W. Bastiaan:
"Preservation of speech spectral dynamics enhances intelligibility",
3597-3601.
Brouckxon, Henk / Verhelst, Werner:
"An overview of the VUB entry for the 2013 Hurricane Challenge",
3602-3604.
Takou, Reiko / Seiyama, Nobumasa / Imai, Atsushi:
"Improvement of speech intelligibility by reallocation of spectral energy",
3605-3607.