Fourth International Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU-2014)

St. Petersburg, Russia
May 14-16, 2014



Bibliographic Reference

[SLTU-2014] Fourth International Workshop on Spoken Language Technologies for Under-Resourced Languages, St. Petersburg, Russia, May 14-16, 2014, ed. by Speech and Multimodal Interfaces Laboratory of SPIIRAS, ISCA Archive, http://www.isca-speech.org/archive/sltu_2014


Introduction to the Workshop



Author Index and Quick Access to Abstracts

Abbas   Adda   Adda-Decker   Adel   Ahmed   Alumäe   Anguera   Badenhorst   Baldewijns   Barnard (194)   Barnard (238)   Belguith   Besacier   Bougares   Braga   Burileanu (124)   Burileanu (215)   Buzo (24)   Buzo (124)   Buzo (215)   Cabral   Castelli   Chesi   Chistikov   Cho   van Compernolle   Cucu (124)   Cucu (215)   d’Alessandro   Davel   Dias   Dmitriev   Do, Cong-Thanh   Do, Thi-Ngoc-Diep   Dupoux   Estève   Evgrafova   Fegyó   Ferreira   GALES   Gauvain (112)   Gauvain (176)   Goudi   Grézl   Harrat   Hartmann   van Heerden   Jokinen   Karafiát   Karpov   Khmekhem   Kipyatkova   Kirchhoff   KNILL   Lamel (53)   Lamel (112)   Lamel (146)   Lamel (161)   Lamel (176)   Laurent   Leidig   Ludusan   Lyudovyk   Mabokela   Manaileng   Manamela   Markov   Masmoudi   Meftouh   Metze   Michaud   Mihajlik   Molapo   NAKAMURA (13)   Nakamura (46)   Nguyen   Nocera   Palmer   Penagarikano   Petrică   Pham   Pylypenko   Quaschningk   RAGNI   RATH   Rilliard   Rodriguez-Fuentes   Rossato   Sahraeian   Sakti   Samson Juan   Schlippe (32)   Schlippe (73)   Schlippe (139)   Schlippe (207)   Schultz (32)   Schultz (73)   Schultz (139)   Schultz (207)   Shchemelinin   Simonchik   Skrelin   Smaili   Stahlberg   Szöke   Talanov   Tarján   Telaar   Tran   Ullakonoja   Vakil   Vasilescu   Vazhenina   Verkhodanova   Vieru   Vogel   Volskaya   Vu   de Wet (61)   de Wet (194)   de Wet (238)   Wilcock  

Names written in boldface refer to first authors, in CAPITAL letters to keynote and invited papers.



Table of Contents and Access to Abstracts

Keynotes

Nakamura, Satoshi: "Towards real-time multilingual multimodal speech-to-speech translation", 13-15.

Gales, Mark J. F. / Knill, Kate M. / Ragni, Anton / Rath, Shakti P.: "Speech recognition and keyword spotting for low-resource languages: Babel project research at CUED", 16-23.

Contributed Papers

Anguera, Xavier / Rodriguez-Fuentes, Luis J. / Szöke, Igor / Buzo, Andi / Metze, Florian / Penagarikano, Mikel: "Query-by-example spoken term detection evaluation on low-resource languages", 24-31.

Adel, Heike / Kirchhoff, Katrin / Telaar, Dominic / Vu, Ngoc Thang / Schlippe, Tim / Schultz, Tanja: "Features for factored language models for code-switching speech", 32-38.

Grézl, Frantisek / Karafiát, Martin: "Adapting multilingual neural network hierarchy to a new language", 39-45.

Sakti, Sakriani / Nakamura, Satoshi: "Recent progress in developing grapheme-based speech recognition for Indonesian ethnic languages: Javanese, Sundanese, Balinese and Bataks", 46-52.

Adda-Decker, Martine / Lamel, Lori / Adda, Gilles: "Speech alignment and recognition experiments for Luxembourgish", 53-60.

Sahraeian, Reza / Compernolle, Dirk Van / Wet, Febe de: "On using intrinsic spectral analysis for low-resource languages", 61-65.

Samson Juan, Sarah / Besacier, Laurent / Rossato, Solange: "Semi-supervised G2P bootstrapping and its application to ASR for a very under-resourced language: Iban", 66-72.

Stahlberg, Felix / Schlippe, Tim / Vogel, Stephan / Schultz, Tanja: "Towards automatic speech recognition without pronunciation dictionary, transcribed speech and text resources in the target language using cross-lingual word-to-phoneme alignment", 73-80.

Kipyatkova, Irina / Verkhodanova, Vasilisa / Karpov, Alexey: "Rescoring n-best lists for Russian speech recognition using factored language models", 81-86.

Ferreira, José Pedro / Chesi, Cristiano / Cho, Hyongsil / Baldewijns, Daan / Braga, Daniela / Dias, Miguel: "On Mirandese language resources for text-to-speech", 87-91.

Ahmed, Zeeshan / Cabral, Joao P.: "HMM-based speech synthesiser for the Urdu language", 92-97.

Nguyen, Thi Thu Trang / Tran, Do Dat / Rilliard, Albert / d’Alessandro, Christophe / Pham, Thi Ngoc Yen: "Intonation issues in HMM-based speech synthesis for Vietnamese", 98-104.

Chistikov, Pavel / Talanov, Andrey: "High quality speech synthesis using a small speech dataset", 105-111.

Hartmann, William / Lamel, Lori / Gauvain, Jean-Luc: "Cross-word sub-word units for low-resource keyword spotting", 112-117.

Alumäe, Tanel: "Recent improvements in Estonian LVCSR", 118-123.

Cucu, Horia / Buzo, Andi / Burileanu, Corneliu: "Unsupervised acoustic model training using multiple seed ASR systems", 124-130.

Tarján, Balázs / Fegyó, Tibor / Mihajlik, Péter: "A bilingual study on the prediction of morph-based improvement", 131-138.

Schlippe, Tim / Quaschningk, Wolf / Schultz, Tanja: "Combining grapheme-to-phoneme converter outputs for enhanced pronunciation generation in low-resource scenarios", 139-145.

Laurent, Antoine / Lamel, Lori: "Development of a Korean speech recognition system with little annotated data", 146-152.

Do, Thi-Ngoc-Diep / Michaud, Alexis / Castelli, Eric: "Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a ‘light’ acoustic model of the target language and testing ‘heavyweight’ models from five national languages", 153.

Vasilescu, Ioana / Vieru, Bianca / Lamel, Lori: "Exploring pronunciation variants for Romanian speech-to-text transcription", 161-168.

Vakil, Anjana / Palmer, Alexis: "Cross-language mapping for small-vocabulary ASR in under-resourced languages: investigating the impact of source language choice", 169-175.

Do, Cong-Thanh / Lamel, Lori / Gauvain, Jean-Luc: "Speech-to-text development for Slovak, a low-resourced language", 176-182.

Vazhenina, Daria / Markov, Konstantin: "Sequence memoizer based language model for Russian speech recognition", 183-187.

Lyudovyk, Tetyana / Pylypenko, Valeriy: "Code-switching speech recognition for closely related languages", 188-193.

Barnard, Etienne / Davel, Marelie H. / Heerden, Charl van / Wet, Febe de / Badenhorst, Jaco: "The NCHLT speech corpus of the South African languages", 194-200.

Jokinen, Kristiina / Wilcock, Graham: "Community-based resource building and data collection", 201-206.

Leidig, Sebastian / Schlippe, Tim / Schultz, Tanja: "Automatic detection of anglicisms for the pronunciation dictionary generation: a case study on our German IT corpus", 207-214.

Petrică, Lucian / Cucu, Horia / Buzo, Andi / Burileanu, Corneliu: "A robust diacritics restoration system using unreliable raw text data", 215-220.

Simonchik, Konstantin / Shchemelinin, Vadim: "“STC spoofing” database for text-dependent speaker recognition evaluation", 221-224.

Mabokela, Koena Ronny / Manamela, Madimetja Jonas / Manaileng, Mabu: "Modeling code-switching speech on under-resourced languages for language identification", 225-230.

Ludusan, Bogdan / Dupoux, Emmanuel: "Towards low-resource prosodic boundary detection", 231-237.

Molapo, Raymond / Barnard, Etienne / Wet, Febe de: "Speech data collection in an under-resourced language within a multilingual context", 238-242.

Skrelin, Pavel / Volskaya, Nina / Evgrafova, Karina / Ullakonoja, Riikka: "The development of new corpora for under-resourced languages using data available for well-resourced ones", 243-246.

Dmitriev, Dmitri: "Web lexicography for and by non-tech people", 247-251.

Masmoudi, Abir / Estève, Yannick / Khmekhem, Mariem Ellouze / Bougares, Fethi / Belguith, Lamia Hadrich: "Phonetic tool for the Tunisian Arabic", 253-256.

Harrat, S. / Meftouh, Karima / Abbas, M. / Smaili, K.: "Grapheme to phoneme conversion: an Arabic dialect case", 257-262.

Goudi, Maria / Nocera, Pascal: "Sounds and symbols: an overview of different types of methods dealing with letters-to-sounds relationships in a wide range of languages in automatic speech recognition", 263-267.