IberSPEECH 2018

21-23 November 2018, Barcelona, Spain

Chairs: Jordi Luque, Antonio Bonafonte, Francesc Alías Pujol and António Teixeira

DOI: 10.21437/IberSPEECH.2018

Speaker Recognition


Differentiable Supervector Extraction for Encoding Speaker and Phrase Information in Text Dependent Speaker Verification
Victoria Mingote, Antonio Miguel, Alfonso Ortega, Eduardo Lleida

Phonetic Variability Influence on Short Utterances in Speaker Verification
Ignacio Viñals, Alfonso Ortega, Antonio Miguel, Eduardo Lleida

Restricted Boltzmann Machine Vectors for Speaker Clustering
Umair Khan, Pooyan Safari, Javier Hernando

Speaker Recognition under Stress Conditions
Esther Rituerto-González, Ascensión Gallardo-Antolín, Carmen Peláez-Moreno


Keynote 1


Bio signal-based Spoken Communication
Tanja Schultz


Topics on Speech Technologies


Bilingual Prosodic Dataset Compilation for Spoken Language Translation
Alp Öktem, Mireia Farrús, Antonio Bonafonte

Building an Open Source Automatic Speech Recognition System for Catalan
Baybars Külebi, Alp Öktem

Multi-Speaker Neural Vocoder
Oriol Barbany, Antonio Bonafonte, Santiago Pascual

Improving the Automatic Speech Recognition through the improvement of Language Models
Andrés Piñeiro-Martín, Carmen García-Mateo, Laura Docío-Fernández

Towards expressive prosody generation in TTS for reading aloud applications
Monica Dominguez, Alicia Burga, Mireia Farrús, Leo Wanner

Performance evaluation of front- and back-end techniques for ASV spoofing detection systems based on deep features
Alejandro Gomez-Alanis, Antonio M. Peinado, José Andrés González López, Angel M. Gomez

The observation likelihood of silence: analysis and prospects for VAD applications
Igor Odriozola, Inma Hernaez, Eva Navas, Luis Serrano, Jon Sanchez

On the use of Phone-based Embeddings for Language Recognition
Christian Salamea, Ricardo de Córdoba, Luis Fernando D'Haro, Rubén San-Segundo, Javier Ferreiros

End-to-End Speech Translation with the Transformer
Laura Cross Vila, Carlos Escolano, José A. R. Fonollosa, Marta R. Costa-Jussà

Audio event detection on Google's Audio Set database: Preliminary results using different types of DNNs
Javier Darna-Sequeiros, Doroteo T. Toledano

Emotion Detection from Speech and Text
Mikel de Velasco, Raquel Justo, Josu Antón, Mikel Carrilero, M. Inés Torres

Experimental Framework Design for Sign Language Automatic Recognition
Darío Tilves Santiago, Ian Benderitter, Carmen García-Mateo

Baseline Acoustic Models for Brazilian Portuguese Using Kaldi Tools
Cassio Batista, Ana Larissa Dias, Nelson Sampaio Neto


ASR & Speech Applications


Converted Mel-Cepstral Coefficients for Gender Variability Reduction in Query-by-Example Spoken Document Retrieval
Paula López Otero, Laura Docío-Fernández

A Recurrent Neural Network Approach to Audio Segmentation for Broadcast Domain Data
Pablo Gimeno, Ignacio Viñals, Alfonso Ortega, Antonio Miguel, Eduardo Lleida

Improving Transcription of Manuscripts with Multimodality and Interaction
Emilio Granell, Carlos David Martinez Hinarejos, Verónica Romero

Improving Pronunciation of Spanish as a Foreign Language for L1 Japanese Speakers with Japañol CAPT Tool
Cristian Tejedor-García, Valentín Cardeñoso-Payo, María J. Machuca, David Escudero-Mancebo, Antonio Ríos, Takuya Kimura

Exploring E2E speech recognition systems for new languages
Conrad Bernath, Aitor Alvarez, Haritz Arzelus, Carlos David Martínez


Speech & Language Technologies Applied to Health


Listening to Laryngectomees: A study of Intelligibility and Self-reported Listening Effort of Spanish Oesophageal Speech
Sneha Raman, Inma Hernaez, Eva Navas, Luis Serrano

Towards an automatic evaluation of the prosody of people with Down syndrome
Mario Corrales-Astorgano, Pastora Martínez-Castilla, David Escudero-Mancebo, Lourdes Aguilar, César González-Ferreras, Valentín Cardeñoso-Payo

Whispered-to-voiced Alaryngeal Speech Conversion with Generative Adversarial Networks
Santiago Pascual, Antonio Bonafonte, Joan Serrà, José Andrés González López

LSTM based voice conversion for laryngectomees
Luis Serrano, David Tavarez, Xabier Sarasola, Sneha Raman, Ibon Saratxaga, Eva Navas, Inma Hernaez

Sign Language Gesture Classification using Neural Networks
Zuzanna Parcheta, Carlos David Martinez Hinarejos


Synthesis, Production & Analysis


Influence of tense, modal and lax phonation on the three-dimensional finite element synthesis of vowel [A]
Marc Freixes, Marc Arnela, Joan Claudi Socoró, Francesc Alías Pujol, Oriol Guasch

Exploring Advances in Real-time MRI for Speech Production Studies of European Portuguese
Conceicao Cunha, Samuel Silva, António Teixeira, Catarina Oliveira, Paula Martins, Arun Joseph, Jens Frahm

A postfiltering approach for dual-microphone smartphones
Juan M. Martín-Doñas, Iván López-Espejo, Angel M. Gomez, Antonio M. Peinado

Speech and monophonic singing segmentation using pitch parameters
Xabier Sarasola, Eva Navas, David Tavarez, Luis Serrano, Ibon Saratxaga

Self-Attention Linguistic-Acoustic Decoder
Santiago Pascual, Antonio Bonafonte, Joan Serrà



Special Session: Show & Tell


Japañol: a mobile application to help improving Spanish pronunciation by Japanese native speakers
Cristian Tejedor-García, Valentín Cardeñoso-Payo, David Escudero-Mancebo


Special Session: Ongoing Research Projects


Towards the Application of Global Quality-of-Service Metrics in Biometric Systems
Juan Manuel Espín, Roberto Font, Juan Francisco Inglés-Romero, Cristina Vicente-Chicote

Incorporation of a Module for Automatic Prediction of Oral Productions Quality in a Learning Video Game
David Escudero-Mancebo, Valentín Cardeñoso-Payo

Silent Speech: Restoring the Power of Speech to People whose Larynx has been Removed
José Andrés González López, Phil D. Green, Damian Murphy, Amelia Gully, James M. Gilbert

RESTORE Project: REpair, STOrage and REhabilitation of speech
Inma Hernaez, Eva Navas, Jose Antonio Municio Martín, Javier Gomez Suárez

Corpus for Cyberbullying Prevention
Asuncion Moreno, Antonio Bonafonte, Igor Jauk, Laia Tarrés, Victor Pereira

EMPATHIC, Expressive, Advanced Virtual Coach to Improve Independent Healthy-Life-Years of the Elderly
M. Inés Torres, Gérard Chollet, César Montenegro, Jofre Tenorio-Laranga, Olga Gordeeveva, Anna Esposito, Cornelius Glackin, Stephan Schlögl, Olivier Deroo, Begoña Fernández-Ruanova, Riberto Santana, Maria S. Kornes, Fred Lindner, Daria Kyslitska, Miriam Reiner, Gennaro Cordasco, Mari Aksnes, Raquel Justo



Albayzin Challenges: Multimodal Diarization


ODESSA/PLUMCOT at Albayzin Multimodal Diarization Challenge 2018
Benjamin Maurice, Hervé Bredin, Ruiqing Yin, Jose Patino, Héctor Delgado, Claude Barras, Nicholas Evans, Camille Guinaudeau

UPC Multimodal Speaker Diarization System for the 2018 Albayzin Challenge
Miquel Angel India Massana, Itziar Sagastiberri, Ponç Palau, Elisa Sayrol, Josep Ramon Morros, Javier Hernando

The GTM-UVIGO System for Audiovisual Diarization
Eduardo Ramos-Muguerza, Laura Docío-Fernández, José Luis Alba-Castro


Albayzin Challenges: Speaker Diarization


The SRI International STAR-LAB System Description for IberSPEECH-RTVE 2018 Speaker Diarization Challenge
Diego Castan, Mitchell McLaren, Mahesh Kumar Nandwana

ODESSA at Albayzin Speaker Diarization Challenge 2018
Jose Patino, Héctor Delgado, Ruiqing Yin, Hervé Bredin, Claude Barras, Nicholas Evans

EML Submission to Albayzin 2018 Speaker Diarization Challenge
Omid Ghahabi, Volker Fischer

In-domain Adaptation Solutions for the RTVE 2018 Diarization Challenge
Ignacio Viñals, Pablo Gimeno, Alfonso Ortega, Antonio Miguel, Eduardo Lleida

DNN-based Embeddings for Speaker Diarization in the AuDIaS-UAM System for the Albayzin 2018 IberSPEECH-RTVE Evaluation
Alicia Lozano-Diez, Beltran Labrador, Diego de Benito, Pablo Ramirez, Doroteo T. Toledano

CENATAV Voice-Group Systems for Albayzin 2018 Speaker Diarization Evaluation Campaign
Edward L. Campbell, Gabriel Hernandez, José R. Calvo de Lara

The Intelligent Voice System for the IberSPEECH-RTVE 2018 Speaker Diarization Challenge
Abbas Khosravani, Cornelius Glackin, Nazim Dugan, Gérard Chollet, Nigel Cannings

JHU Diarization System Description
Zili Huang, L. Paola García-Perera, Jesús Villalba, Daniel Povey, Najim Dehak



Albayzin Challenges: Speech to Text


MLLP-UPV and RWTH Aachen Spanish ASR Systems for the IberSpeech-RTVE 2018 Speech-to-Text Transcription Challenge
Javier Jorge, Adrià Martínez-Villaronga, Pavel Golik, Adrià Giménez, Joan Albert Silvestre-Cerdà, Patrick Doetsch, Vicent Andreu Císcar, Hermann Ney, Alfons Juan, Albert Sanchis

Exploring Open-Source Deep Learning ASR for Speech-to-Text TV program transcription
Juan M. Perero-Codosero, Javier Antón-Martín, Daniel Tapias Merino, Eduardo López-Gonzalo, Luis A. Hernández-Gómez

The Vicomtech-PRHLT Speech Transcription Systems for the IberSPEECH-RTVE 2018 Speech to Text Transcription Challenge
Haritz Arzelus, Aitor Alvarez, Conrad Bernath, Eneritz García, Emilio Granell, Carlos David Martinez Hinarejos

Intelligent Voice ASR system for Iberspeech 2018 Speech to Text Transcription Challenge
Nazim Dugan, Cornelius Glackin, Gérard Chollet, Nigel Cannings

The GTM-UVIGO System for Albayzin 2018 Speech-to-Text Evaluation
Laura Docío-Fernández, Carmen García-Mateo


Text & NLP Applications


Topic coherence analysis for the classification of Alzheimer's disease
Anna Pompili, Alberto Abad, David Martins de Matos, Isabel Pavão Martins

Building a global dictionary for semantic technologies
Iklódi Eszter, Gábor Recski, Gábor Borbély, Maria Jose Castro-Bleda

TransDic, a public domain tool for the generation of phonetic dictionaries in standard and dialectal Spanish and Catalan
Juan-María Garrido, Marta Codina, Kimber Fodge

Wide Residual Networks 1D for Automatic Text Punctuation
Jorge Llombart, Antonio Miguel, Alfonso Ortega, Eduardo Lleida

End-to-End Multi-Level Dialog Act Recognition
Eugénio Ribeiro, Ricardo Ribeiro, David Martins de Matos