COST278 and ISCA Tutorial and Research Workshop (ITRW) on Robustness Issues in Conversational Interaction

University of East Anglia, Norwich, UK
August 30-31, 2004

Statistical-Based Reconstruction Methods for Speech Recognition in IP Networks

Angel M. Gómez (1), Antonio M. Peinado (1), Victoria Sánchez (1), Ben P. Milner (2), Antonio J. Rubio (1)

(1) Dpt. Electrónica y Tecnología de Computadores, University of Granada, Spain
(2) School of Computing Sciences, University of East Anglia, UK

This work evaluates statistical reconstruction techniques for speech feature vectors transmitted over a network with bursty packet loss in a distributed speech recognition (DSR) architecture. Two approaches that exploit prior information about the speech are outlined. The first models the sequence of quantized vectors through transition probabilities and estimates lost vectors from this data-source information. The second uses prior knowledge of the means and covariances of the feature vector stream to make a maximum a posteriori (MAP) estimate of lost vectors. Both methods give better results than the AURORA nearest repetition technique, especially when losses occur in bursts. However, they require either a notable amount of memory or a high time complexity. A novel solution based on these methods is therefore proposed and evaluated.
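As a rough illustration of the contrast drawn above, the following sketch fills a burst of lost frames in two ways: by repeating the nearest received frame, and by a Gaussian MAP estimate (which, for a Gaussian, coincides with the conditional mean). It is not the authors' implementation. It assumes a scalar feature stream, a known stationary mean, and a lag-k correlation of rho**k with 0 <= |rho| < 1; the names `repeat_fill`, `map_fill`, `mean` and `rho` are hypothetical, and every burst is assumed to be bounded by received frames on both sides.

```python
def repeat_fill(frames):
    """Baseline: copy the nearest received frame into each lost slot (None).

    Ties between the frame before and the frame after go to the earlier one.
    """
    received = [i for i, f in enumerate(frames) if f is not None]
    out = list(frames)
    for t, f in enumerate(frames):
        if f is None:
            nearest = min(received, key=lambda i: (abs(i - t), i))
            out[t] = frames[nearest]
    return out


def map_fill(frames, mean, rho):
    """Gaussian MAP fill of each lost slot from its two bounding received frames.

    Assumption (not from the paper): stationary scalar Gaussian stream whose
    lag-k correlation is rho**k, with |rho| < 1. The variance cancels out.
    """
    out = list(frames)
    for t, f in enumerate(frames):
        if f is not None:
            continue
        # Closest received frames before (a) and after (b) the lost slot.
        a = max(i for i in range(t) if frames[i] is not None)
        b = min(i for i in range(t + 1, len(frames)) if frames[i] is not None)
        ca, cb, cab = rho ** (t - a), rho ** (b - t), rho ** (b - a)
        # Weights = inverse of [[1, cab], [cab, 1]] applied to [ca, cb].
        det = 1.0 - cab * cab
        wa = (ca - cab * cb) / det
        wb = (cb - cab * ca) / det
        out[t] = mean + wa * (frames[a] - mean) + wb * (frames[b] - mean)
    return out


burst = [1.0, None, None, None, 5.0]
repeated = repeat_fill(burst)
estimated = map_fill(burst, mean=0.0, rho=0.5)
```

In this toy setting the repetition baseline holds the boundary values across the whole burst, whereas the MAP estimate decays toward the prior mean as the lost frame moves away from its received neighbours. This gives one intuition for why prior statistics can help most in long bursts.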

Bibliographic reference. Gómez, Angel M. / Peinado, Antonio M. / Sánchez, Victoria / Milner, Ben P. / Rubio, Antonio J. (2004): "Statistical-based reconstruction methods for speech recognition in IP networks", in Robust2004, paper 32.