Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Distributed ASR Using Speech Coder Data for Efficient Feature Vector Representation

Trond Skogstad, Torbjørn Svendsen

Norwegian University of Science & Technology, Norway

This paper proposes an alternative approach to distributed speech recognition for scenarios that require both reliable feature vectors and reconstruction of the speech signal. By transmitting the difference between the information carried by the speech coder and the desired feature vectors, the system achieves both excellent speech reconstruction quality and ASR recognition performance. Experiments show that a transparent recognition rate is achieved with as little as 0.6 kbps of additional information supplementing the AMR speech coder operating at 4.75 kbps, for a total of 5.35 kbps. This total rate is comparable to that of the ETSI ES 202 211 extended front-end standard.
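
The idea of sending only the difference between coder-derived information and the desired feature vectors can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 13-dimensional toy feature vector, the uniform scalar quantizer, the step size, and the function names `encode_frame` and `decode_frame` are all hypothetical. The only figures from the abstract are the two bit rates at the end.

```python
import random

def quantize(residual, step):
    """Uniform scalar quantizer (an assumed stand-in): value -> integer index."""
    return [round(r / step) for r in residual]

def dequantize(indices, step):
    """Map integer indices back to reconstructed residual values."""
    return [i * step for i in indices]

def encode_frame(target, coder_estimate, step):
    """Terminal side: code only the difference between the desired feature
    vector and the estimate obtainable from the speech coder parameters."""
    residual = [t - c for t, c in zip(target, coder_estimate)]
    return quantize(residual, step)

def decode_frame(indices, coder_estimate, step):
    """Server side: rebuild the feature vector from the same coder-based
    estimate plus the transmitted residual."""
    return [c + r for c, r in zip(coder_estimate, dequantize(indices, step))]

# Toy data (hypothetical): a 13-dimensional feature vector and a noisy
# estimate of it standing in for features derived from coder parameters.
random.seed(0)
target = [random.gauss(0.0, 1.0) for _ in range(13)]
coder_estimate = [t + random.gauss(0.0, 0.2) for t in target]

step = 0.1  # assumed quantizer step size
indices = encode_frame(target, coder_estimate, step)
recovered = decode_frame(indices, coder_estimate, step)
max_err = max(abs(t - r) for t, r in zip(target, recovered))

# Bit-rate bookkeeping using the figures quoted in the abstract.
AMR_KBPS = 4.75
SIDE_INFO_KBPS = 0.6
total_kbps = AMR_KBPS + SIDE_INFO_KBPS

print(f"max reconstruction error: {max_err:.4f}")
print(f"total rate: {total_kbps:.2f} kbps")
```

Because the coder-based estimate is available at both ends, only the small residual has to be coded, which is why a low supplementary rate can be enough. In this sketch the reconstruction error is bounded by half the quantizer step; the paper's actual quantization scheme may differ.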


Bibliographic reference. Skogstad, Trond / Svendsen, Torbjørn (2005): "Distributed ASR using speech coder data for efficient feature vector representation", in INTERSPEECH-2005, 2861-2864.