EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Missing Feature Theory Applied to Robust Speech Recognition Over IP Network

Toshiki Endo (1), Shingo Kuroiwa (2), Satoshi Nakamura (1)

(1) ATR-SLT, Japan
(2) University of Tokushima, Japan

This paper addresses the problems of performing speech recognition over mobile and IP networks. The main problem is the loss of speech data caused by packet loss in the network. We present two missing-feature-based approaches that recover lost regions of speech data: one is based on reconstruction of the missing frames, and the other on marginal distributions. For comparison, we also use a tacking method, which recognizes only the received data. We evaluate these approaches under two packet loss models: a random loss model and a Gilbert loss model. The results show that the marginal-distributions-based approach is the most effective under packet loss; the degradation in word accuracy is only 5% when the packet loss rate is 30%, and only 3% when the mean burst loss length is 24 frames.
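To make the two loss models concrete, the following is a minimal sketch of how random loss and Gilbert (two-state Markov) loss patterns might be simulated. It is not the authors' evaluation code. The function names, the choice of one loss decision per frame, and the parameterization of the Gilbert model by a target loss rate and mean burst length are all assumptions made for illustration.

```python
import random


def random_loss(n_frames, loss_rate, rng):
    """Independent loss: each frame is lost with probability loss_rate."""
    return [rng.random() < loss_rate for _ in range(n_frames)]


def gilbert_params(loss_rate, mean_burst_len):
    """Derive Gilbert transition probabilities (hypothetical helper).

    p = P(good -> bad), q = P(bad -> good).
    Mean burst length is 1/q; stationary loss rate is p / (p + q).
    """
    q = 1.0 / mean_burst_len
    p = loss_rate * q / (1.0 - loss_rate)
    return p, q


def gilbert_loss(n_frames, p, q, rng):
    """Bursty loss: a frame is lost whenever the chain is in the bad state."""
    lost = []
    bad = False  # assumption: start in the good (no-loss) state
    for _ in range(n_frames):
        if bad:
            bad = rng.random() >= q  # stay in the bad state with prob. 1 - q
        else:
            bad = rng.random() < p   # enter the bad state with prob. p
        lost.append(bad)
    return lost


def mean_burst_length(lost):
    """Average length of consecutive runs of lost frames."""
    bursts, run = [], 0
    for is_lost in lost:
        if is_lost:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return sum(bursts) / len(bursts) if bursts else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    p, q = gilbert_params(loss_rate=0.3, mean_burst_len=24)
    pattern = gilbert_loss(200_000, p, q, rng)
    print("empirical loss rate:", sum(pattern) / len(pattern))
    print("empirical mean burst length:", mean_burst_length(pattern))
```

At the same overall loss rate, the Gilbert pattern concentrates losses into long runs, whereas the random pattern scatters them, which is why the two conditions are reported separately above.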

Bibliographic reference.  Endo, Toshiki / Kuroiwa, Shingo / Nakamura, Satoshi (2003): "Missing feature theory applied to robust speech recognition over IP network", In EUROSPEECH-2003, 3081-3084.