Recognizing speech transmitted over mobile or computer networks poses new challenges, such as packet loss in transmission. The Viterbi algorithm, the most common speech recognition approach, searches for the most likely state sequence that explains all of the observations. However, because it implicitly sums the log observation probabilities over every frame, the resulting solution is sensitive to outlier frames. In this paper, we propose a robust approach that searches for the state sequence that best explains x percent of the observations and is insensitive to the corruption of a limited number of frames. We evaluated the proposed algorithm on the TI-digits task. With 10% data loss, the proposed algorithm achieves an improvement of 71.6% for isolated digit recognition and 32.2% for connected digit recognition.
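The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It is one plausible reading: a log-domain Viterbi search extended so that up to `max_drop` frames may be ignored. The function name `robust_viterbi`, the parameter `max_drop`, the choice to score an ignored frame as 0, and the toy two-state model are all assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")

def robust_viterbi(log_init, log_trans, log_obs, max_drop=0):
    """Best state path when up to `max_drop` frames may be ignored.

    log_init[s]     : log prior of starting in state s
    log_trans[r][s] : log probability of moving from state r to state s
    log_obs[t][s]   : log likelihood of frame t under state s
    An ignored frame adds 0 to the score (an assumption of this sketch).
    Returns (score, path, dropped_frames). With max_drop=0 this is plain Viterbi.
    """
    n_states, n_frames = len(log_init), len(log_obs)
    # delta[k][s]: best score of a path ending in state s with k frames ignored
    delta = [[NEG_INF] * n_states for _ in range(max_drop + 1)]
    for s in range(n_states):
        delta[0][s] = log_init[s] + log_obs[0][s]
        if max_drop >= 1:
            delta[1][s] = log_init[s]  # frame 0 ignored
    back = [None]  # back[t][k][s] = (previous k, previous state)
    for t in range(1, n_frames):
        new = [[NEG_INF] * n_states for _ in range(max_drop + 1)]
        ptr = [[None] * n_states for _ in range(max_drop + 1)]
        for k in range(max_drop + 1):
            for s in range(n_states):
                # Option 1: keep frame t. Option 2: ignore it (uses one drop).
                for prev_k, frame_score in ((k, log_obs[t][s]), (k - 1, 0.0)):
                    if prev_k < 0:
                        continue
                    for r in range(n_states):
                        cand = delta[prev_k][r] + log_trans[r][s] + frame_score
                        if cand > new[k][s]:
                            new[k][s] = cand
                            ptr[k][s] = (prev_k, r)
        delta = new
        back.append(ptr)
    # Pick the best final cell over all drop counts and states, then backtrace.
    score, k, s = max((delta[k][s], k, s)
                      for k in range(max_drop + 1) for s in range(n_states))
    path, dropped = [s], []
    for t in range(n_frames - 1, 0, -1):
        prev_k, r = back[t][k][s]
        if prev_k != k:
            dropped.append(t)  # frame t was ignored on the best path
        k, s = prev_k, r
        path.append(s)
    if k == 1:
        dropped.append(0)
    path.reverse()
    dropped.reverse()
    return score, path, dropped

# Toy model (hypothetical): two sticky states, five frames, frame 2 corrupted.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
normal, outlier = [-0.1, -5.0], [-10.0, -0.1]
log_obs = [normal, normal, outlier, normal, normal]

plain = robust_viterbi(log_init, log_trans, log_obs, max_drop=0)
robust = robust_viterbi(log_init, log_trans, log_obs, max_drop=1)
print("plain :", plain[1], plain[2])
print("robust:", robust[1], robust[2])
```

In this toy example the plain search detours into state 1 to explain the corrupted frame, while the variant with `max_drop=1` stays in state 0 and ignores frame 2. Under this reading, explaining x percent of T frames would correspond to a drop budget of roughly T(1 - x/100) frames; that mapping is also an assumption, not something stated in the abstract.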
Cite as: Siu, M., Chan, Y.-C. (2001) Robust speech recognition against packet loss. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 1095-1098, doi: 10.21437/Eurospeech.2001-275
@inproceedings{siu01_eurospeech,
  author={Manhung Siu and Yu-Chung Chan},
  title={{Robust speech recognition against packet loss}},
  year=2001,
  booktitle={Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages={1095--1098},
  doi={10.21437/Eurospeech.2001-275}
}