EUROSPEECH 2003 - INTERSPEECH 2003
In October 2002, the European Telecommunications Standards Institute (ETSI) recommended a standard Distributed Speech Recognition (DSR) advanced front-end, ETSI ES202 050 version 1.1.1 (ES202). Many studies have used this front-end in noisy environments, in several languages, on connected digit recognition tasks. However, we have not seen reports of large vocabulary continuous speech recognition using this front-end on a Japanese speech corpus. Since the DSR system is used for several languages and tasks, we conducted large vocabulary continuous speech recognition experiments using ES202 on a Japanese speech corpus in noisy environments. Experimental results show that ES202 has better recognition performance than the previous DSR front-end, ETSI ES201 050 version 1.1.2, under all conditions. In addition, we focus on how acoustic mismatches caused by input devices influence the recognition performance of DSR. DSR employs a vector quantization (VQ) algorithm for feature compression, so these mismatches increase the VQ distortion, and large VQ distortion increases the speech recognition error rate. To overcome increases in VQ distortion, we proposed the Bias Removal Method (BRM) in previous work. However, this method cannot be applied in real time. Hence, we propose the Real-time Bias Removal Method (RBRM) in this paper. Continuous speech recognition experiments on a Japanese speech corpus show that RBRM achieves an 8.7% improvement in the error rate compared to ES202 under noisy conditions (SNR = 20 dB with convolutional noise).
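The link between device mismatch and VQ distortion can be illustrated with a small sketch. It is not the BRM or RBRM of the paper, whose details the abstract does not give. It assumes that the mismatch acts as a constant additive offset on the feature vectors (roughly what a convolutional distortion becomes in the cepstral domain), and it uses a generic causal, decision-directed running estimate of that offset. The codebook, the offset `channel_bias`, the smoothing factor `alpha`, and all function names are hypothetical and chosen only for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest(codebook, x):
    """Return the codeword closest to x."""
    return min(codebook, key=lambda c: sq_dist(x, c))

def vq_distortion(frames, codebook):
    """Mean squared distance between each frame and its nearest codeword."""
    return sum(sq_dist(x, nearest(codebook, x)) for x in frames) / len(frames)

def causal_bias_removal(frames, codebook, alpha=0.1):
    """Subtract a running bias estimate from each frame (illustrative only).

    The estimate uses past frames only, so it can run frame by frame:
    after quantizing the compensated frame, the residual between the raw
    frame and the chosen codeword updates an exponential moving average.
    """
    bias = [0.0] * len(frames[0])
    compensated = []
    for x in frames:
        y = [xi - bi for xi, bi in zip(x, bias)]
        compensated.append(y)
        c = nearest(codebook, y)
        bias = [(1 - alpha) * bi + alpha * (xi - ci)
                for bi, xi, ci in zip(bias, x, c)]
    return compensated

rng = random.Random(0)

# Hypothetical 2-D codebook with well-separated codewords.
codebook = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]

# "Clean" frames: a randomly chosen codeword plus small Gaussian noise.
clean = [[c + rng.gauss(0, 0.2) for c in rng.choice(codebook)]
         for _ in range(200)]

# Assumed device mismatch: a constant offset added to every frame.
channel_bias = (1.0, -0.8)
mismatched = [[xi + bi for xi, bi in zip(x, channel_bias)] for x in clean]

compensated = causal_bias_removal(mismatched, codebook)

d_clean = vq_distortion(clean, codebook)
d_mismatched = vq_distortion(mismatched, codebook)
d_compensated = vq_distortion(compensated, codebook)

print(f"clean:       {d_clean:.3f}")
print(f"mismatched:  {d_mismatched:.3f}")
print(f"compensated: {d_compensated:.3f}")
```

Under these assumptions the offset raises the mean VQ distortion well above the clean value, and the running estimate brings it back down once it has adapted. Because the update is decision-directed, it relies on the offset being small relative to the spacing between codewords; a larger offset would cause wrong codeword choices and a poorer estimate.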
Bibliographic reference. Tsuge, Satoru / Kuroiwa, Shingo / Kita, Kenji (2003): "Evaluation of ETSI advanced DSR front-end and bias removal method on the Japanese newspaper article sentences speech corpus", In EUROSPEECH-2003, 2145-2148.