Interspeech'2005 - Eurospeech
In this paper, a variety of front-end configurations are evaluated on Hungarian telephone speech databases. Our aim was to directly measure the efficiency of the front-ends on real noisy and normal speech data. The ETSI ADSR standard front-end is used as the baseline. A simplification of the standard is introduced that performs better on our databases than the original front-end in terms of both speed and recognition rate. In addition, another recently proposed feature extraction approach is investigated. Finally, the effect of a novel voice activity detection approach is evaluated. The best front-end configuration, augmented with this voice activity detector, significantly outperformed the baseline in every recognition test, by 24.7% relative on average.
Bibliographic reference. Mihajlik, Péter / Tobler, Zoltán / Tüske, Zoltán / Gordos, Géza (2005): "Evaluation and optimization of noise robust front-end technologies for the automatic recognition of Hungarian telephone speech", In INTERSPEECH-2005, 2677-2680.