INTERSPEECH 2008
9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

On the Combination of Auditory and Modulation Frequency Channels for ASR Applications

Fabio Valente, Hynek Hermansky

IDIAP Research Institute, Switzerland

This paper investigates the combination of evidence from different frequency channels obtained by filtering the speech signal at different auditory and modulation frequencies. In our previous work [1], we showed that combining classifiers trained on different ranges of modulation frequencies is more effective when performed in a sequential (hierarchical) fashion. In this work, we verify that combining classifiers trained on different ranges of auditory frequencies is more effective when performed in a parallel fashion. Furthermore, we propose a neural-network-based architecture for combining evidence from different auditory-modulation frequency sub-bands that takes advantage of these findings. This reduces the final WER by 6.2% absolute (from 45.8% to 39.6%) compared with the single-classifier approach on an LVCSR task.
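The two combination schemes contrasted above can be sketched as data flow. The following is only a minimal illustration under stated assumptions, not the architecture from the paper: the `LinearSoftmax` class is a hypothetical, untrained stand-in for a trained neural-network classifier, and the stream count, feature sizes, and the functions `parallel_combine` and `sequential_combine` are invented for the example. In the parallel scheme, one classifier per stream produces posteriors that a merger classifier combines; in the sequential scheme, each stage receives the previous stage's posteriors together with the next stream.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with max subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


class LinearSoftmax:
    """Hypothetical untrained stand-in for a neural-network classifier."""

    def __init__(self, n_in, n_classes, rng):
        self.W = rng.standard_normal((n_in, n_classes)) * 0.1
        self.b = np.zeros(n_classes)

    def __call__(self, x):
        return softmax(x @ self.W + self.b)


def parallel_combine(streams, classifiers, merger):
    """Classify each stream independently, then merge all posteriors."""
    posteriors = [clf(x) for clf, x in zip(classifiers, streams)]
    return merger(np.concatenate(posteriors, axis=1))


def sequential_combine(streams, classifiers):
    """Feed each stage the previous posteriors plus the next stream."""
    out = classifiers[0](streams[0])
    for clf, x in zip(classifiers[1:], streams[1:]):
        out = clf(np.concatenate([out, x], axis=1))
    return out


# Assumed toy dimensions: 3 frequency-channel streams, 5 frames each.
rng = np.random.default_rng(0)
n_frames, n_feat, n_classes = 5, 8, 4
streams = [rng.standard_normal((n_frames, n_feat)) for _ in range(3)]

# Parallel: one classifier per stream plus a merger over all posteriors.
par_clfs = [LinearSoftmax(n_feat, n_classes, rng) for _ in streams]
merger = LinearSoftmax(n_classes * len(streams), n_classes, rng)
par_post = parallel_combine(streams, par_clfs, merger)

# Sequential: later stages take previous posteriors plus a new stream.
seq_clfs = [LinearSoftmax(n_feat, n_classes, rng)] + [
    LinearSoftmax(n_classes + n_feat, n_classes, rng) for _ in streams[1:]
]
seq_post = sequential_combine(streams, seq_clfs)
```

Both functions return one posterior distribution per frame; since the weights are random, the values themselves carry no meaning and only the wiring of the two schemes is shown.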

Bibliographic reference.  Valente, Fabio / Hermansky, Hynek (2008): "On the combination of auditory and modulation frequency channels for ASR applications", In INTERSPEECH-2008, 2242-2245.