ISCA Archive Interspeech 2008

On the combination of auditory and modulation frequency channels for ASR applications

Fabio Valente, Hynek Hermansky

This paper investigates the combination of evidence from different frequency channels obtained by filtering the speech signal at different auditory and modulation frequencies. In our previous work [1], we showed that combining classifiers trained on different ranges of modulation frequencies is more effective when performed in a sequential (hierarchical) fashion. In this work, we verify that combining classifiers trained on different ranges of auditory frequencies is more effective when performed in a parallel fashion. Furthermore, we propose a neural-network-based architecture for combining evidence from different auditory-modulation frequency sub-bands that takes advantage of the previous findings. This reduces the final WER by 6.2% absolute (from 45.8% to 39.6%) w.r.t. the single-classifier approach in an LVCSR task.


doi: 10.21437/Interspeech.2008-445

Cite as: Valente, F., Hermansky, H. (2008) On the combination of auditory and modulation frequency channels for ASR applications. Proc. Interspeech 2008, 2242-2245, doi: 10.21437/Interspeech.2008-445

@inproceedings{valente08_interspeech,
  author={Fabio Valente and Hynek Hermansky},
  title={{On the combination of auditory and modulation frequency channels for ASR applications}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={2242--2245},
  doi={10.21437/Interspeech.2008-445}
}