11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Using a DBN to Integrate Sparse Classification and GMM-Based ASR

Yang Sun, Jort F. Gemmeke, Bert Cranen, Louis ten Bosch, Lou Boves

Radboud Universiteit Nijmegen, The Netherlands

The performance of an HMM-based speech recognizer using MFCCs as input is known to degrade dramatically in noisy conditions. Recently, an exemplar-based, noise-robust ASR approach called sparse classification (SC) was introduced. While very successful at low SNRs, its performance at high SNRs fell short of that of HMM-based systems. In this work, we propose to use a Dynamic Bayesian Network (DBN) to implement an HMM that takes both MFCCs and phone predictions extracted from the SC system as input. Through experiments on the AURORA-2 connected digit recognition task, we show that our approach successfully combines the strengths of both systems, resulting in competitive recognition accuracies at both high and low SNRs.

Bibliographic reference. Sun, Yang / Gemmeke, Jort F. / Cranen, Bert / Bosch, Louis ten / Boves, Lou (2010): "Using a DBN to integrate sparse classification and GMM-based ASR", in INTERSPEECH-2010, 2098-2101.