12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Maximum Confidence Measure Based Interaural Phase Difference Estimation for Noise Masking in Dual-Microphone Robust Speech Recognition

Hsien-Cheng Liao (1), Yuan-Fu Liao (2), Chin-Hui Lee (3)

(1) ITRI, Taiwan
(2) National Taipei University of Technology, Taiwan
(3) Georgia Institute of Technology, USA

A new one-stage maximum confidence measure (MCM) based interaural phase difference estimation framework for noise masking is proposed to closely integrate the underlying speech models into dual-microphone array noise filtering for robust speech recognition. The main ideas are: (1) using both the speech and filler models of the recognizer to feed back confidence measures (CMs) that indicate the degree of separation between filtered speech and interfering noise, and (2) automatically optimizing the parameters of the microphone array with an expectation-maximization (EM) algorithm based on the proposed MCM criterion. Experimental results on a Mandarin voice command task show that the proposed approach significantly improves the final speech recognition rates. Moreover, the observed performance degradation is usually graceful under conditions with low signal-to-noise ratios (SNRs) and closely located interfering noise.
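The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one frame of per-bin complex spectra from each microphone, a known target phase difference per bin, and a single deviation threshold as the only array parameter. The names `ipd_mask`, `pick_threshold`, and `confidence` are hypothetical, `confidence` stands in for the recognizer's speech-versus-filler confidence measure, and a plain grid search is substituted for the EM optimization.

```python
import cmath
import math


def wrap_phase(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def ipd_mask(left, right, target_ipd, threshold):
    """Binary mask for one frame: 1.0 where the IPD is near the target.

    left, right : per-bin complex spectra of the two microphones
    target_ipd  : assumed target phase difference per bin (radians)
    threshold   : maximum allowed deviation (radians), a hypothetical knob
    """
    mask = []
    for l, r, t in zip(left, right, target_ipd):
        ipd = cmath.phase(l * r.conjugate())  # observed phase difference
        mask.append(1.0 if abs(wrap_phase(ipd - t)) <= threshold else 0.0)
    return mask


def pick_threshold(left, right, target_ipd, candidates, confidence):
    """Return the candidate threshold whose masked frame scores highest.

    `confidence` is a hypothetical callable standing in for the recognizer's
    confidence measure; grid search replaces the EM step of the paper.
    """
    def score(th):
        mask = ipd_mask(left, right, target_ipd, th)
        return confidence([m * l for m, l in zip(mask, left)])

    return max(candidates, key=score)
```

In a real system the masked spectrum would be passed to the recognizer and the confidence would come from its speech and filler model scores; here any callable that maps a masked frame to a number can be plugged in to try the idea.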


Bibliographic reference. Liao, Hsien-Cheng / Liao, Yuan-Fu / Lee, Chin-Hui (2011): "Maximum confidence measure based interaural phase difference estimation for noise masking in dual-microphone robust speech recognition", in INTERSPEECH-2011, 473-476.