In this paper, we present a new two-microphone approach that improves speech recognition accuracy when the target speech is masked by other speech. The algorithm builds on previous systems that have successfully separated signals based on differences in the arrival time of signal components at the two microphones. It differs from these earlier efforts in that signal selection takes place in the frequency domain. We observe that additional smoothing of the phase estimates over time and frequency is needed to support adequate speech recognition performance. We demonstrate that the algorithm described in this paper provides better recognition accuracy than time-domain signal separation algorithms, at less than 10 percent of the computational cost.
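The idea of frequency-domain signal selection can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `phase_difference_mask`, the Hamming window, the frame length `n_fft`, and the delay threshold `max_delay_s` are all assumptions chosen for illustration, and the smoothing over time and frequency mentioned in the abstract is omitted. For a single frame, the sketch converts the inter-microphone phase difference in each frequency bin into an estimated arrival-time difference and keeps only the bins whose estimated delay is close to zero, that is, components arriving from a source roughly equidistant from both microphones.

```python
import numpy as np

def phase_difference_mask(left, right, fs, n_fft=512, max_delay_s=1e-4):
    """Hypothetical helper: binary mask for one frame of a two-microphone signal.

    Returns (mask, delay), where delay[k] is the arrival-time difference in
    seconds estimated from the phase difference in bin k, and mask[k] is 1.0
    when |delay[k]| <= max_delay_s (an assumed threshold), else 0.0.
    """
    window = np.hamming(n_fft)
    spec_l = np.fft.rfft(left[:n_fft] * window)
    spec_r = np.fft.rfft(right[:n_fft] * window)

    # Phase difference per bin, wrapped to (-pi, pi].
    dphi = np.angle(spec_l * np.conj(spec_r))

    # Angular frequency of each bin in rad/s.
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_fft, d=1.0 / fs)

    # Convert phase difference to delay; the DC bin carries no delay
    # information, so it is left at zero to avoid dividing by zero.
    delay = np.zeros_like(dphi)
    delay[1:] = dphi[1:] / omega[1:]

    mask = (np.abs(delay) <= max_delay_s).astype(float)
    return mask, delay
```

In this sketch a positive delay means the right channel lags the left. Applying the mask to one channel's spectrum and resynthesizing the frame would give the selected signal; phase wrapping at high frequencies and wide microphone spacings is not handled here.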
Bibliographic reference. Kim, Chanwoo / Kumar, Kshitiz / Raj, Bhiksha / Stern, Richard M. (2009): "Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain", In INTERSPEECH-2009, 2495-2498.