In the context of ambient assisted living, automatic speech recognition (ASR) has the potential to provide textual support for hearing aid users in challenging acoustic conditions. In this paper we therefore investigate ways to improve ASR based on binaural hearing aid signals in complex acoustic scenes. In particular, we exploit information about the spatial configuration of sound sources, which is estimated with a recently developed method that uses probabilistic information about the location of a target speaker (and a simultaneous localized masker) for robust real-time localization. Two strategies are investigated: straightforward better-ear listening, and a multi-channel beamforming system that enhances a target speech source while additionally suppressing localized masking sound. The latter method is also complemented by better-ear listening. Both approaches are evaluated in different acoustic scenarios containing moving target and interfering speakers or noise sources. Compared to using non-preprocessed signals, we obtain average relative reductions in word error rate of 28.4% in the presence of a localized interfering noise, 19.2% in the case of a concurrent talker, and 23.7% in the presence of a concurrent talker in spatially diffuse noise. A post-analysis assesses the relation between localization performance and beamforming for improved speech recognition in complex acoustic scenes.
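To make the reported metric concrete, the following is a minimal sketch of how a relative reduction in word error rate (WER) is conventionally computed from a baseline WER and a WER after preprocessing. The function name and the numeric inputs are hypothetical and chosen only for illustration; they are not results from the paper, and the paper's exact averaging procedure across scenarios is not specified here.

```python
def relative_wer_reduction(wer_baseline: float, wer_processed: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are word error rates in the same unit (e.g. percent).
    A positive result means the processed system makes fewer errors.
    """
    if wer_baseline <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (wer_baseline - wer_processed) / wer_baseline


# Hypothetical WER values for illustration only (not taken from the paper):
# a baseline of 50.0% and a WER of 40.0% after spatial preprocessing.
reduction = relative_wer_reduction(50.0, 40.0)
print(f"relative WER reduction: {reduction:.1f}%")
```

Note that a relative reduction is expressed with respect to the baseline, so the same absolute improvement yields a larger relative figure when the baseline WER is low.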
Bibliographic reference. Kayser, Hendrik / Spille, Constantin / Marquardt, Daniel / Meyer, Bernd T. (2015): "Improving automatic speech recognition in spatially-aware hearing aids", In INTERSPEECH-2015, 175-179.