Channel Selection for Distant Speech Recognition Exploiting Cepstral Distance

Cristina Guerrero, Georgina Tryfou, Maurizio Omologo


In a multi-microphone distant speech recognition task, the redundancy that results from having multiple instances of the same source signal can be exploited through channel selection. In this work, we propose the use of cepstral distance to assess the available channels, in both an informed and a blind fashion. In the informed approach, the distances between the close-talk signal and each of the channels are calculated. In the blind method, the cepstral distances are computed using an estimated reference signal, assumed to represent the average distortion among the available channels. Furthermore, we propose a new evaluation methodology that illustrates the strengths and weaknesses of a channel selection method better than word error rate alone. The experimental results suggest that the proposed blind method successfully selects the least distorted channel when the microphone network provides sufficient room coverage. As a result, improved recognition rates are obtained in a distant speech recognition task, in both a simulated and a real context.
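The two ways of scoring channels described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the framing parameters (400-sample frames, 160-sample hop), the use of the real cepstrum with 12 coefficients, the Euclidean frame distance averaged over time, and in particular the blind reference, which is taken here as the per-frame average of the channels' cepstra. The abstract does not state the decision rule applied to the blind distances, so the sketch only returns them.

```python
import numpy as np

def frame_cepstra(x, frame_len=400, hop=160, n_ceps=13):
    """Real cepstrum per frame: inverse FFT of the log magnitude spectrum."""
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    log_mag = np.log(np.abs(np.fft.rfft(frames, axis=1)) + 1e-10)
    ceps = np.fft.irfft(log_mag, axis=1)
    return ceps[:, 1:n_ceps]  # drop c0, which mostly reflects frame energy

def cepstral_distance(ref_ceps, ch_ceps):
    """Mean over frames of the Euclidean distance between cepstral vectors."""
    n = min(len(ref_ceps), len(ch_ceps))
    return float(np.mean(np.linalg.norm(ref_ceps[:n] - ch_ceps[:n], axis=1)))

def informed_distances(close_talk, channels):
    """Distance of every channel to the close-talk signal (informed case)."""
    ref = frame_cepstra(close_talk)
    return np.array([cepstral_distance(ref, frame_cepstra(ch))
                     for ch in channels])

def blind_distances(channels):
    """Distance of every channel to an estimated reference (blind case).

    Assumption: the reference is the average of the channels' cepstra,
    which is equivalent to averaging their log magnitude spectra.
    """
    ceps = [frame_cepstra(ch) for ch in channels]
    n = min(len(c) for c in ceps)
    stacked = np.stack([c[:n] for c in ceps])
    ref = stacked.mean(axis=0)
    return np.array([cepstral_distance(ref, c) for c in stacked])
```

In the informed case, a natural reading of the abstract is to pick the channel closest to the close-talk signal, for example `int(np.argmin(informed_distances(close_talk, channels)))`. How the blind distances map to a selected channel depends on the paper's rule and is left open here.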


DOI: 10.21437/Interspeech.2016-865

Cite as

Guerrero, C., Tryfou, G., Omologo, M. (2016) Channel Selection for Distant Speech Recognition Exploiting Cepstral Distance. Proc. Interspeech 2016, 1986-1990.

BibTeX
@inproceedings{Guerrero+2016,
author={Cristina Guerrero and Georgina Tryfou and Maurizio Omologo},
title={Channel Selection for Distant Speech Recognition Exploiting Cepstral Distance},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-865},
url={http://dx.doi.org/10.21437/Interspeech.2016-865},
pages={1986--1990}
}