11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Sparse Component Analysis for Speech Recognition in Multi-Speaker Environment

Afsaneh Asaei, Hervé Bourlard, Philip N. Garner

Idiap Research Institute, Switzerland

Sparse Component Analysis is a relatively young technique that relies upon representation of a signal occupying only a small part of a larger space. Mixtures of sparse components are disjoint in that space. As a particular application of sparsity of speech signals, we investigate the DUET blind source separation algorithm in the context of speech recognition for multi-party recordings. We show how DUET can be tuned to the particular case of speech recognition with interfering sources, and evaluate the limits of performance as the number of sources increases. We show that the separated speech fits a common metric for sparsity, and conclude that sparsity assumptions lead to good performance in speech separation and hence ought to benefit other aspects of the speech recognition chain.
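The disjointness idea behind this kind of separation can be illustrated with a toy sketch. It is not the authors' implementation and not the full DUET algorithm: it assumes two sensors, real-valued coefficients, attenuation-only mixing, and mixing ratios that are already known. DUET itself works on time-frequency coefficients and estimates both attenuation and delay from the data. The function name `separate_by_ratio` and all parameter values are hypothetical, chosen for illustration only.

```python
import numpy as np

def separate_by_ratio(x1, x2, ratios, eps=1e-12):
    """Toy binary-mask separation (assumption: ratios are known in advance).

    Each coefficient of the mixture is assigned to the source whose
    inter-sensor ratio x2/x1 is closest to the observed ratio.
    """
    active = np.abs(x1) > eps                 # coefficients carrying energy
    observed = np.zeros_like(x1)
    observed[active] = x2[active] / x1[active]
    # Distance from each observed ratio to each candidate source ratio
    dist = np.abs(observed[None, :] - np.asarray(ratios)[:, None])
    labels = np.argmin(dist, axis=0)
    # One binary mask per source, applied to the first sensor signal
    return [x1 * ((labels == k) & active) for k in range(len(ratios))]

rng = np.random.default_rng(0)
n = 64

# Two sparse sources whose supports do not overlap (the disjointness assumption)
idx = rng.permutation(n)
s1 = np.zeros(n)
s2 = np.zeros(n)
s1[idx[:5]] = rng.uniform(1.0, 2.0, 5)
s2[idx[5:10]] = rng.uniform(1.0, 2.0, 5)

# Hypothetical attenuation of each source at the second sensor
a1, a2 = 0.5, 2.0
x1 = s1 + s2
x2 = a1 * s1 + a2 * s2

est1, est2 = separate_by_ratio(x1, x2, [a1, a2])
```

Because the supports are disjoint, every active coefficient belongs to exactly one source, so the binary masks recover both sources from the first sensor signal. When supports overlap, as they increasingly do with more speakers, the masks can only approximate the sources.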

Bibliographic reference. Asaei, Afsaneh / Bourlard, Hervé / Garner, Philip N. (2010): "Sparse component analysis for speech recognition in multi-speaker environment", in INTERSPEECH-2010, 1704-1707.