ITRW on Non-Linear Speech Processing (NOLISP 05)

Barcelona, Spain
April 19-22, 2005

Separation of Multispeaker Speech using Excitation Information

B. Yegnanarayana (1), R. KumaraSwamy (1), S. R. Mahadeva Prasanna (2)

(1) Speech and Vision Laboratory, Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India
(2) Department of Electronics and Communication Engineering, Indian Institute of Technology Guwahati, India

In this paper, we propose an approach for separating the speech of individual speakers from a multispeaker speech signal using excitation source information. The approach is demonstrated for the two-microphone case, where the main issue is estimating the delay of each speaker. We propose a method for delay estimation in the multispeaker case that uses knowledge of the excitation source information. The estimated delays are used to derive a weight function for each speaker, and the weight functions are in turn used to extract the excitation sequence of each speaker. The separated speech for each speaker is then synthesized from the extracted excitation sequence. The approach is illustrated on three-speaker speech data collected over two spatially distributed microphones.
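The delay-estimation step can be illustrated with a minimal sketch. It is not the authors' method: it assumes a single source, integer-sample delays, and plain cross-correlation of two equal-length signals, and it uses a synthetic sparse impulse train as a stand-in for an excitation sequence. The names `cross_correlation`, `estimate_delay`, `max_lag`, and the test signal are all hypothetical.

```python
import random


def cross_correlation(x, y, lag):
    """Sum of x[i] * y[i + lag] over every i where both indices are valid.

    Assumes x and y have the same length.
    """
    n = len(x)
    if lag >= 0:
        return sum(x[i] * y[i + lag] for i in range(n - lag))
    return sum(x[i] * y[i + lag] for i in range(-lag, n))


def estimate_delay(mic1, mic2, max_lag):
    """Return the lag in samples (within +/- max_lag) that maximizes the
    cross-correlation; positive means mic2 lags behind mic1."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cross_correlation(mic1, mic2, lag))


# Synthetic, impulse-like sequence standing in for an excitation signal
# (an assumption for illustration only, not data from the paper).
random.seed(0)
n = 2000
source = [0.0] * n
for idx in random.sample(range(n), 40):
    source[idx] = 1.0

true_delay = 7
mic1 = source
# Second microphone receives the same sequence true_delay samples later.
mic2 = [0.0] * true_delay + source[:n - true_delay]

estimated = estimate_delay(mic1, mic2, max_lag=20)
print("estimated delay:", estimated, "samples")
```

Because the stand-in sequence is sparse, its cross-correlation has one sharp peak at the true delay. With several speakers at different positions, one peak per speaker would be expected, which is presumably why an impulse-like excitation representation suits the multispeaker case.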

Bibliographic reference. Yegnanarayana, B. / KumaraSwamy, R. / Prasanna, S. R. Mahadeva (2005): "Separation of multispeaker speech using excitation information", In NOLISP-2005, 11-18.