The problem of separating speech from monaural mixtures (with other speech or non-speech signals) has attracted increasing attention in recent years. Among the solutions proposed, the most popular are based on compositional models such as non-negative matrix factorization (NMF) and latent variable models. Although these techniques are highly effective, they largely ignore the inherently phonetic nature of speech. In this paper we present a phoneme-dependent NMF-based algorithm to separate speech from monaural mixtures. Experiments on speech mixed with music indicate that the proposed algorithm can significantly improve separation performance over conventional NMF-based separation.
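For orientation, the conventional NMF-based separation that serves as the baseline can be sketched roughly as follows. This is a minimal illustration, not the phoneme-dependent algorithm of the paper: it assumes magnitude spectrograms as input, KL-divergence multiplicative updates, and a soft-mask reconstruction, none of which are specified in the abstract. The function names `nmf_kl` and `separate` are hypothetical.

```python
import numpy as np

def nmf_kl(V, rank, n_iter=200, W=None, seed=0, eps=1e-12):
    """Factor V ~= W @ H with KL-divergence multiplicative updates.

    If W is supplied it is held fixed and only the activations H are
    estimated (the `rank` argument is then ignored).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    learn_W = W is None
    if learn_W:
        W = rng.random((n_freq, rank)) + eps
    H = rng.random((W.shape[1], n_frames)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # Update activations given the current bases.
        H *= (W.T @ (V / (W @ H + eps))) / (W.T @ ones + eps)
        if learn_W:
            # Update bases given the current activations.
            W *= ((V / (W @ H + eps)) @ H.T) / (ones @ H.T + eps)
    return W, H

def separate(V_mix, W_speech, W_music, n_iter=200, eps=1e-12):
    """Estimate the speech magnitude spectrogram from a mixture.

    Assumed baseline: bases for each source are learned beforehand from
    training data, concatenated, and held fixed while the mixture
    activations are estimated. A soft mask then splits the mixture.
    """
    W = np.hstack([W_speech, W_music])
    _, H = nmf_kl(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]   # speech part of the model
    M = W_music @ H[k:]    # music part of the model
    mask = S / (S + M + eps)
    return mask * V_mix
```

In this sketch a single set of speech bases covers all speech frames. A phoneme-dependent variant would, presumably, use separate bases per phoneme, but the details of how the paper does this are not given in the abstract and are not reproduced here.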
Bibliographic reference. Raj, Bhiksha / Singh, Rita / Virtanen, Tuomas (2011): "Phoneme-dependent NMF for speech enhancement in monaural mixtures", In INTERSPEECH-2011, 1217-1220.