INTERSPEECH 2006 - ICSLP
This paper presents a new subspace modeling and selection approach for noisy speech recognition. For subspace modeling, we develop factor analysis (FA) to represent noisy speech. FA is a data generation model in which observations are explained by common factors, weighted by a factor loading matrix, plus specific factors. We relate FA to the signal subspace (SS) approach. Interestingly, FA partitions the noisy speech space into a principal subspace containing speech and noise and a minor subspace containing residual speech and residual noise. To estimate clean speech, we minimize the energy of speech distortion in both the principal subspace and the minor subspace. More importantly, for subspace selection, we seek the optimal subspace partition by solving hypothesis test problems. We test the equality of the eigenvalues in the minor subspace in order to determine the subspace dimension. In keeping with the FA assumptions, we further test the hypothesis that the residual speech is uncorrelated. Optimal solutions are obtained through likelihood ratio tests whose test statistics are approximated by chi-square distributions. The subspace partition is chosen according to the confidence with which the null hypotheses are rejected. In experiments on the Aurora2 database, FA outperforms SS in subspace modeling, and the new selection algorithms effectively determine the subspace dimension for noisy speech recognition.
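The eigenvalue-equality test mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses the textbook likelihood ratio statistic for equality of the trailing covariance eigenvalues, omits any small-sample correction, and approximates the chi-square quantile with the Wilson-Hilferty formula so that only the Python standard library is needed. The function names, the significance level `alpha`, and the sequential search order are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def chi2_quantile(prob, dof):
    """Approximate chi-square quantile (Wilson-Hilferty; an assumed shortcut)."""
    z = NormalDist().inv_cdf(prob)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * math.sqrt(c)) ** 3


def equal_tail_statistic(eigvals, k, n_samples):
    """LRT statistic for H0: all eigenvalues after the first k are equal.

    eigvals must be sorted in descending order and strictly positive.
    """
    tail = eigvals[k:]
    q = len(tail)
    log_geo = sum(math.log(v) for v in tail) / q   # log of geometric mean
    log_arith = math.log(sum(tail) / q)            # log of arithmetic mean
    stat = n_samples * q * (log_arith - log_geo)   # zero when the tail is flat
    dof = (q + 2) * (q - 1) // 2                   # degrees of freedom under H0
    return stat, dof


def select_dimension(eigvals, n_samples, alpha=0.05):
    """Smallest k for which equality of the trailing eigenvalues is not rejected."""
    eigvals = sorted(eigvals, reverse=True)
    p = len(eigvals)
    for k in range(p - 1):                         # keep at least two tail values
        stat, dof = equal_tail_statistic(eigvals, k, n_samples)
        if stat <= chi2_quantile(1.0 - alpha, dof):
            return k
    return p - 1


if __name__ == "__main__":
    # Hypothetical eigenvalues: two dominant directions over a flat floor.
    print(select_dimension([10.0, 6.0, 1.0, 1.0, 1.0, 1.0], n_samples=200))
```

In this sketch the returned `k` plays the role of the principal subspace dimension, and the remaining directions form the minor subspace. A smaller `alpha` makes rejection harder and therefore tends to select a smaller dimension.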
Bibliographic reference. Chien, Jen-Tzung / Ting, Chuan-Wei (2006): "Subspace modeling and selection for noisy speech recognition", In INTERSPEECH-2006, paper 1333-Tue1A2O.6.