9th Annual Conference of the International Speech Communication Association

Brisbane, Australia
September 22-26, 2008

Feature Vector Normalization with Combined Standard and Throat Microphones for Robust ASR

Luis Buera, Antonio Miguel, Oscar Saz, Alfonso Ortega, Eduardo Lleida

Universidad de Zaragoza, Spain

We propose an on-line unsupervised compensation technique for robust speech recognition that combines standard and throat microphone feature vectors. The solution, called Multi-Environment Model-based LInear Normalization with Throat microphone information (MEMLINT), is an extension of the MEMLIN formulation. The standard microphone noisy space and the throat microphone space are modelled as GMMs, and a set of linear transformations is learnt from training stereo data, one transformation for each pair of Gaussians (one Gaussian from each GMM). In addition, to compensate for some kinds of degradation that are not considered in MEMLINT, we propose jointly using an on-line unsupervised acoustic model adaptation method based on rotation transformations over an expanded HMM-state space (augMented stAte space acousTic dEcoder, MATE). Experiments on a database recorded by the authors show that the proposed approach significantly outperforms the single-microphone approach.
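The idea of learning one transformation per pair of Gaussians can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: each transformation is reduced to a bias vector, the stereo training data includes a clean reference channel, the two GMMs use diagonal covariances, and the pair weight is the product of the two per-GMM posteriors. All function and variable names are hypothetical.

```python
import numpy as np

def gmm_posteriors(X, weights, means, variances):
    """Posterior p(k | x_t) per frame for a diagonal-covariance GMM.

    X: (T, D) frames; weights: (K,); means, variances: (K, D).
    """
    diff = X[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_pdf = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )                                                             # (T, K)
    log_joint = np.log(weights) + log_pdf
    log_joint -= log_joint.max(axis=1, keepdims=True)             # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

def pair_weights(post_noisy, post_throat):
    # Assumption: the joint weight of Gaussian pair (i, j) is the product
    # of the noisy-space and throat-space posteriors.
    return post_noisy[:, :, None] * post_throat[:, None, :]      # (T, I, J)

def train_pair_biases(clean, noisy, post_noisy, post_throat):
    """Bias r[i, j]: weighted mean of (noisy - clean) for each Gaussian pair.

    Assumption: a clean reference channel exists in the stereo training data.
    """
    w = pair_weights(post_noisy, post_throat)
    num = np.einsum("tij,td->ijd", w, noisy - clean)              # (I, J, D)
    den = w.sum(axis=0)[:, :, None]                               # (I, J, 1)
    return num / np.maximum(den, 1e-10)

def normalize(noisy, post_noisy, post_throat, biases):
    """Subtract the posterior-weighted bias from each noisy frame."""
    w = pair_weights(post_noisy, post_throat)
    return noisy - np.einsum("tij,ijd->td", w, biases)
```

In this sketch, training would call `gmm_posteriors` once on the noisy frames and once on the throat frames, then `train_pair_biases`; at recognition time only the noisy and throat channels are needed to call `normalize`. A full linear transformation (matrix plus bias) per pair could replace the bias vector without changing the overall structure.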

Bibliographic reference. Buera, Luis / Miguel, Antonio / Saz, Oscar / Ortega, Alfonso / Lleida, Eduardo (2008): "Feature vector normalization with combined standard and throat microphones for robust ASR", in INTERSPEECH-2008, 1289-1292.