8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Improved Histogram-Based Feature Compensation for Robust Speech Recognition and Unsupervised Speaker Adaptation

Yasunari Obuchi

Hitachi Ltd., Japan

Feature compensation for noise-robust speech recognition becomes more effective when the normalization of time-derivative parameters is taken into account. This paper describes an implementation of Delta-Cepstrum Normalization (DCN) that runs with minimal response time. The proposed algorithm, referred to as Recursive DCN, provides word error rate improvements comparable to those of conventional DCN. Because DCN includes a procedure that adjusts the mismatch between the cepstrum part and the delta-cepstrum part, it works effectively even when only a small amount of data is available. We also investigate the possibility of applying DCN to unsupervised speaker adaptation. DCN adaptation is shown to improve recognition accuracy even without a reference transcription of the adaptation data. Finally, DCN adaptation is combined with Feature-space Maximum Likelihood Linear Regression (FMLLR). The combination shows promising results in batch-mode experiments, although the improvement is rather small in recursive mode.
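As a rough illustration of the histogram-based normalization this abstract builds on, the sketch below equalizes a single feature track against a standard normal reference using its empirical cumulative distribution. It is a generic histogram-equalization example, not the DCN or Recursive DCN algorithm of the paper: the function names `histogram_equalize` and `simple_delta`, the choice of a standard normal target, and the first-difference delta are all assumptions made for illustration.

```python
from statistics import NormalDist


def histogram_equalize(track):
    """Map a 1-D feature track onto a standard normal reference.

    Each value is replaced by the standard normal quantile of its
    empirical CDF position (rank-based), so the ordering of the
    frames is preserved while the histogram is reshaped.
    Assumption: a standard normal target; other references are possible.
    """
    n = len(track)
    # Indices of the frames sorted by feature value (ascending).
    order = sorted(range(n), key=lambda i: track[i])
    std_normal = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order):
        # Mid-rank CDF estimate keeps the probability strictly inside (0, 1).
        out[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return out


def simple_delta(track):
    """First-order time difference, a simple stand-in for delta features.

    Assumption: real front ends usually use a regression window instead.
    """
    return [0.0] + [track[t] - track[t - 1] for t in range(1, len(track))]


if __name__ == "__main__":
    # Hypothetical values for one cepstral coefficient over five frames.
    cepstrum = [3.1, 2.7, 4.0, 3.6, 2.9]
    print(histogram_equalize(cepstrum))
    print(histogram_equalize(simple_delta(cepstrum)))
```

Equalizing the static track and the delta track independently, as the usage lines do, ignores the relationship between the two; the abstract indicates that DCN additionally adjusts the mismatch between the cepstrum and delta-cepstrum parts, which this sketch does not attempt.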


Bibliographic reference.  Obuchi, Yasunari (2004): "Improved histogram-based feature compensation for robust speech recognition and unsupervised speaker adaptation", In INTERSPEECH-2004, 2065-2068.