EUROSPEECH 2003 - INTERSPEECH 2003
In this paper we describe a new framework of feature compensation for robust speech recognition. We introduce Delta-Cepstrum Normalization (DCN), which normalizes not only cepstral coefficients but also their time derivatives. In previous work, the mean and the variance of cepstral coefficients were normalized to reduce irrelevant information, but such normalization was not applied to time-derivative parameters because it did not reduce the irrelevant information sufficiently. Histogram Equalization (HEQ), however, provides better compensation and can be applied even to delta and delta-delta cepstra. We investigate various implementations of DCN and show that the best performance is achieved when the normalizations of the cepstra and the delta cepstra are mutually interdependent. We evaluate the performance of DCN using speech data recorded by a PDA. DCN provides significant improvements compared to HEQ. We also examine the possibility of combining Vector Taylor Series (VTS) and DCN. Even though some combinations do not improve the performance of VTS, the best combination gives better performance than VTS alone. Finally, the advantages of DCN in terms of computation speed are also discussed.
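As a rough illustration of the general idea, the sketch below applies rank-based histogram equalization to a single cepstral track and then to the delta track derived from it. It is not the implementation evaluated in the paper: the standard normal reference distribution, the central-difference delta, the sequential ordering (equalize the cepstrum first, then its delta), and the function names `heq`, `delta`, and `normalize_track` are all assumptions made for this example. In particular, it does not reproduce the mutually interdependent normalization that the abstract reports as performing best.

```python
from statistics import NormalDist


def heq(track):
    """Map one feature track (floats over time) onto N(0, 1) by rank.

    Assumption: the reference distribution is a standard normal and the
    empirical CDF is estimated from the ranks within this one track.
    """
    n = len(track)
    order = sorted(range(n), key=lambda t: track[t])  # frame indices, ascending by value
    ref = NormalDist()
    out = [0.0] * n
    for rank, t in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1).
        out[t] = ref.inv_cdf((rank + 0.5) / n)
    return out


def delta(track):
    """Central-difference time derivative, repeating the edge frames (assumed)."""
    n = len(track)
    return [(track[min(t + 1, n - 1)] - track[max(t - 1, 0)]) / 2.0
            for t in range(n)]


def normalize_track(cepstrum):
    """Equalize a cepstral track, then equalize the delta of the result."""
    c = heq(cepstrum)
    d = heq(delta(c))
    return c, d
```

Calling `normalize_track` on each cepstral dimension of an utterance yields an equalized static track and an equalized delta track; a delta-delta track could be handled the same way by applying `delta` and `heq` once more.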
Bibliographic reference. Obuchi, Yasunari / Stern, Richard M. (2003): "Normalization of time-derivative parameters using histogram equalization", In EUROSPEECH-2003, 665-668.