10th Annual Conference of the International Speech Communication Association

Brighton, United Kingdom
September 6-10, 2009

Dynamic Features in the Linear Domain for Robust Automatic Speech Recognition in a Reverberant Environment

Osamu Ichikawa, Takashi Fukuda, Ryuki Tachibana, Masafumi Nishimura

IBM Tokyo Research Lab, Japan

Because MFCCs are calculated from logarithmic spectra, their delta and delta-delta features are difference operations in the logarithmic domain. In a reverberant environment, speech signals have trailing reverberation whose power follows a long-term exponential decay. As a result, the logarithmic delta value tends to remain large for a long time. This paper proposes a delta feature calculated in the linear domain, where the delta of the reverberant tail decays rapidly. In an experiment using the CENSREC-4 evaluation framework, significant improvements were found in reverberant conditions simply by replacing the MFCC dynamic features with the proposed dynamic features.
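The contrast between the two domains can be illustrated with a toy calculation. The following is a minimal sketch, not the paper's feature extraction: it assumes a single band whose power decays by a fixed ratio per frame, and it uses a plain symmetric difference as the delta. The names `delta` and `decaying_power`, the decay ratio of 0.8, and the frame count are all hypothetical choices made for illustration; the exact delta formula and any normalization used by the authors are not stated in the abstract.

```python
import math

def delta(seq, k=1):
    """Symmetric difference seq[t+k] - seq[t-k], with indices clamped at the edges.

    Assumption: a simple two-point difference stands in for the delta
    computation; the paper's actual formula is not given in the abstract.
    """
    n = len(seq)
    return [seq[min(t + k, n - 1)] - seq[max(t - k, 0)] for t in range(n)]

def decaying_power(p0=1.0, ratio=0.8, frames=30):
    """Hypothetical reverberant tail: power shrinks by `ratio` every frame."""
    return [p0 * ratio ** t for t in range(frames)]

power = decaying_power()
log_power = [math.log(p) for p in power]

log_delta = delta(log_power)  # difference taken in the logarithmic domain
lin_delta = delta(power)      # difference taken in the linear domain

# Print a few interior frames to compare the two deltas over time.
for t in (5, 15, 25):
    print(t, round(log_delta[t], 4), round(lin_delta[t], 6))
```

Under these assumptions, the logarithmic delta stays at the constant value 2·log(ratio) for every interior frame of the tail, while the linear delta shrinks in proportion to the power itself and approaches zero.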


Bibliographic reference. Ichikawa, Osamu / Fukuda, Takashi / Tachibana, Ryuki / Nishimura, Masafumi (2009): "Dynamic features in the linear domain for robust automatic speech recognition in a reverberant environment", in INTERSPEECH-2009, 44-47.