Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Forward Masking on a Generalized Logarithmic Scale for Robust Speech Recognition

Yoshihiro Ito, Hiroshi Matsumoto, Kazumasa Yamamoto

Dept. of Electrical & Electronic Eng., Faculty of Engineering, Shinshu University, Nagano-shi, Nagano, Japan

This paper examines forward masking on a generalized logarithmic scale for speech recognition that is robust to both additive and convolutional noise. Forward masking in the dynamic cepstral (DyC) representation subtracts a masking pattern from the current spectrum in the logarithmic spectral domain, whereas the proposed method seeks a compromise between the logarithmic and linear spectral domains by choosing an appropriate value of the power. This technique is incorporated into a modified MFCC-based front end. Connected-digit recognition tests showed that, in noisy conditions, this technique outperforms conventional techniques such as DyC, the continuous spectral subtraction method, and cepstral mean subtraction, while maintaining robustness to convolutional noise.
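
The idea of subtracting a masking pattern on a scale between logarithmic and linear can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the generalized logarithm or of the masking pattern, so the sketch assumes a Box-Cox style function (x^p - 1)/p, which tends to log(x) as p approaches 0 and is linear at p = 1, and a hypothetical masking pattern formed as a weighted sum of preceding frames. The names generalized_log, forward_mask, power, and weights are illustrative only.

```python
import math


def generalized_log(x, power):
    """Assumed generalized logarithm: (x**power - 1) / power.

    Tends to log(x) as power -> 0 and equals x - 1 (linear) at power = 1.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    if abs(power) < 1e-12:
        return math.log(x)
    return (x ** power - 1.0) / power


def forward_mask(frames, power, weights):
    """Subtract a masking pattern built from preceding frames.

    frames  : list of frames, each a list of positive filter-bank energies
    power   : exponent of the assumed generalized logarithm
    weights : hypothetical masking weights; weights[k - 1] applies to the
              frame k steps before the current one
    """
    # Map every energy onto the generalized logarithmic scale first.
    scaled = [[generalized_log(e, power) for e in frame] for frame in frames]
    masked = []
    for t, current in enumerate(scaled):
        out = list(current)
        for k, w in enumerate(weights, start=1):
            if t - k < 0:
                break  # no earlier frame available
            previous = scaled[t - k]
            for band in range(len(out)):
                out[band] -= w * previous[band]
        masked.append(out)
    return masked
```

Under these assumptions, power = 0 gives subtraction in the logarithmic domain, as the abstract describes for DyC, and power = 1 gives subtraction in the linear domain. Intermediate values give the compromise the abstract refers to.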



Bibliographic reference.  Ito, Yoshihiro / Matsumoto, Hiroshi / Yamamoto, Kazumasa (2000): "Forward masking on a generalized logarithmic scale for robust speech recognition", In ICSLP-2000, vol.3, 530-533.