EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Noise-Robust ASR by Using Distinctive Phonetic Features Approximated with Logarithmic Normal Distribution of HMM

Takashi Fukuda, Tsuneo Nitta

Toyohashi University of Technology, Japan

Various noise-robust approaches have been investigated with the aim of using automatic speech recognition (ASR) systems in practical environments. We previously proposed a distinctive phonetic feature (DPF) parameter set for noise-robust ASR, which reduced the effect of high-level additive noise [1]. This paper describes an attempt to replace the normal distributions (NDs) of DPFs in HMMs with logarithmic normal distributions (LNDs), because DPF distributions are skewed, with either positive or negative skewness. The HMM with LNDs was first compared with a standard HMM with NDs in an isolated spoken-word recognition task with clean speech. Noise robustness was then tested with four types of additive noise. With DPFs as the input feature vector set, the proposed HMM with LNDs outperforms the standard HMM with NDs in the isolated spoken-word recognition task, both with clean speech and with speech contaminated by additive noise. Furthermore, combining the DPFs with static MFCCs and ΔP achieved significant improvements over a baseline system using MFCCs and dynamic features.
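
To illustrate why a logarithmic normal distribution can model a skewed feature better than a normal distribution, the following is a minimal sketch and not the authors' implementation. It assumes a single scalar feature that is strictly positive and positively skewed, fits both distributions by maximum likelihood, and compares the total log-likelihood. All function names and the toy data are hypothetical. The abstract does not state how negative skewness or non-positive feature values are handled, so the sketch leaves those cases out.

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log-density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def lognormal_logpdf(x, mu, sigma):
    """Log-density of a log-normal distribution at x.

    mu and sigma are the mean and standard deviation of log(x).
    The density is only defined for x > 0.
    """
    if x <= 0:
        return float("-inf")
    # Change of variables: p(x) = N(log x; mu, sigma^2) / x
    return normal_logpdf(math.log(x), mu, sigma) - math.log(x)

def fit_normal(samples):
    """Maximum-likelihood mean and standard deviation."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((s - mu) ** 2 for s in samples) / n
    return mu, math.sqrt(var)

def fit_lognormal(samples):
    """Maximum-likelihood log-normal parameters (fit a normal to log x)."""
    return fit_normal([math.log(s) for s in samples])

def total_loglik(logpdf, samples, params):
    """Sum of log-densities over all samples."""
    return sum(logpdf(s, *params) for s in samples)

# Toy, positively skewed data (assumption: not taken from the paper).
samples = [math.exp(z) for z in (-1.0, -0.5, 0.0, 0.5, 1.0)]

ll_normal = total_loglik(normal_logpdf, samples, fit_normal(samples))
ll_lognormal = total_loglik(lognormal_logpdf, samples, fit_lognormal(samples))
print("normal:", ll_normal, "log-normal:", ll_lognormal)
```

In an HMM, a density of this kind would take the place of the per-state normal output density for each feature dimension. On this toy data the log-normal fit yields the higher total log-likelihood.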

Bibliographic reference.  Fukuda, Takashi / Nitta, Tsuneo (2003): "Noise-robust ASR by using distinctive phonetic features approximated with logarithmic normal distribution of HMM", In EUROSPEECH-2003, 2185-2188.