8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Canonicalization of Feature Parameters for Automatic Speech Recognition

Takashi Fukuda, Tsuneo Nitta

Toyohashi University of Technology, Japan

Acoustic models (AMs) of an HMM-based classifier include various types of hidden variables such as gender type, speaking rate, and acoustic environment. If a canonicalization process can reduce the influence of these hidden variables on the AMs, a robust automatic speech recognition (ASR) system can be realized. In this paper, we describe the configuration of a canonicalization process that targets gender type as the hidden variable. The proposed canonicalization process is composed of multiple distinctive phonetic feature (DPF) extractors corresponding to the hidden variable and a DPF selector. Experiments compare (A) the combination of the canonicalized DPF and a single HMM classifier, and (B) the combination of a single acoustic feature (MFCC) and multiple HMM classifiers. The results show that the proposed canonicalization method outperforms both conventional ASR with MFCC and a single HMM and ASR with multiple HMMs.

Bibliographic reference. Fukuda, Takashi / Nitta, Tsuneo (2004): "Canonicalization of feature parameters for automatic speech recognition", In INTERSPEECH-2004, 2537-2540.