INTERSPEECH 2004 - ICSLP
Acoustic models (AMs) of an HMM-based classifier include various types of hidden variables such as gender type, speaking rate, and acoustic environment. If a canonicalization process exists that reduces the influence of these hidden variables on the AMs, a robust automatic speech recognition (ASR) system can be realized. In this paper, we describe the configuration of a canonicalization process targeting gender type as a hidden variable. The proposed canonicalization process is composed of multiple distinctive phonetic feature (DPF) extractors, each corresponding to a value of the hidden variable, and a DPF selector. Experiments compare (A) the combination of the canonicalized DPF and a single HMM classifier with (B) the combination of a single acoustic feature (MFCC) and multiple HMM classifiers. The results show that the proposed canonicalization method outperforms both conventional ASR with MFCC and a single HMM, and ASR with multiple HMMs.
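The configuration described above (several gender-dependent DPF extractors followed by a selector that passes one DPF vector on to a single HMM classifier) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the toy sigmoid extractors, their weights, the names `make_linear_extractor`, `confidence`, and `canonicalize`, and in particular the selection rule, which here simply picks the extractor whose outputs lie farthest from 0.5 on average.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Extractor = Callable[[Sequence[float]], Vector]


def make_linear_extractor(weights: Sequence[Sequence[float]],
                          bias: Sequence[float]) -> Extractor:
    """Build a toy DPF extractor: one sigmoid unit per phonetic feature.

    Hypothetical stand-in for a trained, gender-dependent extractor.
    """
    def extract(frame: Sequence[float]) -> Vector:
        out = []
        for w_row, b in zip(weights, bias):
            z = sum(w * x for w, x in zip(w_row, frame)) + b
            out.append(1.0 / (1.0 + math.exp(-z)))
        return out
    return extract


def confidence(dpf: Sequence[float]) -> float:
    """Assumed selection score: mean distance of DPF values from 0.5."""
    return sum(abs(v - 0.5) for v in dpf) / len(dpf)


def canonicalize(frame: Sequence[float],
                 extractors: Dict[str, Extractor]) -> Tuple[str, Vector]:
    """Run every extractor on the frame and keep the most confident output."""
    candidates = {name: ex(frame) for name, ex in extractors.items()}
    best = max(candidates, key=lambda name: confidence(candidates[name]))
    return best, candidates[best]


# Two hypothetical extractors, one per value of the hidden variable (gender).
extractors = {
    "male": make_linear_extractor([[4.0, 0.0], [0.0, 4.0]], [0.0, 0.0]),
    "female": make_linear_extractor([[0.5, 0.0], [0.0, 0.5]], [0.0, 0.0]),
}

# The selected DPF vector is what a single HMM classifier would consume.
chosen, dpf = canonicalize([1.0, -1.0], extractors)
```

In this sketch the downstream classifier never sees which extractor was chosen; it receives only the selected DPF vector, which is the sense in which the hidden variable is removed before classification.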
Bibliographic reference. Fukuda, Takashi / Nitta, Tsuneo (2004): "Canonicalization of feature parameters for automatic speech recognition", In INTERSPEECH-2004, 2537-2540.