8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Score Fusion for Articulatory Feature Detection

Brian M. Ore (1), Raymond E. Slyh (2)

(1) General Dynamics Advanced Information Systems, Dayton, USA

Articulatory Features (AFs) describe the way in which the speech organs are used when producing speech sounds. Research has shown that incorporating this information into speech recognizers can lead to an increase in system performance. This paper considers English AF detection using Gaussian Mixture Models (GMMs) and Multi-Layer Perceptrons (MLPs). The scores from the GMM- and MLP-based detectors are fused using a second MLP, resulting in an average reduction of 8.24% in equal error rate compared to the individual systems. These detector outputs are used to form the feature set for a Hidden Markov Model (HMM) phone recognizer. It is shown that monophone models created using the proposed feature set perform comparably to triphone models trained using Mel-Frequency Cepstral Coefficients (MFCCs).
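
As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of how an equal error rate (EER) might be computed for one detector's scores, and how it can be compared before and after fusing two score streams. The function name `equal_error_rate`, the toy scores, and the labels are all hypothetical and are not taken from the paper. The fusion step here is a plain average of the two scores, used only as a simple stand-in; the paper fuses the GMM and MLP scores with a second MLP.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Returns the mean of the false-acceptance and false-rejection rates at
    the threshold where the two rates are closest to each other.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # Candidate thresholds: every observed score, plus one above the maximum.
    thresholds = sorted(set(scores)) + [max(scores) + 1.0]

    best_gap, best_eer = None, None
    for t in thresholds:
        # The feature is declared present when score >= t.
        far = sum(s >= t for s in neg) / len(neg)  # false acceptances
        frr = sum(s < t for s in pos) / len(pos)   # false rejections
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Made-up toy data for one articulatory feature (1 = feature present).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
gmm_scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.6, 0.1]  # hypothetical GMM detector
mlp_scores = [0.8, 0.4, 0.9, 0.6, 0.7, 0.3, 0.2, 0.1]  # hypothetical MLP detector

# Stand-in fusion: average the two scores (the paper uses a second MLP instead).
fused_scores = [(g + m) / 2 for g, m in zip(gmm_scores, mlp_scores)]

eer_gmm = equal_error_rate(gmm_scores, labels)
eer_mlp = equal_error_rate(mlp_scores, labels)
eer_fused = equal_error_rate(fused_scores, labels)
print(eer_gmm, eer_mlp, eer_fused)
```

On this toy data each detector errs on a different example, so the averaged scores separate the two classes better than either detector alone. That is the intuition behind score fusion, although the numbers here say nothing about the results reported in the paper.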


Bibliographic reference. Ore, Brian M. / Slyh, Raymond E. (2007): "Score fusion for articulatory feature detection", In INTERSPEECH-2007, 1845-1848.