Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Variability of Automatic Speech Recognition Systems Using Different Features

Loic Barrault (1), Renato de Mori (1), Roberto Gemello (2), Franco Mana (2), Driss Matrouf (1)

(1) LIA-CNRS, Avignon, France; (2) Loquendo, Italy

This paper describes the use of two recognizers fed by different acoustic features. The first recognizer performs Multiple Resolution Analysis (MRA), while the second computes JRASTA Perceptual Linear Prediction Coefficients (JRASTAPLP). The two recognizers use the same denoising method but partition their acoustic spaces differently. Experiments with the Italian and Spanish components of the AURORA3 corpus show that, in a significant proportion of cases, the two systems provide substantially different posterior probabilities for the same phoneme in the same time interval. A decision rule is proposed for the cases in which the two recognizers hypothesize different words. It is based on the probability that a hypothesis is correct, given the identity of the competing word hypotheses. Significant word error rate (WER) reductions have been found for the CH1 portion of the Italian and Spanish components of the AURORA3 corpus.
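
The abstract does not give the exact form of the decision rule, so the following is only a minimal sketch of one way such a rule could work, under stated assumptions: the probability that each hypothesis is correct is estimated by relative frequency on a development set, indexed by the pair of competing words. The function names (train_pair_stats, decide), the data layout, and the fallback to the first recognizer for unseen or tied pairs are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def train_pair_stats(dev_examples):
    """Count, per competing word pair, how often each recognizer was right.

    dev_examples: iterable of (word_a, word_b, reference) triples, where
    word_a and word_b are the hypotheses of the two recognizers (assumed
    layout, not the paper's).
    """
    counts = defaultdict(lambda: [0, 0, 0])  # [a_correct, b_correct, total]
    for word_a, word_b, reference in dev_examples:
        if word_a == word_b:
            continue  # no conflict, nothing to learn for the rule
        entry = counts[(word_a, word_b)]
        entry[0] += int(word_a == reference)
        entry[1] += int(word_b == reference)
        entry[2] += 1
    return dict(counts)


def decide(word_a, word_b, stats):
    """Pick the hypothesis with the higher estimated P(correct | word_a, word_b)."""
    if word_a == word_b:
        return word_a
    entry = stats.get((word_a, word_b))
    if entry is None or entry[2] == 0:
        return word_a  # assumption: trust recognizer A for unseen pairs
    p_a = entry[0] / entry[2]
    p_b = entry[1] / entry[2]
    # Assumption: ties also go to recognizer A.
    return word_b if p_b > p_a else word_a


# Toy development data (invented for illustration only).
dev = [
    ("uno", "nove", "uno"),
    ("uno", "nove", "nove"),
    ("uno", "nove", "nove"),
    ("sei", "tre", "sei"),
]
stats = train_pair_stats(dev)
chosen = decide("uno", "nove", stats)
```

In this toy data the second recognizer is right in two of the three "uno" versus "nove" conflicts, so the rule selects "nove" for that pair. A real system would need smoothing for rare pairs, which this sketch omits.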


Bibliographic reference. Barrault, Loic / Mori, Renato de / Gemello, Roberto / Mana, Franco / Matrouf, Driss (2005): "Variability of automatic speech recognition systems using different features", in INTERSPEECH-2005, 221-224.