Interspeech'2005 - Eurospeech

Lisbon, Portugal
September 4-8, 2005

Outlier Detection for Acoustic Model Training Using Robust Statistics

Shigeki Matsuda, Wolfgang Herbordt, Satoshi Nakamura

ATR-SLT, Japan

In this paper, we propose an acoustic model training technique that is robust against outliers such as clipping, unexpected noise, poorly pronounced word segments, and mis-transcriptions. Such outliers degrade the quality of the acoustic models and, in turn, speech recognition performance. The proposed technique is based on a maximum likelihood (ML) criterion and automatically detects and removes outliers from the training data. Experiments with training data artificially contaminated by mis-transcriptions show that the proposed technique achieves nearly the same word error rate on contaminated data as is obtained on uncontaminated data. Applied to a dialogue speech database with unknown outliers, it reduces the recognition errors by 4.03%.
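
The general idea of likelihood-based outlier removal can be illustrated with a toy example. The sketch below is not the authors' algorithm, which the abstract does not specify; it assumes a single 1-D Gaussian in place of an acoustic model and a fixed, hypothetical `trim_fraction` of samples to discard. It alternates between ML fitting and dropping the samples that are least likely under the current fit. All function and parameter names are illustrative.

```python
import math


def fit_gaussian(xs):
    """ML estimates of mean and variance for 1-D samples."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, max(var, 1e-12)  # floor the variance to avoid division by zero


def log_likelihood(x, mean, var):
    """Log-density of x under a Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def trimmed_ml_fit(xs, trim_fraction=0.1, max_iters=20):
    """Alternate between ML fitting and dropping the least likely samples.

    trim_fraction is an assumed, fixed share of samples treated as outliers.
    Returns (mean, variance, sorted indices of the samples flagged as outliers).
    """
    keep_count = len(xs) - int(trim_fraction * len(xs))
    kept = list(range(len(xs)))  # start by trusting every sample
    for _ in range(max_iters):
        mean, var = fit_gaussian([xs[i] for i in kept])
        # Rank all samples by likelihood under the current fit, best first.
        ranked = sorted(
            range(len(xs)),
            key=lambda i: log_likelihood(xs[i], mean, var),
            reverse=True,
        )
        new_kept = sorted(ranked[:keep_count])
        if new_kept == kept:  # the retained set is stable, so stop
            break
        kept = new_kept
    mean, var = fit_gaussian([xs[i] for i in kept])
    outliers = sorted(set(range(len(xs))) - set(kept))
    return mean, var, outliers


if __name__ == "__main__":
    data = [0.9, 1.0, 1.1, 1.05, 0.95, 1.0, 0.98, 1.02, 25.0, -30.0]
    mean, var, outliers = trimmed_ml_fit(data, trim_fraction=0.2)
    print(mean, var, outliers)
```

In a real system the Gaussian would be replaced by the acoustic model's likelihood for each training utterance or segment, and the discard rule would follow whatever robust-statistics criterion the paper defines rather than a fixed fraction.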

Bibliographic reference. Matsuda, Shigeki / Herbordt, Wolfgang / Nakamura, Satoshi (2005): "Outlier detection for acoustic model training using robust statistics", in INTERSPEECH-2005, 3337-3340.