INTERSPEECH 2010
11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Automatic Speech Recognition of Multiple Accented English Data

Dimitra Vergyri (1), Lori Lamel (2), Jean-Luc Gauvain (2)

(1) SRI International, USA
(2) LIMSI, France

Accent variability is an important factor in speech that can significantly degrade automatic speech recognition performance. We investigate the effect of multiple accents on an English broadcast news recognition system. A multi-accented English corpus is used for the task, including broadcast news segments from 6 different geographic regions: US, Great Britain, Australia, North Africa, Middle East and India. There is significant performance degradation of a baseline system trained on only US data when confronted with shows from other regions. The results improve significantly when data from all the regions are included for accent-independent acoustic model training. Further improvements are achieved when MAP-adapted accent-dependent models are used in conjunction with a GMM accent classifier.
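
The last step of the abstract, choosing an accent with a GMM classifier and decoding with a MAP-adapted accent-dependent model, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract gives no features, mixture sizes, or adaptation settings, so the diagonal-covariance GMMs, the toy two-dimensional frames, the helper names (`gmm_loglik`, `classify_accent`, `map_adapt_mean`) and the prior weight `tau` are all assumptions made for illustration. The mean update shown is the standard MAP interpolation between a prior mean and occupancy-weighted data.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(frames, gmm):
    """Total log-likelihood of frames under a GMM.

    gmm is a list of (weight, mean, var) triples (hypothetical layout).
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(comps)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total

def classify_accent(frames, accent_gmms):
    """Return the accent label whose GMM gives the frames the highest score."""
    return max(accent_gmms, key=lambda a: gmm_loglik(frames, accent_gmms[a]))

def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """MAP update of one Gaussian mean.

    posteriors[t] is the occupancy of this Gaussian for frames[t];
    tau (assumed value) controls how strongly the prior mean is trusted.
    """
    occ = sum(posteriors)
    dim = len(prior_mean)
    weighted = [sum(g * x[d] for g, x in zip(posteriors, frames)) for d in range(dim)]
    return [(tau * prior_mean[d] + weighted[d]) / (tau + occ) for d in range(dim)]

# Toy one-component "GMMs" for two of the six regions (made-up parameters).
accent_gmms = {
    "US": [(1.0, [0.0, 0.0], [1.0, 1.0])],
    "India": [(1.0, [3.0, 3.0], [1.0, 1.0])],
}
segment = [[2.8, 3.1], [3.2, 2.9]]
chosen = classify_accent(segment, accent_gmms)

# Adapt an accent-independent mean toward accent-specific data.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 2.0], [4.0, 4.0]], [1.0, 1.0], tau=2.0)
```

In a full system the label returned by `classify_accent` would select which accent-dependent acoustic model decodes the segment. With little adaptation data the occupancy is small relative to `tau`, so the adapted mean stays close to the accent-independent prior.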

Bibliographic reference. Vergyri, Dimitra / Lamel, Lori / Gauvain, Jean-Luc (2010): "Automatic speech recognition of multiple accented English data", in INTERSPEECH-2010, 1652-1655.