8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003


Using Accent Information in ASR Models for Swedish

Giampiero Salvi

KTH, Sweden

This study uses accent information in an attempt to improve acoustic models for automatic speech recognition (ASR). First, accent-dependent Gaussian models were trained independently. The Bhattacharyya distance was then used together with agglomerative hierarchical clustering to define optimal strategies for merging those models. The resulting allophonic classes were analyzed and compared with the phonetic literature. Finally, accent-"aware" models were built, in which the parametric complexity for each phoneme corresponds to the degree of variability across accent areas and to the amount of training data available for it. These models were compared with models of the same overall complexity, evenly spread, and in some cases showed a slight improvement in recognition accuracy.
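The merging step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes diagonal-covariance Gaussians, complete linkage (the abstract does not state the linkage criterion), and three hypothetical accent regions "A", "B" and "C" with invented toy parameters. The function names `bhattacharyya` and `agglomerate` are illustrative only.

```python
import math


def bhattacharyya(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    dist = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        v = 0.5 * (v1 + v2)  # averaged variance in this dimension
        dist += 0.125 * (m1 - m2) ** 2 / v              # mean separation term
        dist += 0.5 * math.log(v / math.sqrt(v1 * v2))  # variance mismatch term
    return dist


def agglomerate(models, linkage=max):
    """Bottom-up clustering; returns a list of (cluster_a, cluster_b, distance).

    `models` maps a label to a (mean, variance) pair.
    `linkage=max` gives complete linkage (an assumption of this sketch).
    """
    names = list(models)
    # Pairwise distances between the individual accent-dependent models.
    d = {(a, b): bhattacharyya(*models[a], *models[b])
         for a in names for b in names if a != b}
    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dij = linkage(d[(a, b)]
                              for a in clusters[i] for b in clusters[j])
                if best is None or dij < best[0]:
                    best = (dij, i, j)
        dij, i, j = best
        merges.append((clusters[i], clusters[j], dij))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical toy models for one phoneme in three accent regions:
# label -> (mean vector, variance vector). Values are invented.
toy_models = {
    "A": ([0.0, 0.0], [1.0, 1.0]),
    "B": ([0.1, 0.0], [1.0, 1.0]),
    "C": ([3.0, 3.0], [1.0, 1.0]),
}
merges = agglomerate(toy_models)
for left, right, dist in merges:
    print(left, right, round(dist, 5))
```

In this toy setup the two nearly identical regions merge first and the distant one joins last. The distance at which each merge occurs is the kind of quantity that could guide how many separate models a phoneme needs across accent areas.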


Bibliographic reference. Salvi, Giampiero (2003): "Using accent information in ASR models for Swedish", in EUROSPEECH-2003, 2677-2680.