15th Annual Conference of the International Speech Communication Association

September 14-18, 2014

On Recognition of Non-Native Speech Using Probabilistic Lexical Model

Marzieh Razavi, Mathew Magimai Doss

Idiap Research Institute, Switzerland

Despite various advances in automatic speech recognition (ASR) technology, recognition of speech uttered by non-native speakers remains a challenging problem. In this paper, we investigate how factors such as the type of lexical model and the choice of acoustic units affect the recognition of non-native speech. More precisely, we investigate the influence of the probabilistic lexical model in the Kullback-Leibler divergence based hidden Markov model (KL-HMM) framework on handling pronunciation variability, by comparing it against the hybrid HMM/artificial neural network (ANN) approach, where the lexical model is deterministic. Moreover, we study the effect of the acoustic units (context-independent or clustered context-dependent phones) on ASR performance in both the KL-HMM and hybrid HMM/ANN frameworks. Our experimental studies on the French part of the bilingual MediaParl corpus indicate that the probabilistic lexical modeling approach in the KL-HMM framework can effectively capture the pronunciation variations present in non-native speech. In particular, the experimental results show that a KL-HMM system using context-dependent acoustic units and trained solely on native speech data can lead to better ASR performance than adaptation techniques such as maximum likelihood linear regression.
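
The contrast between a deterministic and a probabilistic lexical model can be illustrated with a minimal sketch. It assumes that each HMM state holds a categorical distribution y over the acoustic units and that the local score for a frame is the divergence KL(y || z_t) between y and the ANN posterior vector z_t. The abstract does not state which direction of the divergence the authors use, so this choice, the function name kl_divergence, and all numbers below are illustrative assumptions rather than the paper's configuration. With a one-hot y the score reduces to the negative log posterior of a single unit, which corresponds to the deterministic case.

```python
import math


def kl_divergence(y, z, eps=1e-12):
    """KL(y || z) between two categorical distributions over the same units.

    Assumption: this direction of the divergence is chosen for illustration only.
    Terms with y_i == 0 contribute nothing and are skipped.
    """
    return sum(
        yi * math.log((yi + eps) / (zi + eps))
        for yi, zi in zip(y, z)
        if yi > 0.0
    )


# Hypothetical ANN posterior over three acoustic units for one frame,
# e.g. a non-native speaker realizing unit 0 closer to unit 1.
z_t = [0.2, 0.7, 0.1]

# Deterministic lexical model: the state maps to exactly one unit (one-hot).
y_det = [1.0, 0.0, 0.0]

# Probabilistic lexical model: the state spreads mass over several units
# (illustrative values, not learned parameters).
y_prob = [0.4, 0.55, 0.05]

score_det = kl_divergence(y_det, z_t)    # equals -log(z_t[0])
score_prob = kl_divergence(y_prob, z_t)

print(f"deterministic state score: {score_det:.3f}")
print(f"probabilistic state score: {score_prob:.3f}")
```

In this toy case the probabilistic state incurs a lower local cost on the shifted pronunciation than the one-hot state, which is the intuition behind letting the lexical model absorb pronunciation variation.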


Bibliographic reference. Razavi, Marzieh / Doss, Mathew Magimai (2014): "On recognition of non-native speech using probabilistic lexical model", In INTERSPEECH-2014, 26-30.