11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Augmented Context Features for Arabic Speech Recognition

Ahmad Emami, Hong-Kwang J. Kuo, Imed Zitouni, Lidia Mangu

IBM T.J. Watson Research Center, USA

We investigate different types of features for language modeling in Arabic automatic speech recognition. While much effort in language modeling research has been directed at designing better models or smoothing techniques for n-gram language models, in this paper we instead augment the context of the n-gram model with different sources of information. We start by adding word class labels to the context. The word classes are derived automatically from unannotated training data. As a contrast, we also experiment with part-of-speech (POS) tags, which require a tagger trained on annotated data. A combination of these two methods uses class labels defined on word and POS tag pairs. Other context features include super-tags derived from the syntactic tree structure, as well as semantic features derived from PropBank. Experiments on the DARPA GALE Arabic speech recognition task show that augmented context features often improve both perplexity and word error rate.
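To illustrate the idea of augmenting an n-gram context with class labels, here is a minimal toy sketch. It does not reproduce the model used in the paper: the word-to-class map `WORD_CLASS`, the function names, the tiny English corpus, and the fixed interpolation weight `lam` are all hypothetical, and simple linear interpolation of relative frequencies is swapped in for whatever estimator the authors used. It only shows how a class label on the previous word can give non-zero probability to a word pair never seen in training.

```python
from collections import Counter, defaultdict

# Hypothetical word classes; in the abstract's setting these would be
# derived automatically from unannotated training data.
WORD_CLASS = {"the": "C1", "a": "C1", "cat": "C2", "dog": "C2",
              "sleeps": "C3", "runs": "C3"}


def augmented_contexts(tokens, word_class):
    """Yield ((prev_word, prev_class), current_word) pairs for one sentence."""
    padded = ["<s>"] + list(tokens)
    for prev, cur in zip(padded, padded[1:]):
        yield (prev, word_class.get(prev, "<unk>")), cur


def train(sentences, word_class):
    """Count next-word occurrences per previous word and per previous class."""
    by_word = defaultdict(Counter)
    by_class = defaultdict(Counter)
    for sent in sentences:
        for (prev, cls), cur in augmented_contexts(sent, word_class):
            by_word[prev][cur] += 1
            by_class[cls][cur] += 1
    return by_word, by_class


def prob(cur, prev, by_word, by_class, word_class, lam=0.5):
    """Interpolate P(cur | prev word) with P(cur | class of prev word)."""
    def rel(counter):
        total = sum(counter.values())
        return counter[cur] / total if total else 0.0

    cls = word_class.get(prev, "<unk>")
    word_part = rel(by_word.get(prev, Counter()))
    class_part = rel(by_class.get(cls, Counter()))
    return lam * word_part + (1 - lam) * class_part


if __name__ == "__main__":
    corpus = [["the", "cat", "sleeps"], ["a", "dog", "runs"]]
    by_word, by_class = train(corpus, WORD_CLASS)
    # "the dog" never occurs in the corpus, but "the" shares class C1 with "a".
    print(prob("dog", "the", by_word, by_class, WORD_CLASS))
```

In this toy corpus the word-only estimate for "dog" after "the" is zero, while the class-augmented estimate is not, because "a" and "the" share a class and "a dog" was observed.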


Bibliographic reference. Emami, Ahmad / Kuo, Hong-Kwang J. / Zitouni, Imed / Mangu, Lidia (2010): "Augmented context features for Arabic speech recognition", in INTERSPEECH-2010, 1832-1835.