ISCA Workshop on Multilingual Speech and Language Processing (MULTILING 2006)

Center for Language and Speech Technology, Stellenbosch University, Stellenbosch, South Africa
April 9-11, 2006

Detection of Non-Native Named Entities Using Prosodic Features for Improved Speech Recognition and Translation

Vivek Rangarajan, Shrikanth Narayanan

Speech Analysis and Interpretation Lab, Department of Electrical Engineering, University of Southern California Viterbi School of Engineering, California, USA

In this work, we describe the use of acoustic-prosodic features to detect and localize non-native named entities spoken by a native speaker in the target language (English) for the purpose of improved speech recognition and translation. The exaggerated variation in accent and duration that speakers introduce for non-native names is exploited in the detection process through prosodic features such as f0 excursions, durational variations, and pause information. First, we validate the use of prosodic features in classifying non-native named entities (person names in Chinese, Japanese, Russian, Spanish, Italian, Persian, Indian) in the first mention spoken by native English speakers. We set up the problem as a binary classification task between the non-native named entities and other content words spoken by the speakers in their native language. Results based on a Support Vector Machine (SVM) classifier indicate an 80% classification accuracy for such events. Second, we use the prosody-based SVM classifier to detect and localize named entities at the output of an Automatic Speech Recognizer (ASR).
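
The binary classification setup described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the three-dimensional feature vector (f0 range, word duration, preceding pause), the division of f0 range by 100 to keep the features on comparable scales, the hand-made toy measurements, and the choice of a linear SVM trained by plain subgradient descent on the regularized hinge loss. The abstract does not state the kernel, the exact feature set, or the training procedure the authors used.

```python
import random

def prosodic_features(f0_hz, duration_s, pause_before_s):
    """Map one word's measurements to a feature vector (hypothetical feature set).

    f0_hz is a list of per-frame pitch values in Hz; 0 marks an unvoiced frame.
    """
    voiced = [f for f in f0_hz if f > 0]
    f0_range = (max(voiced) - min(voiced)) if voiced else 0.0
    # Dividing by 100 is an illustrative scaling choice, not from the paper.
    return [f0_range / 100.0, duration_s, pause_before_s]

def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=2000, seed=0):
    """Subgradient descent on the L2-regularized hinge loss; labels are -1 or +1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # The regularizer always shrinks the weights a little.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Examples inside the margin also pull the boundary toward them.
            if y[i] * score < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def predict(w, b, x):
    """Return +1 (non-native named entity) or -1 (other content word)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Hand-made toy measurements (not data from the paper):
# (f0 contour in Hz, word duration in s, preceding pause in s, label).
toy_words = [
    ([120, 180, 210, 0, 130], 0.70, 0.30, 1),
    ([110, 200, 230, 150], 0.85, 0.25, 1),
    ([130, 170, 215], 0.60, 0.40, 1),
    ([100, 190, 205, 0], 0.90, 0.20, 1),
    ([120, 135, 140], 0.30, 0.00, -1),
    ([115, 0, 150], 0.25, 0.05, -1),
    ([125, 150, 165], 0.40, 0.02, -1),
    ([130, 140, 155], 0.20, 0.00, -1),
]

X = [prosodic_features(f0, dur, pause) for f0, dur, pause, _ in toy_words]
y = [label for _, _, _, label in toy_words]
w, b = train_linear_svm(X, y)

# Classify two unseen toy words.
held_out_name = prosodic_features([110, 215], 0.775, 0.275)
held_out_word = prosodic_features([120, 150], 0.35, 0.01)
print(predict(w, b, held_out_name), predict(w, b, held_out_word))
```

In this sketch the toy classes are cleanly separable, which real prosodic measurements would not be; the reported 80% accuracy refers to the authors' own data and classifier, not to this example.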


Bibliographic reference. Rangarajan, Vivek / Narayanan, Shrikanth (2006): "Detection of non-native named entities using prosodic features for improved speech recognition and translation", in MULTILING-2006, paper 003.