STiLL - Speech Technology in Language Learning

May 25-27, 1998
Marholmen, Sweden

Is Automatic Speech Recognition Ready for Non-Native Speech? A Data Collection Effort and Initial Experiments in Modeling Conversational Hispanic English

William Byrne (1), Eva Knodt (2), Sanjeev Khudanpur (1), Jared Bernstein (3)

(1) Center for Language and Speech Processing, The Johns Hopkins University, Baltimore, MD, USA
(2) Entropic Research Laboratory, Inc., Menlo Park, CA, USA
(3) Ordinate Corporation, Menlo Park, CA, USA

We describe the protocol used for collecting a corpus of conversational English speech from non-native speakers at several levels of proficiency, and report the results of preliminary automatic speech recognition (ASR) experiments on this corpus using HTK-based ASR systems. The speech corpus contains both read and conversational speech recorded simultaneously on wide-band and telephone channels, and has detailed time-aligned transcriptions. The immediate goal of the ASR experiments is to assess the difficulty of the ASR problem in language learning exercises and thus to gauge how current ASR technology may be used in conversational computer-assisted language learning (CALL) systems. The long-term goal of this research, of which the data collection and experiments are a first step, is to incorporate ASR into computer-based conversational language instruction systems.

Bibliographic reference. Byrne, William / Knodt, Eva / Khudanpur, Sanjeev / Bernstein, Jared (1998): "Is automatic speech recognition ready for non-native speech? A data collection effort and initial experiments in modeling conversational Hispanic English", in STiLL-1998, 37-40.