INTERSPEECH 2006 - ICSLP
This paper proposes a methodology to automatically generate statistical language models (SLMs) for the recognition of utterances in Interactive Voice Response (IVR) systems. The aim is to create an SLM for each IVR prompt with a minimum of human intervention and prior knowledge of the user utterances expected at that prompt. SLMs are generated automatically from a combination of prefiller patterns based on spontaneous speech utterances, content word extraction based on WordNet and Roget's thesaurus, and statistical validation based on the World Wide Web. The generated SLMs not only reduce the manual labor involved in IVR application development but also aim to minimize the Word Error Rate (WER) and the Semantic Error Rate (SemER) of the ASR transcriptions. We use a WordNet lexical chain based semantic categorizer to classify ASR transcriptions into semantic categories representing each IVR prompt.
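For readers unfamiliar with the WER metric named above, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and an ASR hypothesis, divided by the number of reference words. The function name, the whitespace tokenization, and the example utterances are illustrative assumptions and are not taken from the paper; the paper's own scoring setup may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, which is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up utterances, one dropped word out of six reference words.
    print(word_error_rate("i want to check my balance",
                          "i want check my balance"))
```

Because insertions are counted, WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.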
Bibliographic reference. Balakrishna, Mithun / Cerovic, Cyril / Moldovan, Dan / Cave, Ellis (2006): "Automatic generation of statistical language models for interactive voice response applications", In INTERSPEECH-2006, paper 1648-Wed2CaP.12.