In several languages, words can be combined into compound words. In current speech recognition systems, compound words are treated as additional single words. This creates redundancies in the phonetic word models that have to be stored and searched during recognition. Moreover, it weakens the word and n-gram frequency estimates in language models. This paper describes a novel approach to speech recognition with vocabularies that contain only the component words of compounds. A compound word is recognized via a dedicated accessory language model that evaluates compound word hypotheses only. In this way, very large vocabularies (more than 100,000 words) can be handled efficiently. In preliminary recognition tests, the model performed well.
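To make the idea concrete, the following is a minimal sketch of a component-only vocabulary paired with a separate compound scorer. It is not the model from the paper, whose details the abstract does not give. The counts, the add-alpha smoothing, the greedy pairwise merge, and the names `PAIR_COUNTS`, `COMPOUND_COUNTS`, `compound_prob`, and `merge_compounds` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Toy counts, invented for illustration only (not data from the paper).
# PAIR_COUNTS[(a, b)]: times component a was immediately followed by component b.
# COMPOUND_COUNTS[(a, b)]: how many of those occurrences formed one compound.
PAIR_COUNTS: Dict[Tuple[str, str], int] = {("haus", "boot"): 10, ("das", "haus"): 40}
COMPOUND_COUNTS: Dict[Tuple[str, str], int] = {("haus", "boot"): 9}


def compound_prob(first: str, second: str, alpha: float = 1.0) -> float:
    """Add-alpha smoothed estimate of P(compound | first, second)."""
    joined = COMPOUND_COUNTS.get((first, second), 0)
    total = PAIR_COUNTS.get((first, second), 0)
    return (joined + alpha) / (total + 2.0 * alpha)


def merge_compounds(words: List[str], threshold: float = 0.5) -> List[str]:
    """Greedily join adjacent components whose compound score exceeds threshold."""
    out: List[str] = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and compound_prob(words[i], words[i + 1]) > threshold:
            # Accept the compound hypothesis and consume both components.
            out.append(words[i] + words[i + 1])
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out


if __name__ == "__main__":
    print(merge_compounds(["das", "haus", "boot"]))
```

In this sketch the recognizer's vocabulary would hold only `das`, `haus`, and `boot`, and the compound `hausboot` is produced by the separate scorer rather than stored as its own entry. The sketch handles two-component compounds only; longer compounds would need a different search strategy.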
Bibliographic reference. Spies, Marcus (1995): "A language model for compound words in speech recognition", In EUROSPEECH-1995, 1767-1770.