This paper presents a new two-pass algorithm for Extra Large (more than 1M words) Vocabulary Continuous Speech recognition based on Information Retrieval (ELVIRCOS). The approach decomposes the recognition process into two passes: the first pass uses an information retrieval procedure to build the word subset on which the second pass performs recognition. Word graph composition for continuous speech is also presented. With this approach, high performance can be obtained for large vocabulary speech recognition.
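To illustrate the two-pass idea, the following is a minimal sketch of how a first pass might select a word subset by information retrieval. It is not the algorithm of the paper. The abstract does not state what the retrieval terms are, so the sketch assumes the first pass yields a phoneme sequence and that words are indexed by phoneme trigrams. The names `LEXICON`, `build_index`, and `retrieve_subset`, the toy pronunciations, and the overlap-count score are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy lexicon (word -> phoneme list); a real system would
# hold more than 1M entries.
LEXICON = {
    "speech": ["s", "p", "iy", "ch"],
    "speed": ["s", "p", "iy", "d"],
    "peach": ["p", "iy", "ch"],
    "reach": ["r", "iy", "ch"],
    "cat": ["k", "ae", "t"],
}


def ngrams(phones, n=3):
    """Return all contiguous phoneme n-grams of a sequence."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def build_index(lexicon, n=3):
    """Inverted index: phoneme n-gram -> set of words containing it.

    Words shorter than n phonemes produce no n-grams and are not indexed
    in this simplified sketch.
    """
    index = defaultdict(set)
    for word, phones in lexicon.items():
        for gram in ngrams(phones, n):
            index[gram].add(word)
    return index


def retrieve_subset(index, first_pass_phones, top_k=3, n=3):
    """Rank words by how many query n-grams they share (assumed score)."""
    scores = defaultdict(int)
    for gram in set(ngrams(first_pass_phones, n)):
        for word in index.get(gram, ()):
            scores[word] += 1
    # Sort by descending score, then alphabetically, for a stable order.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


index = build_index(LEXICON)
# Assumed output of a first-pass phoneme recognizer for one utterance.
subset = retrieve_subset(index, ["s", "p", "iy", "ch"])
print(subset)
```

In this sketch the second pass would run the recognizer with only the words in `subset` as its active vocabulary, so the cost of that pass depends on the subset size rather than on the full vocabulary.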
Bibliographic reference. Pylypenko, Valeriy (2007): "Extra large vocabulary continuous speech recognition algorithm based on information retrieval", In INTERSPEECH-2007, 1461-1464.