ISCA Archive Interspeech 2006

Robust phone lattice decoding

Kris Demuynck, Dirk Van Compernolle, Hugo Van hamme

Most ASR systems adopt an all-in-one approach: the acoustic model, lexicon and language model are applied simultaneously, forming a single large search space. In this way, both the lexicon and the language model constrain the search at an early stage, which greatly improves its efficiency. However, such close integration comes at a cost: all resources must be kept simple. Achieving higher accuracy in unconstrained LVCSR tasks will require more complex resources, while at the same time the ‘unconstrainedness’ of the task reduces the effectiveness of the all-in-one approach. Therefore, we propose a modular two-layered architecture. First, a purely acoustic-phonemic search generates a dense phone network. Next, a robust decoder finds those words from the lexicon that match well with the phone sequences encoded in the phone network. In this paper we investigate the properties the robust word decoder must have, and we propose an efficient search algorithm.
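
To make the second layer more concrete, the following is a minimal illustrative sketch and not the search algorithm proposed in the paper. It assumes a toy phone network stored as a list of edges with acoustic costs, a three-word lexicon, and fixed penalties for substitutions, insertions and deletions; all names and values (`EDGES`, `LEXICON`, `SUB`, `INS`, `DEL`, `match_cost`, `rank_words`) are hypothetical. A simple dynamic program finds, for each word, the cheapest alignment of its pronunciation with any path through the network.

```python
from math import inf

# Hypothetical toy phone network: (src_node, dst_node, phone, acoustic_cost).
# Assumption: nodes are numbered in topological order (src < dst).
EDGES = [
    (0, 1, "k", 0.1), (0, 1, "g", 0.9),
    (1, 2, "ae", 0.2), (1, 2, "eh", 0.6),
    (2, 3, "t", 0.3), (2, 3, "d", 0.5),
]

# Hypothetical lexicon: word -> phone sequence.
LEXICON = {
    "cat": ("k", "ae", "t"),
    "get": ("g", "eh", "t"),
    "cab": ("k", "ae", "b"),
}

# Assumed penalties for phone-level errors (illustrative values only).
SUB, INS, DEL = 1.0, 1.0, 1.0


def match_cost(pron, edges, start, end):
    """Cheapest alignment of `pron` with any path from `start` to `end`."""
    n_nodes = max(dst for _, dst, _, _ in edges) + 1
    outgoing = {}
    for edge in edges:
        outgoing.setdefault(edge[0], []).append(edge)

    # best[node][j]: lowest cost of reaching `node` having consumed pron[:j].
    best = [[inf] * (len(pron) + 1) for _ in range(n_nodes)]
    best[start][0] = 0.0

    for node in range(n_nodes):  # topological order by assumption
        row = best[node]
        # Deletion: a word phone that has no counterpart in the network.
        for j in range(len(pron)):
            row[j + 1] = min(row[j + 1], row[j] + DEL)
        for _, dst, phone, cost in outgoing.get(node, []):
            for j in range(len(pron) + 1):
                if row[j] == inf:
                    continue
                # Insertion: a network phone that is not part of the word.
                best[dst][j] = min(best[dst][j], row[j] + cost + INS)
                if j < len(pron):
                    # Exact match, or substitution with a penalty.
                    penalty = 0.0 if phone == pron[j] else SUB
                    best[dst][j + 1] = min(
                        best[dst][j + 1], row[j] + cost + penalty
                    )
    return best[end][len(pron)]


def rank_words(lexicon, edges, start, end):
    """Return (cost, word) pairs sorted from best to worst match."""
    return sorted(
        (match_cost(pron, edges, start, end), word)
        for word, pron in lexicon.items()
    )
```

With these toy values, `rank_words(LEXICON, EDGES, 0, 3)` places "cat" first, while "cab" still receives a finite cost through a substitution on its last phone. This is the sense in which such a matcher is tolerant of errors in the phone network; a practical decoder would additionally need to search over word boundaries and a far larger lexicon.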

