Speech Recognition and Intrinsic Variation (SRIV2006), Toulouse, France
Almost all current automatic speech recognition (ASR) systems use a similar paradigm, which will be referred to here briefly as the ‘invariant approach’. Despite intensive research, ASR performance is still at least an order of magnitude lower than that of human speech recognition (HSR). The difficulties encountered in improving ASR performance, in combination with the awareness that current ASR systems have some shortcomings, have led many to believe that a new paradigm for ASR is needed. In this paper a novel paradigm for ASR is presented.
The invariant approach has also dominated (psycho-)linguistics. However, recent findings that indexical and detailed (sub-phonemic) information influences lexical access have started a debate in (psycho-)linguistics on how these findings could be incorporated into HSR theories and models. On the basis of these findings, episodic theories have been proposed. Although the episodic speech recognition (ESR) model is mainly inspired by HSR research, it is also very interesting and promising for ASR, since it has the potential to resolve some shortcomings of the mainstream ASR approach.
Bibliographic reference. Strik, Helmer (2006): "How to handle pronunciation variation in ASR: by storing episodes in memory?", In SRIV-2006, 33-38.