In this paper we explore the feasibility of the Memory-Prediction Theory, implemented in the form of a Hierarchical Temporal Memory (HTM), for automatic speech recognition. Up to now, HTMs have been applied almost exclusively to image processing. However, the underlying theory can also serve as an approach to the active perception of audio signals. Using the software platform under development by Numenta, we implemented a system for isolated digit recognition, the speech recognition task that can most easily be cast in a form similar to image recognition. Our results show that the HTM approach holds promise for speech recognition. At the same time, it is clear that the present implementation is not ideally suited to processing signals that encode information mainly in dynamic changes.
Bibliographic reference. Doremalen, Joost van / Boves, Lou (2008): "Spoken digit recognition using a hierarchical temporal memory", in INTERSPEECH-2008, 2566-2569.