A 50-Year Retrospective on Speech and Language Processing

John Makhoul

This talk is a retrospective on speech and language processing as witnessed by the speaker over the last 50 years. From exploratory scientific beginnings that emphasized discovering how humans produce and perceive speech to today’s plethora of applications using our technology, our field has witnessed explosive growth. The talk will review the historical development of our community and some of the key technical ideas that have shaped our field. Some of these ideas were influenced by developments in other fields, while some developments in our field have been instrumental in key advances in other fields, such as optical character recognition and machine translation. Important developments include the source-filter model, digital signal processing, linear prediction, vector quantization, deep neural networks, and statistical modeling methods, especially hidden Markov models (HMMs), with primary applications to speech analysis, synthesis, coding, and recognition. The talk will be sprinkled with lessons learned about the importance of various factors in performing our research, and peppered with interesting tidbits about key moments in the development of our technology. The talk will end with a brief prospective peek at the next 50 years.

