When research on automatic speech recognition started, the statistical
(or data-driven) approach was associated with methods like the Bayes decision
rule, hidden Markov models, Gaussian models and the expectation-maximization
algorithm. Later extensions included discriminative training and hybrid
hidden Markov models using multi-layer perceptrons and recurrent neural
networks. Some of the methods originally developed for speech recognition
turned out to be seminal for other language processing tasks like machine
translation, handwritten character recognition and sign language processing.
Today’s research on speech and language processing is dominated
by deep learning, which is typically identified with methods like attention
modelling, sequence-to-sequence processing and end-to-end processing.
In this talk, I will present my personal view of the historical
development of research on speech and language processing. I will
put particular emphasis on the framework of the Bayes decision rule and
on the question of how the various approaches that were developed fit
into this framework.
Cite as: Ney, H. (2021) Forty Years of Speech and Language Processing: From Bayes Decision Rule to Deep Learning. Proc. Interspeech 2021
@inproceedings{ney21_interspeech,
  author={Hermann Ney},
  title={{Forty Years of Speech and Language Processing: From Bayes Decision Rule to Deep Learning}},
  year=2021,
  booktitle={Proc. Interspeech 2021}
}