Using examples from selected past work, the paper argues that stochastic and knowledge-based approaches to automatic speech recognition do not contradict each other. The frequency resolution of human hearing decreases with increasing frequency. Spectral bases designed for optimal discrimination among the phonemes of speech have a similar property. Further, human hearing is most sensitive to modulations with frequencies around 4 Hz. Filters on feature trajectories, designed for optimal discrimination among the phonemes of speech, are bandpass with a center frequency around 4 Hz.
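As a rough illustration of what a bandpass filter on a feature trajectory looks like, the sketch below builds a small FIR filter centered near 4 Hz and applies it to one feature's time series. It is not the data-derived discriminant filter the paper discusses. The frame rate of 100 Hz, the 51-tap length, the Hann-windowed cosine design, and all function names are assumptions made only for this example.

```python
import cmath
import math

FRAME_RATE_HZ = 100.0   # assumed: one feature vector every 10 ms
CENTER_HZ = 4.0         # modulation frequency named in the abstract
NUM_TAPS = 51           # assumed: roughly 0.5 s of temporal context


def bandpass_taps(num_taps=NUM_TAPS, center_hz=CENTER_HZ, rate_hz=FRAME_RATE_HZ):
    """Hann-windowed cosine with its mean removed, so the DC gain is zero."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(window * math.cos(2 * math.pi * center_hz * (n - mid) / rate_hz))
    mean = sum(taps) / num_taps
    return [t - mean for t in taps]


def gain(taps, freq_hz, rate_hz=FRAME_RATE_HZ):
    """Magnitude of the filter's frequency response at freq_hz."""
    return abs(sum(t * cmath.exp(-2j * math.pi * freq_hz * n / rate_hz)
                   for n, t in enumerate(taps)))


def filter_trajectory(trajectory, taps):
    """Convolve one feature trajectory with the taps (fully overlapping part only)."""
    k = len(taps)
    return [sum(taps[j] * trajectory[i + k - 1 - j] for j in range(k))
            for i in range(len(trajectory) - k + 1)]
```

With these assumed settings, `gain(bandpass_taps(), 4.0)` is much larger than the gain at 0 Hz or 20 Hz, so slow drifts and fast frame-to-frame fluctuations in a trajectory are suppressed while modulations near 4 Hz pass through.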
Cite as: Hermansky, H. (2004) Stochastic techniques in deriving perceptual knowledge. Proc. ITRW on Statistical and Perceptual Audio Processing (SAPA 2004), paper 136
@inproceedings{hermansky04_sapa, author={Hynek Hermansky}, title={{Stochastic techniques in deriving perceptual knowledge}}, year=2004, booktitle={Proc. ITRW on Statistical and Perceptual Audio Processing (SAPA 2004)}, pages={paper 136} }