Workshop on the Auditory Basis of Speech Perception

Keele University, UK
July 15-19, 1996

Prediction-Driven Computational Auditory Scene Analysis for Dense Sound Mixtures

Daniel P.W. Ellis

International Computer Science Institute, Berkeley, CA, U.S.A.

We interpret the sound reaching our ears as the combined effect of independent, sound-producing entities in the external world; hearing would have limited usefulness if it were defeated by overlapping sounds. Computer systems that are to interpret real-world sounds - for speech recognition or for multimedia indexing - must similarly interpret complex mixtures. However, existing functional models of audition employ only data-driven processing, which is incapable of making context-dependent inferences in the face of interference. We propose a prediction-driven approach to this problem, raising numerous issues including the need to represent any kind of sound and to handle multiple competing hypotheses. Results from an implementation of this approach illustrate its ability to analyze complex, ambient sound scenes that would confound previous systems.

Bibliographic reference.  Ellis, Daniel P.W. (1996): "Prediction-driven computational auditory scene analysis for dense sound mixtures", In ABSP-1996, 198-203.