Sixth European Conference on Speech Communication and Technology
A sound classification algorithm is presented which estimates the signal-to-noise ratio between speech and noise in 15 different frequency channels. The algorithm is based on the extraction of spectro-temporal features from the acoustical waveform. The approach is motivated by neurophysiological findings on periodicity coding in the auditory system of mammals. The extracted feature patterns are called Amplitude Modulation Spectrograms (AMS), as each AMS pattern contains information on both center frequencies and amplitude modulations in a short segment (32 ms) of the input signal. An artificial neural network is trained on a large set of AMS patterns from mixtures of speech and noise and is then used to predict the narrow-band signal-to-noise ratio of "unknown" sounds.
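To make the idea of a center-frequency by modulation-frequency pattern concrete, the following is a minimal sketch of one way such a pattern could be computed for a single 32 ms segment. It is not the authors' implementation. Only the 15 channels and the 32 ms segment length come from the abstract. The sampling rate, frame length, hop size, linear grouping of FFT bins into channels, and the function name `ams_pattern` are illustrative assumptions.

```python
import numpy as np


def ams_pattern(segment, fs=16000, frame_len=128, hop=32, n_channels=15):
    """Sketch of a modulation pattern for one short segment.

    Assumed, not taken from the paper: fs, frame_len, hop, and the
    linear grouping of FFT bins into n_channels bands.
    Returns (pattern, mod_freqs), where pattern has shape
    (n_channels, n_mod_bins).
    """
    segment = np.asarray(segment, dtype=float)
    window = np.hanning(frame_len)

    # Stage 1: short-time magnitude spectra act as per-bin envelopes.
    n_frames = 1 + (len(segment) - frame_len) // hop
    frames = np.stack(
        [segment[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    envelopes = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, n_bins)

    # Group FFT bins (DC bin skipped) into n_channels bands.
    bands = np.array_split(np.arange(1, envelopes.shape[1]), n_channels)
    channel_env = np.stack([envelopes[:, b].sum(axis=1) for b in bands])

    # Stage 2: spectrum of each channel envelope over time gives the
    # amplitude-modulation content of that channel.
    channel_env = channel_env - channel_env.mean(axis=1, keepdims=True)
    pattern = np.abs(np.fft.rfft(channel_env * np.hanning(n_frames), axis=1))
    mod_freqs = np.fft.rfftfreq(n_frames, d=hop / fs)
    return pattern, mod_freqs


if __name__ == "__main__":
    fs = 16000
    t = np.arange(int(0.032 * fs)) / fs  # one 32 ms segment
    tone = np.sin(2 * np.pi * 1000 * t) * (1 + 0.8 * np.sin(2 * np.pi * 125 * t))
    pattern, mod_freqs = ams_pattern(tone, fs=fs)
    print(pattern.shape, mod_freqs)
```

In a system of the kind the abstract describes, many such patterns computed from speech-noise mixtures with known SNR would serve as training inputs to the neural network. The network itself is omitted here because the abstract gives no details of its architecture.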
Bibliographic reference. Tchorz, Jürgen / Kollmeier, Birger (1999): "Speech detection and SNR prediction basing on amplitude modulation pattern recognition", In EUROSPEECH'99, 2399-2402.