Auditory spectro-temporal representations of reverberant speech are investigated for blind estimation of reverberation time (RT) and for single-ended measurement of speech quality. The auditory representations are obtained from an eight-filter filterbank which is used to extract the modulation spectra from temporal envelopes of the speech signal. Gaussian mixture models (GMMs), one for each modulation channel and trained on clean speech signals, serve as reference models of normative speech behavior. Consistency measures, computed between reverberant test signals and each GMM, are mapped to an estimated RT and to an estimated quality score. Experiments show that the proposed measures outperform current state-of-the-art algorithms.
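The front end described above, in which modulation spectra are extracted from temporal envelopes and grouped into eight channels, can be illustrated with a minimal sketch. This is not the authors' implementation: the FFT-based envelope, the band-summing approach, and the names `temporal_envelope`, `modulation_band_energies`, and `BAND_EDGES_HZ` are assumptions made for illustration, and the eight log-spaced band edges between 2 and 128 Hz are hypothetical values, not taken from the paper. The GMM training and consistency-measure stages are not shown.

```python
import numpy as np

# Hypothetical edges of eight modulation bands (Hz), log-spaced; an assumption,
# not the filterbank design used in the paper.
BAND_EDGES_HZ = np.geomspace(2.0, 128.0, 9)


def temporal_envelope(x):
    """Return the magnitude of the analytic signal of x (FFT-based Hilbert transform)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies, zero negative ones.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))


def modulation_band_energies(x, fs, edges):
    """Sum the envelope's power spectrum inside each band [edges[i], edges[i+1])."""
    env = temporal_envelope(x)
    env = env - env.mean()  # remove the DC offset so it does not dominate
    power = np.abs(np.fft.rfft(env)) ** 2
    freqs = np.fft.rfftfreq(len(env), d=1.0 / fs)
    return np.array([
        power[(freqs >= lo) & (freqs < hi)].sum()
        for lo, hi in zip(edges[:-1], edges[1:])
    ])


if __name__ == "__main__":
    # Synthetic test signal: a 1 kHz carrier amplitude-modulated at 4 Hz.
    fs = 8000
    t = np.arange(2 * fs) / fs
    x = (1.0 + 0.5 * np.cos(2 * np.pi * 4.0 * t)) * np.sin(2 * np.pi * 1000.0 * t)
    energies = modulation_band_energies(x, fs, BAND_EDGES_HZ)
    print("dominant modulation band index:", int(np.argmax(energies)))
```

In a pipeline of the kind the abstract outlines, per-band features like these would be computed on short frames and scored against reference models trained on clean speech; how the paper actually defines its features and consistency measures is not specified here.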
Bibliographic reference. Falk, Tiago H. / Yuan, Hua / Chan, Wai-Yip (2007): "Spectro-temporal processing for blind estimation of reverberation time and single-ended quality measurement of reverberant speech", In INTERSPEECH-2007, 514-517.