Sixth European Conference on Speech Communication and Technology
(EUROSPEECH'99)

Budapest, Hungary
September 5-9, 1999

Temporal Constraints on Speech Intelligibility as Deduced from Exceedingly Sparse Spectral Representations

Rosaria Silipo (1), Steven Greenberg (1), Takayuki Arai (2)

(1) International Computer Science Institute, Berkeley, CA, USA
(2) Department of Electrical and Electronics Engineering, Sophia University, Chiyoda-ku, Tokyo, Japan

A novel means of quantifying the contribution of specific spectral bands to intelligibility is described. The spectrum of spoken English sentences is partitioned into one-third octave bands ("slits"), and the contribution of each of four slits is ascertained independently and in combination with other slits distributed across the spectrum. The baseline condition (four concurrent slits) yields ca. 85% intelligibility. The current study demonstrates that intelligibility progressively declines as the two central slits (2+3) are desynchronized by between 25 and 250 ms. Beyond 250 ms, intelligibility often declines even further but then begins to increase for greater degrees of asynchrony, suggesting the presence of a perceptual processing buffer of ca. 200-300 ms in duration. The utility of the spectral slit technique is also demonstrated for estimating the contribution to intelligibility of different regions of the modulation spectrum. The mid-frequency (10-25 Hz) modulations are shown to be of particular significance for encoding speech information above 1.5 kHz. Together, these two experiments demonstrate the power and utility of using circumscribed portions of the spectrum for quantitative evaluation of the contribution made by specific spectro-temporal properties of the speech signal.
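The two operations the abstract names, cutting a one-third octave slit and delaying a slit by a fixed number of milliseconds, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the slit center frequencies, the sampling rate, or the filtering method, so the values used below (a 1000 Hz center, 16 kHz sampling) and all function names are hypothetical. Only the band-edge arithmetic and the time shift are shown; band-pass filtering is left out.

```python
# Illustrative sketch only; names and example values are assumptions,
# not taken from the paper.

def third_octave_edges(center_hz):
    """Return (low, high) edges in Hz of a one-third octave band around center_hz."""
    # The edges sit one-sixth of an octave below and above the center,
    # so high / low = 2 ** (1 / 3).
    half_width_ratio = 2.0 ** (1.0 / 6.0)
    return center_hz / half_width_ratio, center_hz * half_width_ratio


def delay_in_samples(delay_ms, sample_rate_hz):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate_hz / 1000.0)


def desynchronize(band, delay_ms, sample_rate_hz):
    """Delay a band-limited signal by prepending zeros, keeping its length."""
    n = delay_in_samples(delay_ms, sample_rate_hz)
    if n == 0:
        return list(band)
    # Samples pushed past the original length are discarded.
    return ([0.0] * n + list(band))[:len(band)]


if __name__ == "__main__":
    low, high = third_octave_edges(1000.0)   # hypothetical center frequency
    print(f"slit edges: {low:.1f} Hz to {high:.1f} Hz")
    for ms in (25, 250):                     # asynchrony range from the abstract
        print(ms, "ms =", delay_in_samples(ms, 16000), "samples at 16 kHz")
```

In a fuller version each slit would first be extracted with a band-pass filter using these edges, one or more slits would then be passed through `desynchronize`, and the slits would be summed to form the stimulus.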


Full Paper (PDF)

Acoustic Example #1
Acoustic Example #2 [0I]
Acoustic Example #3 [1I]
Acoustic Example #4 [3I]
Acoustic Example #5 [6I]
Acoustic Example #6 [I0]
Acoustic Example #7 [I1]
Acoustic Example #8 [I3]
Acoustic Example #9 [I6]
Acoustic Example #10 [II]
Acoustic Example #11 [S1]
Acoustic Example #12 [S2]
Acoustic Example #13 [S3]
Acoustic Example #14 [S4]
Acoustic Example #15 [S5]
Acoustic Example #16 [SI]

Bibliographic reference. Silipo, Rosaria / Greenberg, Steven / Arai, Takayuki (1999): "Temporal constraints on speech intelligibility as deduced from exceedingly sparse spectral representations", in EUROSPEECH'99, 2687-2690.