EUROSPEECH 2003 - INTERSPEECH 2003
In this paper, we present a novel method for detecting sound onsets and offsets, and apply it to detecting and segmenting syllables in high-speed speech by exploiting the characteristics of Mandarin. Our system detects onsets and offsets in 8 frequency bands using a two-layer integrate-and-fire neural network. Continuous speech is segmented based on the timing of these onsets and offsets, with energy used as an additional cue for locating segmentation points. To improve segmentation accuracy, we introduce three time constraints by defining three refractory periods for the neurons, which ensure that no segmented syllable is shorter than a minimum length. Although the boundaries between syllables in high-speed speech are not salient, our system can still segment individual syllables robustly and accurately.
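As a rough illustration of how a refractory period can impose a minimum spacing between detected onsets, the following is a minimal sketch of a single leaky integrate-and-fire neuron driven by rises in a band energy envelope. It is not the two-layer, 8-band network described above: the function name `lif_onsets`, the parameters `threshold`, `leak`, and `refractory`, and the choice of frame-to-frame energy rise as input are all assumptions made for this example.

```python
def lif_onsets(envelope, threshold=1.0, leak=0.5, refractory=3):
    """Return frame indices at which a leaky integrate-and-fire neuron fires.

    Hypothetical sketch: the neuron integrates positive frame-to-frame
    rises in `envelope` (one energy value per frame). After firing it
    ignores its input for `refractory` frames, so two detected onsets
    are always more than `refractory` frames apart.
    """
    potential = 0.0
    silent_until = 0  # first frame at which the neuron may integrate again
    spikes = []
    for t in range(1, len(envelope)):
        if t < silent_until:
            continue  # inside the refractory period: input is ignored
        rise = max(envelope[t] - envelope[t - 1], 0.0)
        potential = leak * potential + rise  # leaky integration
        if potential >= threshold:
            spikes.append(t)
            potential = 0.0  # reset after firing
            silent_until = t + 1 + refractory
    return spikes


# Toy envelope with energy rises at frames 2, 5, and 10.
envelope = [0, 0, 2, 2, 0, 2, 0, 0, 0, 0, 2, 2]
onsets_constrained = lif_onsets(envelope, refractory=3)
onsets_free = lif_onsets(envelope, refractory=0)
```

With the toy envelope, the rise at frame 5 falls inside the refractory period that follows the onset at frame 2, so it is suppressed when `refractory=3` and detected when `refractory=0`. Offsets could be detected analogously by feeding the neuron energy falls instead of rises; that variant is likewise an assumption, not a detail taken from the paper.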
Bibliographic reference. Ying, D.W. / Gao, W. / Wang, W.Q. (2003): "A new approach to segment and detect syllables from high-speed speech", In EUROSPEECH-2003, 765-768.