ISCA Archive ICSLP 2000

Accurate vocal event detection method based on a fixed-point analysis of mapping from time to weighted average group delay

Hideki Kawahara, Yoshinori Atake, Parham Zolfaghari

A new procedure for event detection and characterization is proposed, based on group delay and fixed-point analysis. The method detects the precise timing and spread of speech events such as vocal fold closures. A mapping from the center of a Gaussian time window to the mean time yields event locations as its fixed points. Refining these initial estimates with minimum-phase group delay functions derived from the amplitude spectra gives accurate estimates of each event's location and excitation duration. The proposed algorithm was tested on synthetic speech samples and on a natural speech database of simultaneously recorded sound waveforms and EGG signals. These tests showed that the method estimates vocal fold closure instants with a timing-error standard deviation between 60 µs and 210 µs. The algorithm is suited to real-time operation: it makes extensive use of FFTs and involves no iterative procedures. It is potentially a very powerful tool for speech diagnosis and for the construction of very high quality speech manipulation systems.
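
The fixed-point idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and it rests on several assumptions. It uses the energy centroid of the Gaussian-windowed signal as the "mean time", relying on the standard identity that the power-weighted average group delay of a segment equals its temporal energy centroid. It computes that centroid by time-domain convolution rather than FFTs. It omits the minimum-phase group delay refinement entirely. The function names `mean_time_offset` and `fixed_points` and the parameter `sigma` are hypothetical.

```python
import numpy as np


def mean_time_offset(x, fs, sigma):
    """Return d[n] = (mean time of the windowed signal) - (window center n/fs).

    Assumption: "mean time" is the energy centroid of x under a Gaussian
    window with standard deviation `sigma` seconds, truncated at 4 sigma.
    """
    x = np.asarray(x, dtype=float)
    half = int(np.ceil(4.0 * sigma * fs))
    if len(x) < 2 * half + 1:
        raise ValueError("signal must be at least as long as the window")
    offs = np.arange(-half, half + 1) / fs      # time offsets from the center
    w2 = np.exp(-(offs / sigma) ** 2)           # squared Gaussian window
    energy = x ** 2
    # Windowed energy around each center (the kernel is symmetric).
    den = np.convolve(energy, w2, mode="same")
    # Windowed first moment; the sign flip turns convolution into correlation.
    num = np.convolve(energy, -offs * w2, mode="same")
    d = np.zeros_like(den)
    ok = den > 1e-12 * den.max()                # skip centers with no energy
    np.divide(num, den, out=d, where=ok)
    return d


def fixed_points(d, fs):
    """Return times where the map center -> mean time has a stable fixed point.

    A stable fixed point is a downward zero crossing of d: the mean time lies
    after the center just before the event and before the center just after.
    """
    i = np.flatnonzero((d[:-1] > 0) & (d[1:] <= 0))
    frac = d[i] / (d[i] - d[i + 1])             # linear interpolation
    return (i + frac) / fs


# Usage: an idealized excitation made of unit impulses at known samples.
fs = 16000
x = np.zeros(1600)
x[[200, 360, 520]] = 1.0
event_times = fixed_points(mean_time_offset(x, fs, sigma=0.001), fs)
```

For this idealized impulse train the fixed points coincide with the impulse positions. On real speech the energy centroid is pulled later by the vocal tract response, which is presumably why the abstract describes a further refinement step.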


Cite as: Kawahara, H., Atake, Y., Zolfaghari, P. (2000) Accurate vocal event detection method based on a fixed-point analysis of mapping from time to weighted average group delay. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, 664-667

@inproceedings{kawahara00b_icslp,
  author={Hideki Kawahara and Yoshinori Atake and Parham Zolfaghari},
  title={{Accurate vocal event detection method based on a fixed-point analysis of mapping from time to weighted average group delay}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={4},
  pages={664--667}
}