ISCA Archive Interspeech 2006

Speaker independent voiced-unvoiced detection evaluated in different speaking styles

Martin Heckmann, Marco Moebus, Frank Joublin, Christian Goerick

We propose a new algorithm for voiced/unvoiced classification of speech on a phoneme or sample level. The algorithm is inspired by auditory-based approaches and combines two cues: one based on the energy distribution of the signal and the other on its harmonicity. To extract the harmonicity, we apply a Gammatone filterbank to the signal and calculate a histogram of the zero crossings of the filter channels. A measure similar to the variance of the zero crossings yields the harmonicity cue. The performance of the algorithm was measured on several minutes of read and spontaneous speech from various speakers. An algorithm proposed by Mustafa et al. [1] served as a benchmark. The results show that our algorithm performs significantly better on both read and spontaneous speech and, in particular, seems better able to cope with different speaking styles.
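
The idea behind the harmonicity cue can be illustrated with a minimal sketch. It is not the authors' implementation, and every detail is an assumption made for illustration: a two-pole resonator stands in for a Gammatone channel, the channel bandwidth is set to 0.2 times the centre frequency, the histogram is built over the distances between upward zero crossings, and the coefficient of variation stands in for the variance-like measure. The names `resonator`, `interval_histogram`, `irregularity` and the parameter `skip` are hypothetical.

```python
import math
import random
from collections import Counter


def resonator(signal, fc, fs, bandwidth):
    """Two-pole band-pass resonator; a crude stand-in for one Gammatone channel."""
    r = math.exp(-math.pi * bandwidth / fs)  # pole radius derived from the bandwidth
    a1 = -2.0 * r * math.cos(2.0 * math.pi * fc / fs)
    a2 = r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def interval_histogram(channel, skip):
    """Histogram of distances (in samples) between upward zero crossings.

    The first `skip` samples are ignored so that the filter start-up
    transient does not distort the statistics.
    """
    crossings = [n for n in range(max(skip, 1), len(channel))
                 if channel[n - 1] < 0.0 <= channel[n]]
    return Counter(b - a for a, b in zip(crossings, crossings[1:]))


def irregularity(signal, fs, centre_freqs, skip=1000):
    """Mean relative spread of zero-crossing intervals over all channels.

    Values near 0 indicate very regular crossings; larger values indicate
    irregular, noise-like crossings.
    """
    spreads = []
    for fc in centre_freqs:
        # Assumed bandwidth: 0.2 * fc (illustrative choice, not from the paper).
        hist = interval_histogram(resonator(signal, fc, fs, 0.2 * fc), skip)
        total = sum(hist.values())
        if total < 2:
            continue  # too few crossings to estimate a spread
        mean = sum(d * c for d, c in hist.items()) / total
        var = sum(c * (d - mean) ** 2 for d, c in hist.items()) / total
        spreads.append(math.sqrt(var) / mean)
    return sum(spreads) / len(spreads) if spreads else float("inf")
```

With a steady 200 Hz tone sampled at 8 kHz and a single channel centred at 200 Hz, `irregularity` is essentially zero, whereas white noise through the same channel gives a clearly larger value. A real system would use many channels and combine this cue with an energy-based one, as the abstract describes.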


doi: 10.21437/Interspeech.2006-465

Cite as: Heckmann, M., Moebus, M., Joublin, F., Goerick, C. (2006) Speaker independent voiced-unvoiced detection evaluated in different speaking styles. Proc. Interspeech 2006, paper 1249-Wed1FoP.5, doi: 10.21437/Interspeech.2006-465

@inproceedings{heckmann06_interspeech,
  author={Martin Heckmann and Marco Moebus and Frank Joublin and Christian Goerick},
  title={{Speaker independent voiced-unvoiced detection evaluated in different speaking styles}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1249-Wed1FoP.5},
  doi={10.21437/Interspeech.2006-465}
}