Sixth European Conference on Speech Communication and Technology
In this work, a time-scale framework for analysis of glottal closure instants is proposed. As glottal closure can be soft or sharp, depending on the type of vocal activity, the analysis method should be able to deal with both wide-band and low-pass signals. Thus, a multi-scale analysis seems well-suited. The analysis is based on a dyadic wavelet filterbank. Then, the amplitude maxima of the wavelet transform are computed, at each scale. These maxima are organized into lines of maximal amplitude (LOMA) using a dynamic programming algorithm. These lines are forming ``trees'' in the time-scale domain. Glottal closure instants are then interpreted as the top of the strongest branch, or trunk, of these trees. Interesting features of the LOMA are their amplitudes. The LOMA are strong and well organized for voiced speech, and rather weak and widespread for unvoiced speech. The accumulated amplitude along the LOMA gives a very good measure of the degree of voicing.
Full Paper (PDF) Gnu-Zipped Postscript
Bibliographic reference. Tuan, Vu Ngoc / d'Alessandro, Christophe (1999): "Robust glottal closure detection using the wavelet transform", In EUROSPEECH'99, 2805-2808.