11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Perceptual Wavelet Decomposition for Speech Segmentation

Mariusz Ziółko, Jakub Gałka, Bartosz Ziółko, Tomasz Drwięga

AGH University of Science & Technology, Krakow, Poland

A non-uniform speech segmentation method based on the wavelet packet transform is used to localise phoneme boundaries. Eleven subbands are chosen by applying the mean best basis algorithm. A perceptual scale guides the decomposition of speech with the Meyer wavelet in the wavelet packet structure. A real-valued vector representing the digital speech signal is divided into phone-like units by placing segment borders according to the result of the multiresolution analysis. The final decision on the location of each boundary is made by analysing the energy flows among the decomposition levels.
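The general idea of locating boundaries from subband energies can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several simplifying assumptions: the Haar wavelet stands in for the Meyer wavelet, a full uniform packet tree replaces the eleven perceptual subbands selected by the mean best basis algorithm, and a simple frame-to-frame energy difference replaces the analysis of energy flows among decomposition levels. All function names and parameters (`haar_step`, `wavelet_packet`, `boundary_candidates`, `rel_threshold`) are hypothetical.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet(x, levels):
    """Full wavelet packet tree: every node is split at every level.

    Returns 2**levels subbands in natural (not frequency) order.
    Assumption: Haar filters are used here purely for brevity.
    """
    nodes = [list(x)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_step(node)]
    return nodes


def boundary_candidates(x, levels=3, rel_threshold=0.5):
    """Sample positions where the subband energy distribution changes sharply.

    Each leaf coefficient covers one frame of 2**levels samples. The score
    for a frame is the summed absolute energy change over all subbands
    relative to the previous frame (a stand-in for energy-flow analysis).
    """
    energies = [[c * c for c in band] for band in wavelet_packet(x, levels)]
    n_frames = len(energies[0])
    score = [sum(abs(e[t] - e[t - 1]) for e in energies)
             for t in range(1, n_frames)]
    peak = max(score, default=0.0)
    if peak == 0.0:
        return []  # no energy change anywhere, so no boundary candidates
    frame_len = 2 ** levels
    # score[i] compares frame i + 1 with frame i, so the candidate border
    # sits at the first sample of frame i + 1.
    return [(i + 1) * frame_len
            for i, s in enumerate(score) if s >= rel_threshold * peak]


# Toy signal: a constant segment followed by a rapidly alternating one.
signal = [1.0] * 64 + [1.0, -1.0] * 32
print(boundary_candidates(signal))  # [64]
```

In the toy signal the energy moves from the lowest subband to a higher one at sample 64, which is the only position reported. Real speech would need longer filters, smoothing of the energy envelopes, and a perceptually motivated subband layout, none of which this sketch attempts.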


Bibliographic reference. Ziółko, Mariusz / Gałka, Jakub / Ziółko, Bartosz / Drwięga, Tomasz (2010): "Perceptual wavelet decomposition for speech segmentation", In INTERSPEECH-2010, 2234-2237.