A non-uniform speech segmentation method based on the wavelet packet transform is used to localise phoneme boundaries. Eleven subbands are chosen by applying the mean best basis algorithm. A perceptual scale is used to decompose speech with the Meyer wavelet in the wavelet packet structure. A real-valued vector representing the digital speech signal is decomposed into phone-like units by placing segment borders according to the result of the multiresolution analysis. The final decision on the localisation of the boundaries is made by analysing the energy flows among the decomposition levels.
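The general idea of locating boundaries from subband energies can be illustrated with a minimal sketch. It is not the method of the paper: it assumes a Haar filter in place of the Meyer wavelet, a full uniform wavelet packet tree in place of the eleven perceptually chosen subbands, and a simple frame-to-frame distance in place of the energy-flow analysis. All function names, the frame length and the threshold are hypothetical choices made for illustration.

```python
import math


def haar_step(x):
    """Split x into low-pass and high-pass halves (orthonormal Haar filter).

    Assumption: Haar is used only to keep the sketch dependency-free;
    the abstract describes a Meyer wavelet.
    """
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return low, high


def wavelet_packet(x, levels):
    """Return the 2**levels leaf subbands of a full wavelet packet tree.

    Both the low and the high branch are split at every level, which is
    what distinguishes a wavelet packet tree from a plain wavelet tree.
    Leaves are returned in natural (filter-bank) order.
    """
    bands = [list(x)]
    for _ in range(levels):
        split = []
        for band in bands:
            low, high = haar_step(band)
            split.extend([low, high])
        bands = split
    return bands


def subband_energies(x, levels):
    """Energy (sum of squared coefficients) of each leaf subband."""
    return [sum(c * c for c in band) for band in wavelet_packet(x, levels)]


def candidate_boundaries(signal, frame_len, levels, threshold):
    """Indices of frames whose subband energy distribution differs from
    the previous frame by more than `threshold` (L1 distance between
    normalised energy vectors). Hypothetical decision rule.
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    profile = [subband_energies(f, levels) for f in frames]
    boundaries = []
    for k in range(1, len(profile)):
        prev, cur = profile[k - 1], profile[k]
        total_prev = sum(prev) or 1.0  # avoid division by zero on silence
        total_cur = sum(cur) or 1.0
        dist = sum(abs(p / total_prev - c / total_cur)
                   for p, c in zip(prev, cur))
        if dist > threshold:
            boundaries.append(k)
    return boundaries


# Synthetic example: a constant segment followed by a rapidly alternating one.
signal = [1.0] * 32 + [(-1.0) ** n for n in range(32)]
print(candidate_boundaries(signal, frame_len=8, levels=2, threshold=0.5))
```

With the synthetic signal above, the energy moves from the lowest subband to a high one at the point where the two segments meet, so that frame is reported as a candidate boundary. Real speech would need a longer frame, a deeper tree and a tuned threshold.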
Bibliographic reference. Ziółko, Mariusz / Gałka, Jakub / Ziółko, Bartosz / Drwięga, Tomasz (2010): "Perceptual wavelet decomposition for speech segmentation", In INTERSPEECH-2010, 2234-2237.