Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

An Automatic Algorithm for Segmenting and Labelling a Connected Digit Sequence

V. Kamakshi Prasad, Hema A. Murthy

Department of Computer Science and Engineering, Indian Institute of Technology, Madras, India

Group delay functions provide an alternative representation of signal information. Their main features are the additive and high-resolution properties. The Fourier transform (FT) phase is generally featureless due to random polarity and wrapping. The group delay function, however, which is defined as the negative derivative of the phase, can be processed to derive significant information such as the peaks and valleys of the spectral envelope. In this paper, we apply the group delay function to the segmentation problem in speech. In the proposed method, a new signal is generated by symmetrising the short-term energy function. The minimum-phase group delay function of this signal is then computed; its valleys correspond to segment boundaries. The proposed technique was tested on manually segmented digit utterances from the TI-DIGITS database. The overall correct segmentation performance is 77.8%. Digit-wise recognition performance on the correctly segmented database is 87.1%.
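
The processing chain described above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The abstract does not say how the minimum-phase signal is obtained; the sketch assumes it is the causal part of the inverse DFT of the symmetrised energy contour. The function names and the parameters frame_len, hop and keep are hypothetical and chosen for illustration. The group delay is computed without phase unwrapping, using the standard identity based on the transforms of x[n] and n*x[n].

```python
import numpy as np

def short_term_energy(x, frame_len, hop):
    """Sum of squared samples per frame (frame_len and hop are assumed values)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.array([np.sum(x[i * hop : i * hop + frame_len] ** 2)
                     for i in range(n_frames)])

def symmetrise(e):
    """Mirror the contour so it resembles an even, positive 'magnitude spectrum'."""
    return np.concatenate([e, e[::-1]])

def group_delay(x, n_fft):
    """Negative derivative of the FT phase, computed without phase unwrapping.

    Uses tau = (X_R * Y_R + X_I * Y_I) / |X|^2, where Y is the FT of n * x[n].
    """
    n = np.arange(len(x))
    X = np.fft.fft(x, n_fft)
    Y = np.fft.fft(n * x, n_fft)
    power = np.maximum(np.abs(X) ** 2, 1e-12)  # guard against division by zero
    return (X.real * Y.real + X.imag * Y.imag) / power

def segment_boundaries(energy, keep):
    """Return the group delay contour and the indices of its local minima.

    Assumption: the causal part (first `keep` samples) of the inverse DFT of
    the symmetrised energy is used as the minimum-phase signal.
    """
    sym = symmetrise(energy + 1e-12)       # keep the contour strictly positive
    causal = np.fft.ifft(sym).real[:keep]  # assumed minimum-phase signal
    gd = group_delay(causal, len(sym))[:len(energy)]
    valleys = [k for k in range(1, len(gd) - 1)
               if gd[k] < gd[k - 1] and gd[k] <= gd[k + 1]]
    return gd, valleys
```

In this sketch, keep controls how much of the causal sequence is retained and hence how smooth the group delay contour is; a suitable value would have to be tuned on data. The valley indices are frame positions and would be multiplied by hop to map them back to sample positions.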

