The research reported in this paper aims at acoustic-phonetic segmentation of speech signals for use in a continuous speech recognition system. The goal of segmentation is to transform the continuous speech signal into a discrete set of segments, each describing an acoustic event that corresponds to a homogeneous sound element. Recent research into multi-level segmentation of continuous speech has used either a neurophysiological auditory model or a Fourier transform as front-end processing. This paper describes, and presents results obtained from, a system configuration consisting of a psychoacoustic auditory model and a multi-level segmentation algorithm. Furthermore, this alternative system is modified and compared to multi-level segmentation using the original and a modified neurophysiological auditory model. All results are based on analysis of a large database of naturally spoken continuous speech.
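The core idea of acoustic segmentation, placing boundaries where the signal stops being spectrally homogeneous, can be illustrated with a minimal sketch. This is a generic spectral-change detector, not the authors' multi-level algorithm or their auditory front-ends; the frame length, FFT front-end, and threshold rule are all illustrative assumptions.

```python
import numpy as np

def segment_boundaries(signal, sr, frame_ms=10, threshold=2.0):
    """Place segment boundaries where the short-time spectrum changes sharply.

    A generic illustration of acoustic segmentation, NOT the paper's
    multi-level algorithm: frames are compared via a plain FFT front-end,
    and a boundary is declared when the frame-to-frame spectral distance
    exceeds `threshold` times the median distance.
    """
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    # Log-magnitude spectrum of each windowed frame.
    spectra = np.log1p(np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)))
    # Euclidean distance between successive frame spectra.
    dist = np.linalg.norm(np.diff(spectra, axis=0), axis=1)
    med = np.median(dist) + 1e-12
    # Boundary sample positions where the spectral change is large.
    return [(i + 1) * frame_len for i, d in enumerate(dist) if d > threshold * med]

# Toy signal: 0.2 s of a 200 Hz tone followed by 0.2 s of white noise.
sr = 8000
t = np.arange(int(0.2 * sr)) / sr
sig = np.concatenate([np.sin(2 * np.pi * 200 * t),
                      np.random.default_rng(0).normal(0.0, 1.0, int(0.2 * sr))])
bounds = segment_boundaries(sig, sr)
```

On this toy input a boundary is expected near sample 1600, the tone-to-noise transition; a real system would replace the FFT front-end with an auditory model and apply the detector at several temporal resolutions to obtain a multi-level segmentation.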
Bibliographic reference. Sorensen, Helge B. D. / Dalsgaard, Paul (1989): "Multi-level segmentation of natural continuous speech using different auditory front-ends", In EUROSPEECH-1989, 2079-2082.