First European Conference on Speech Communication and Technology

Paris, France
September 27-29, 1989

Multi-Level Segmentation of Natural Continuous Speech Using Different Auditory Front-Ends

Helge B. D. Sorensen, Paul Dalsgaard

Speech Technology Centre, Institute of Electronic Systems, University of Aalborg, Aalborg, Denmark

The research reported in this paper aims at acoustic-phonetic segmentation of speech signals for use in a continuous speech recognition system. The goal of segmentation is to transform the continuous speech signal into a discrete set of segments, each describing an acoustic event that corresponds to a homogeneous sound element. In recent years, research into multi-level segmentation of continuous speech has used either a neurophysiological auditory model or a Fourier transform as front-end processing. This paper describes, and presents results obtained from, a system configuration consisting of a psychoacoustic auditory model and a multi-level segmentation algorithm. Furthermore, this alternative system is modified and compared with multi-level segmentation using the original and a modified neurophysiological auditory model. All results are based on analysis of a large database of naturally spoken continuous speech.

