5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Context-Dependent Duration Modelling for Continuous Speech Recognition

Tan Lee (1), Rolf Carlson (2), Björn Granström (2)

(1) Department of Electronic Engineering, The Chinese University of Hong Kong, Hong
(2) Centre for Speech Technology, Royal Institute of Technology, Sweden

This paper presents a trial study of using context-dependent segmental duration for continuous speech recognition in a domain-specific application. Different modelling strategies are proposed for function words and content words. Stress status, word position in the utterance and phone position in the word are identified as the three most crucial factors affecting segmental duration in this particular application. In addition, speaking rate normalization is applied to further reduce duration variability. Experimental results show that the normalized duration models can help improve the rank of the correct sentence in the N-best hypotheses list.
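As a rough illustration of how speaking rate normalization and duration-based re-ranking of an N-best list could fit together, the following Python sketch may help. It is not the method of this paper: the Gaussian duration score, the utterance-level rate factor, the `weight` parameter, and all function and field names (`normalize_durations`, `duration_log_score`, `rerank`, `acoustic_score`, `means`, `stds`) are assumptions made for the example. The per-segment means and standard deviations stand in for whatever context-dependent models (stress status, word position, phone position) would supply them.

```python
import math


def normalize_durations(durations, expected):
    """Divide observed durations by an utterance-level rate factor.

    The rate factor is assumed to be total observed duration divided by
    total expected duration, so a uniformly slow or fast utterance is
    mapped back onto the scale of the duration models.
    """
    rate = sum(durations) / sum(expected)
    return [d / rate for d in durations]


def duration_log_score(durations, means, stds):
    """Sum of Gaussian log-likelihoods of the rate-normalized durations."""
    normalized = normalize_durations(durations, means)
    score = 0.0
    for d, mu, sigma in zip(normalized, means, stds):
        score += -0.5 * math.log(2 * math.pi * sigma ** 2)
        score -= (d - mu) ** 2 / (2 * sigma ** 2)
    return score


def rerank(hypotheses, weight=1.0):
    """Sort hypotheses by recognizer score plus weighted duration score.

    Each hypothesis is a dict with hypothetical keys: 'text',
    'acoustic_score', 'durations', 'means' and 'stds'.
    """
    def combined(h):
        dur = duration_log_score(h["durations"], h["means"], h["stds"])
        return h["acoustic_score"] + weight * dur

    return sorted(hypotheses, key=combined, reverse=True)


if __name__ == "__main__":
    # Two made-up hypotheses; durations in milliseconds (an assumption).
    nbest = [
        {"text": "hypothesis A", "acoustic_score": -10.0,
         "durations": [60, 120], "means": [100, 50], "stds": [10, 10]},
        {"text": "hypothesis B", "acoustic_score": -11.0,
         "durations": [120, 60], "means": [100, 50], "stds": [10, 10]},
    ]
    for h in rerank(nbest):
        print(h["text"])
```

In this toy case hypothesis B has the lower recognizer score, but its segment durations match the expected pattern once the slower speaking rate is factored out, so the combined score moves it ahead of hypothesis A.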

