Explicit Duration Modeling for Cantonese Connected-digit Recognition

Yu Zhu, Tan Lee

The Chinese University of Hong Kong, Hong Kong

This paper describes a study on using explicit duration models in hidden Markov model (HMM) based Cantonese connected-digit recognition. An HMM does not provide explicit control over the temporal structure of speech. As a result, the recognition output may exhibit unreasonable duration patterns, which are often accompanied by recognition errors. We propose a duration model that captures the relative duration of the tail part of a Cantonese digit, and we use it together with conventional word-level duration models. The duration models are integrated into the Viterbi search algorithm for speech recognition. Experimental results show that the proposed method leads to a substantial reduction in recognition errors, especially for slowly spoken utterances.
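As a rough illustration of how duration scores might enter the search, the sketch below adds a weighted duration log-score to a path's acoustic log-score at a word boundary. It is only a minimal sketch under stated assumptions: the Gaussian form of both duration models, the weight parameter, the model dictionary keys, and all function names are hypothetical and are not taken from this paper.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log-density of a univariate Gaussian (assumed model form)."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (x - mean) ** 2 / (2.0 * std * std)


def duration_log_score(word_frames, tail_frames, model):
    """Score one digit segment with two hypothetical duration models.

    word_frames: total number of frames assigned to the digit.
    tail_frames: number of frames assigned to its tail part.
    model: dict with assumed keys word_mean, word_std, tail_mean, tail_std.
    """
    if word_frames <= 0 or not 0 <= tail_frames <= word_frames:
        raise ValueError("invalid segment durations")
    # Word-level model: absolute duration of the whole digit, in frames.
    word_term = gaussian_log_pdf(word_frames, model["word_mean"], model["word_std"])
    # Tail model: duration of the tail relative to the whole digit.
    tail_ratio = tail_frames / word_frames
    tail_term = gaussian_log_pdf(tail_ratio, model["tail_mean"], model["tail_std"])
    return word_term + tail_term


def path_score_at_word_end(acoustic_log_score, word_frames, tail_frames, model, weight=1.0):
    """Combine the acoustic path score with a weighted duration score."""
    return acoustic_log_score + weight * duration_log_score(word_frames, tail_frames, model)


# Hypothetical parameters for a single digit, chosen only for illustration.
example_model = {"word_mean": 30.0, "word_std": 8.0, "tail_mean": 0.4, "tail_std": 0.1}
typical = path_score_at_word_end(-100.0, 30, 12, example_model)
unusual = path_score_at_word_end(-100.0, 90, 5, example_model)
```

With these illustrative parameters, a segment whose duration pattern is close to the model means keeps a higher combined score than one with an implausibly long digit and a very short tail, so the latter is more likely to lose out to competing paths during the search.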

