Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Efficient Search Strategy in Large Vocabulary Continuous Speech Recognition Using Prosodic Boundary Information

Shi-Wook Lee (1), Keikichi Hirose (2), Nobuaki Minematsu (1)

(1) Department of Information and Communication Engineering, School of Engineering;
(2) Department of Frontier Informatics, School of Frontier Sciences; University of Tokyo, Japan

Prosodic-syntactic boundaries are an information source that can improve both the efficiency and the accuracy of Large Vocabulary Continuous Speech Recognition (LVCSR). This paper presents a study of two methods for exploiting prosodic boundary information in a multi-pass decoder. We address the effect of the language model on the setting of the pruning beam width, and how to control Cross-word Context Dependent (CCD) models using prosodic boundary information. In first-pass decoding, a dynamic beam search strategy that treats inner-word and cross-word paths separately is proposed to reduce the search space efficiently; in second-pass decoding, the cross-word context dependent models are then optimized using prosodic boundary information. Recognition experiments on the Japanese Newspaper Article Sentences (JNAS) 20k-word task, carried out with a multi-pass decoder, demonstrated that the proposed method led to a significant reduction in the search space together with an improvement in accuracy.
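
The idea of pruning inner-word and cross-word paths with separate beams can be illustrated with a minimal sketch. This is not the authors' decoder: the abstract does not give the pruning rule, so the names `Hypothesis`, `select_cross_beam` and `prune`, the log-score convention, and the rule of widening the cross-word beam at a detected prosodic boundary are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """A partial path in the search (hypothetical structure)."""
    label: str
    log_score: float      # higher is better
    is_cross_word: bool   # True if the path has just crossed a word boundary


def select_cross_beam(at_prosodic_boundary: bool,
                      narrow: float, wide: float) -> float:
    """Assumed rule: allow more cross-word paths where a prosodic
    boundary was detected, and fewer elsewhere."""
    return wide if at_prosodic_boundary else narrow


def prune(hyps: List[Hypothesis], inner_beam: float,
          cross_beam: float) -> List[Hypothesis]:
    """Keep paths whose score lies within the beam of the best score,
    using a separate beam width for inner-word and cross-word paths."""
    if not hyps:
        return []
    best = max(h.log_score for h in hyps)
    kept = []
    for h in hyps:
        beam = cross_beam if h.is_cross_word else inner_beam
        if h.log_score >= best - beam:
            kept.append(h)
    return kept


hyps = [
    Hypothesis("a", -10.0, False),
    Hypothesis("b", -14.0, False),
    Hypothesis("c", -13.0, True),
    Hypothesis("d", -18.0, True),
]

# Away from a boundary the cross-word beam is narrow; at a boundary it is wide.
inside = prune(hyps, inner_beam=5.0,
               cross_beam=select_cross_beam(False, narrow=2.0, wide=8.0))
at_boundary = prune(hyps, inner_beam=5.0,
                    cross_beam=select_cross_beam(True, narrow=2.0, wide=8.0))
print([h.label for h in inside])
print([h.label for h in at_boundary])
```

With these illustrative beam widths, fewer cross-word paths survive away from a boundary than at one, which is the kind of search-space reduction the abstract refers to; the actual beam settings and their dependence on the language model are described only in the full paper.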



Bibliographic reference. Lee, Shi-Wook / Hirose, Keikichi / Minematsu, Nobuaki (2000): "Efficient search strategy in large vocabulary continuous speech recognition using prosodic boundary information", In ICSLP-2000, vol. 4, 274-277.