Second International Conference on Spoken Language Processing (ICSLP'92)

Banff, Alberta, Canada
October 13-16, 1992

A Real-Time Speaker-Independent Continuous Speech Recognition System Based on Demi-Syllable Units

Shinji Koga, Ryosuke Isotani, Satoshi Tsukada, Kazunaga Yoshida, Kaichiro Hatazaki, Takao Watanabe

C&C Information Technology Research Laboratories, NEC Corporation, Kawasaki, Japan

This paper describes a real-time, speaker-independent continuous speech recognition system. To achieve speaker-independent continuous speech recognition with real-time response, the system uses demi-syllable speech units, a bundle search algorithm, and multi-processing techniques. Demi-syllable units allow all transitions between phonemes to be represented, which improves recognition accuracy. Bundle search is a frame-synchronous technique used with Viterbi search to reduce the computation needed to search the finite state grammar network. The search process is divided into three pipelined stages: frame-level likelihood calculation, word-level search, and sentence-level network search. Each stage is further split into several sub-processes, which run in parallel on a multi-processor machine. In evaluation experiments with a 500-word vocabulary, the system achieved real-time performance, with 83.0% sentence accuracy and 95.5% word recognition accuracy.
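The three-stage pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no details of the bundle search or of how the stages are split into sub-processes, so every name below (`UNIT_MEANS`, `frame_likelihood`, `best_unit`, `run_stage`, `recognize`) is hypothetical, and each stage body is a toy stand-in. The sketch only shows how frames might flow through queue-connected stages, with one frame being scored while an earlier one is searched.

```python
import queue
import threading

DONE = object()  # end-of-stream marker passed down the pipeline

# Hypothetical 1-D "models": one mean per unit, purely for illustration.
UNIT_MEANS = {"a": 0.0, "b": 5.0}


def frame_likelihood(x):
    """Stage 1 stand-in: score one frame against every unit."""
    return {unit: -(x - mean) ** 2 for unit, mean in UNIT_MEANS.items()}


def best_unit(scores):
    """Stage 2 stand-in: pick the best-scoring unit for the frame."""
    return max(scores, key=scores.get)


def run_stage(func, inbox, outbox):
    """Apply func to each item from inbox and forward the result."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))


def recognize(frames):
    """Feed frames through the two worker stages, then merge labels."""
    q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=run_stage, args=(frame_likelihood, q1, q2)),
        threading.Thread(target=run_stage, args=(best_unit, q2, q3)),
    ]
    for w in workers:
        w.start()
    for x in frames:
        q1.put(x)
    q1.put(DONE)

    # Stage 3 stand-in: collapse runs of identical labels into a sequence.
    result = []
    while True:
        label = q3.get()
        if label is DONE:
            break
        if not result or result[-1] != label:
            result.append(label)
    for w in workers:
        w.join()
    return result
```

Calling `recognize([0.1, 0.2, 4.9, 5.1, 0.0])` labels each frame and merges repeated labels into one sequence. Threads are used here only to show the pipeline structure; in CPython they do not provide the true parallel execution that the multi-processor machine in the paper would.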

