Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
This paper describes a real-time, speaker-independent continuous speech recognition system. To achieve speaker-independent continuous speech recognition with real-time response, the system uses demi-syllable speech units, a bundle search algorithm, and multi-processing techniques. Demi-syllable units allow all transitions between phonemes to be represented, which improves recognition accuracy. Bundle search is a frame-synchronous technique used with Viterbi search to reduce the computational load of searching the finite-state grammar network. The search process is divided into three pipelined stages: frame-level likelihood calculation, word-level search, and sentence-level network search. Each of these stages is further split into several sub-processes, which run in parallel on a multi-processor machine. In evaluation experiments using a 500-word vocabulary, the system achieved real-time performance, with 83.0% sentence accuracy and 95.5% word recognition accuracy.
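The abstract does not spell out how bundle search works, so the sketch below does not attempt to reproduce it. It only illustrates the general idea the abstract builds on: a frame-synchronous Viterbi search over a finite-state network, with pruning to cut the number of hypotheses kept per frame. Plain beam pruning is swapped in here as a stand-in for the paper's bundle search. The function name `viterbi_beam`, its parameters, and the toy network are all hypothetical and chosen for illustration.

```python
import math

def viterbi_beam(frame_scores, transitions, start, finals, beam=10.0):
    """Frame-synchronous Viterbi search with beam pruning (illustrative only).

    frame_scores: list with one dict per frame, mapping state -> log-likelihood.
    transitions:  dict mapping state -> list of (next_state, log_prob).
    start:        non-emitting start state.
    finals:       set of states in which a path may end.
    beam:         hypotheses scoring more than `beam` below the best are dropped.
    Returns (best_log_score, state_path), or (-inf, []) if no path survives.
    """
    # Active hypotheses at the current frame: state -> (score, path so far).
    active = {start: (0.0, [])}
    for scores in frame_scores:
        new_active = {}
        for state, (score, path) in active.items():
            for nxt, logp in transitions.get(state, []):
                if nxt not in scores:
                    continue  # no likelihood available for this state
                cand = score + logp + scores[nxt]
                # Viterbi recombination: keep only the best path into each state.
                if nxt not in new_active or cand > new_active[nxt][0]:
                    new_active[nxt] = (cand, path + [nxt])
        if not new_active:
            return -math.inf, []
        # Beam pruning: discard hypotheses far below the best one in this frame.
        best = max(s for s, _ in new_active.values())
        active = {st: v for st, v in new_active.items() if v[0] >= best - beam}
    ending = [v for st, v in active.items() if st in finals]
    if not ending:
        return -math.inf, []
    return max(ending, key=lambda v: v[0])


# Toy network (assumed values): start -> a, a -> a, a -> b, b -> b.
transitions = {
    "start": [("a", 0.0)],
    "a": [("a", -0.5), ("b", -0.5)],
    "b": [("b", -0.1)],
}
frames = [
    {"a": -1.0, "b": -5.0},
    {"a": -1.0, "b": -2.0},
    {"a": -4.0, "b": -1.0},
]
score, path = viterbi_beam(frames, transitions, "start", {"b"}, beam=10.0)
print(score, path)
```

Because all hypotheses advance one frame at a time, a narrower `beam` reduces the work done per frame at the risk of pruning the path that would have won. In a pipelined design like the one the abstract describes, the per-frame likelihoods in `frame_scores` would be produced by an earlier stage and consumed here as they arrive.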
Bibliographic reference. Koga, Shinji / Isotani, Ryosuke / Tsukada, Satoshi / Yoshida, Kazunaga / Hatazaki, Kaichiro / Watanabe, Takao (1992): "A real-time speaker-independent continuous speech recognition system based on demi-syllable units", In ICSLP-1992, 1483-1486.