A prototype extension telephone exchange system was developed for a company of about 200 employees. It connects a call to an extension telephone automatically by recognizing input sentences and, if necessary, asks questions to resolve ambiguities. For real-time, speaker-independent continuous speech recognition, special hardware was designed using 9 DSPs (TMS320C30) for parallel processing of acoustic analysis, HMM probability calculation and Viterbi network search controlled by an FSN grammar. An echo canceller cancels the system announcement returned through the telephone hybrid, which enables the system to detect user utterances at any time. For speech output, Klatt-type text-to-speech hardware is used, together with an ADPCM decoder for fixed-form output. A preliminary test shows that 1) S3 % of the calls are correctly connected and 15 % of calls are rejected, and 2) the average time spent for the connection is about 50 seconds. In addition, the system has speech data storage facilities for constructing a large-scale telephone speech database. Ongoing data collection is also described in the paper.
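The abstract does not say which adaptation algorithm the echo canceller uses. As an illustration only, the following minimal sketch assumes a normalized LMS (NLMS) adaptive filter, a common choice for this task. The function names, the three-tap echo path, the step size and the synthetic signals are all hypothetical and are not taken from the paper. The sketch shows why cancelling the returned announcement lets user speech stand out in the residual signal.

```python
import math
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the far-end echo from the mic signal.

    Assumed technique: NLMS. The paper's actual algorithm is not stated.
    """
    w = [0.0] * taps   # adaptive estimate of the echo path
    x = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for s, d in zip(far_end, mic):
        x = [s] + x[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x))   # estimated echo
        e = d - y                                  # what remains after cancellation
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        residual.append(e)
    return residual


def power(sig):
    """Mean squared amplitude of a signal segment."""
    return sum(v * v for v in sig) / len(sig)


random.seed(0)
n = 4000
# Hypothetical system announcement (white noise stands in for prompt audio).
prompt = [random.uniform(-1.0, 1.0) for _ in range(n)]
# Hypothetical echo path of the telephone hybrid.
echo_path = [0.6, -0.3, 0.1]
echo = [
    sum(h * prompt[i - k] for k, h in enumerate(echo_path) if i - k >= 0)
    for i in range(n)
]
# The user starts speaking at sample 3000, while the prompt is still playing.
user = [0.5 * math.sin(0.3 * i) if i >= 3000 else 0.0 for i in range(n)]
mic = [e + u for e, u in zip(echo, user)]

residual = nlms_echo_cancel(prompt, mic)
echo_only = power(residual[2000:3000])    # prompt playing, user silent
with_user = power(residual[3000:])        # prompt playing, user talking
print(echo_only, with_user)
```

In this sketch the residual power while only the prompt is playing falls far below the residual power once the simulated user starts talking, so a simple energy threshold on the residual could flag the utterance. A practical canceller would also need to freeze or slow adaptation during double-talk, which this sketch omits.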
Keywords: Telephone speech recognition, Real-time hardware
Bibliographic reference. Kuroiwa, Shingo / Takeda, Kazuya / Inoue, Naomi / Nogaito, Izuru / Yamamoto, Seiichi / Shouzakai, Makoto / Owa, Kunihiko / Takahashi, Masahiko / Matsumoto, Ryuuji (1993): "A voice-activated extension telephone exchange system", In EUROSPEECH'93, 1793-1796.