Third International Conference on Spoken Language Processing (ICSLP 94)
A prototype of a speech-to-text transcription system for medical diagnoses is described. This system recognizes continuous phrase speech and transcribes it into Japanese text. We devised an initial prototype based on consonant-vowel spotting using the DP matching technique, as demonstrated at ICSLP'90, and improved this system by using a Japanese character trigram model, a phrase syntax, and phoneme-based hidden Markov models (HMMs), as reported at ICSLP'92. This paper outlines the recognition methods and describes the system configuration and the results of performance evaluation tests. The system consists of two main parts: specially designed speech recognition hardware and a SUN SPARC Station IPX. Using 2 RISC CPUs for the LR parser and 12 DSPs for HMM calculation, the processing time is reduced by 83%. A performance evaluation test was carried out for X-ray CT scanning reports. Before the test, a character trigram model was extracted by analyzing 1,500 scanning reports, which contained a total of about 70,000 phrases. The current dictionary vocabulary size (about 3,600 words) was established at this time. Transcription accuracies of 91% and 85% were obtained for normal and abnormal CT reports, respectively, compared to 80% and 65% as reported at ICSLP'90.
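As a rough illustration of what a character trigram model does, the following minimal Python sketch counts character trigrams in a small set of texts and scores a candidate string. It is not the authors' implementation: the function names, the `^` and `$` padding symbols, the add-one smoothing, and the two toy phrases are all assumptions made for this example, and the abstract does not say how the actual model was estimated or smoothed.

```python
import math
from collections import Counter


def train_char_trigrams(texts):
    """Count character trigrams and their two-character contexts.

    Hypothetical helper: '^' pads the start and '$' marks the end of a text.
    """
    trigrams, contexts, vocab = Counter(), Counter(), set()
    for text in texts:
        padded = "^^" + text + "$"
        vocab.update(padded)
        for i in range(len(padded) - 2):
            trigrams[padded[i:i + 3]] += 1
            contexts[padded[i:i + 2]] += 1
    return trigrams, contexts, vocab


def log_prob(text, trigrams, contexts, vocab):
    """Log-probability of `text` under the trigram counts.

    Add-one smoothing is an assumption of this sketch, chosen for brevity.
    """
    padded = "^^" + text + "$"
    v = len(vocab)
    total = 0.0
    for i in range(len(padded) - 2):
        numerator = trigrams[padded[i:i + 3]] + 1
        denominator = contexts[padded[i:i + 2]] + v
        total += math.log(numerator / denominator)
    return total


# Toy training phrases invented for illustration, not taken from the paper.
model = train_char_trigrams(["異常なし", "異常あり"])
seen = log_prob("異常なし", *model)
scrambled = log_prob("なし異常", *model)
```

In a recognizer, scores of this kind could be used to prefer candidate transcriptions whose character sequences resemble the training reports; here the phrase seen in training scores higher than its scrambled variant.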
Bibliographic reference. Tsuboi, Toshiaki / Homma, Shigeru / Matsunaga, Sho-ichi (1994): "A speech-to-text transcription system for medical diagnoses", In ICSLP-1994, 687-690.