We present a novel approach to the multimodal analysis of natural speech and handwriting input for entering mathematical expressions into a computer. It uses an integrated, multilevel probabilistic architecture with a joint semantic model and two distinct syntactic models describing speech and script properties, respectively. Compared to classical multistage solutions, our single-stage strategy benefits from an implicit transfer of higher-level contextual information into the lower-level segmentation and pattern recognition processes involved. For visualization and postprocessing purposes, the recognition result is transformed into Adobe® FrameMaker® documents. Fully spoken or handwritten realistic formulas were examined, yielding a structural recognition accuracy of 61.1 % for speech (speaker independent) and 83.3 % for handwriting (writer dependent).
Cite as: Hunsinger, J., Lang, M. (2000) A single-stage top-down probabilistic approach towards understanding spoken and handwritten mathematical formulas. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, 386-389
@inproceedings{hunsinger00_icslp,
  author={Jörg Hunsinger and Manfred Lang},
  title={{A single-stage top-down probabilistic approach towards understanding spoken and handwritten mathematical formulas}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  volume={4},
  pages={386--389}
}