INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

A Multimodal Approach to Dictation of Handwritten Historical Documents

Vicent Alabau, Verónica Romero, Antonio-L. Lagarda, Carlos-D. Martínez-Hinarejos

Universidad Politécnica de Valencia, Spain

Handwritten Text Recognition has gained attention in recent years because of the interest in transcribing historical documents. It employs models similar to those used in Automatic Speech Recognition (Hidden Markov Models and n-grams). Dictating the contents of a document is an alternative to text recognition. In this work, we compare the performance of a Handwritten Text Recognition system with that of two speech dictation systems: a non-multimodal system that uses only speech, and a multimodal system that first performs text recognition and then uses its result in the subsequent speech recognition. Results show that the multimodal combination outperforms each of the non-multimodal systems considered.

Bibliographic reference. Alabau, Vicent / Romero, Verónica / Lagarda, Antonio-L. / Martínez-Hinarejos, Carlos-D. (2011): "A multimodal approach to dictation of handwritten historical documents", in INTERSPEECH-2011, 2245-2248.