12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Asynchronous Multimodal Text Entry Using Speech and Gesture Keyboards

Per Ola Kristensson (1), Keith Vertanen (2)

(1) University of St Andrews, UK
(2) Princeton University, USA

We propose reducing errors in text entry by combining speech and gesture keyboard input. We describe a merge model that combines recognition results in an asynchronous and flexible manner. We collected speech and gesture data of users entering both short email sentences and web search queries. By merging recognition results from both modalities, word error rate was reduced by 53% relative for email sentences and 29% relative for web searches. For email utterances with speech errors, we investigated providing gesture keyboard corrections of only the erroneous words. Without the user explicitly indicating the incorrect words, our model was able to reduce the word error rate by 44% relative.
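The figures above are relative reductions in word error rate (WER). As a minimal sketch of how such a figure is conventionally computed, assuming the standard definition of WER as word-level edit distance divided by reference length, the arithmetic could look like the following. The function names and the example sentences are hypothetical illustrations, not data or code from the paper, and the paper's merge model itself is not reproduced here.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word errors / number of reference words."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return word_errors(reference, hypothesis) / ref_len


def relative_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Fraction of the baseline WER that was removed."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - improved_wer) / baseline_wer


# Hypothetical example: one substituted word in a five-word sentence.
baseline = word_error_rate("see you at the meeting", "see you at a meeting")
print(baseline)
```

Under this definition, a "53% relative" reduction means the merged WER is 47% of the single-modality WER, whatever the absolute values are.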

Bibliographic reference. Kristensson, Per Ola / Vertanen, Keith (2011): "Asynchronous multimodal text entry using speech and gesture keyboards", in INTERSPEECH-2011, 581-584.