Grounded speech communication

Deb Roy

Language is grounded in sensory-motor experience. Grounding connects concepts to the physical world, enabling humans to acquire and use words and sentences in context. Currently, machines that process text and spoken language are not grounded in human-like ways. Instead, semantic representations in machines are highly abstract and have meaning only when interpreted by humans. We are interested in developing computational systems that represent words, utterances, and underlying concepts in terms of sensory-motor experiences, leading to richer levels of understanding by machines. Inspired by theories of infant cognition, we present a computational model that learns from untranscribed multisensory input. Acquired words are represented in terms of associations between acoustic and visual sensory experience. The system has been tested in a robotic embodiment that supports interactive language learning and understanding. Successful learning has also been demonstrated using infant-directed speech and images.

  author={Deb Roy},
  title={{Grounded speech communication}},
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 4, 69-72}