Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Language Acquisition through a Human-Robot Interface

Naoto Iwahashi

Sony Computer Science Labs Inc., Higashi-Gotanda, Shinagawa-ku, Tokyo, Japan

This paper describes an algorithm for spoken language acquisition through a human-robot interface based on speech, vision, and behavior. In this algorithm, grounded language knowledge is represented by graphical statistical models consisting of hidden Markov models and a stochastic context-free grammar. Lexicon learning is based on the independence between the speech and visual features of each lexical item. In the grammar-learning process, the syntactic structure of each spoken utterance is inferred from the conceptual structure extracted from visual observation. Because it is based on information-theoretic learning, the algorithm is robust against ambiguity and sparseness in the learning data.
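
As an illustration of the independence assumption used for lexicon learning, the following minimal sketch scores a lexical item by adding a speech log-likelihood and a visual log-likelihood, which is what independence of the two feature streams within an item implies for their joint likelihood. It is not the algorithm of this paper: the speech model here is a one-dimensional Gaussian standing in for a hidden Markov model, and the names `gaussian_logpdf`, `joint_log_likelihood`, `best_item`, and `LEXICON`, along with all numeric values, are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian (a stand-in for an HMM likelihood)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def joint_log_likelihood(speech, visual, item):
    """Joint score of one lexical item.

    Assumes speech and visual features are independent within the item,
    so the joint log-likelihood is the sum of the two log-likelihoods.
    """
    speech_ll = gaussian_logpdf(speech, item["speech_mean"], item["speech_var"])
    visual_ll = gaussian_logpdf(visual, item["visual_mean"], item["visual_var"])
    return speech_ll + visual_ll


def best_item(speech, visual, lexicon):
    """Return the name of the lexical item with the highest joint score."""
    return max(
        lexicon,
        key=lambda name: joint_log_likelihood(speech, visual, lexicon[name]),
    )


# Hypothetical toy lexicon: each item pairs a speech model with a visual model.
LEXICON = {
    "red": {"speech_mean": 0.0, "speech_var": 1.0,
            "visual_mean": 0.0, "visual_var": 1.0},
    "blue": {"speech_mean": 5.0, "speech_var": 1.0,
             "visual_mean": 5.0, "visual_var": 1.0},
}
```

Calling `best_item(0.2, 0.1, LEXICON)` selects the item whose speech and visual models jointly explain the observation best. Because the score factorizes, each modality's model can be estimated separately from its own features.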

