Third International Conference on Spoken Language Processing (ICSLP 94)
We propose a system that acquires concepts and grammar from visual and acoustic information without a priori knowledge, by comparing input information with previously acquired concepts. As the task domain, we choose concept acquisition for simple figures. A concept consists of a set of relations between a figure feature and a speech event. The system also acquires a grammar for ordering the concepts. After acquiring the concepts and grammar, the system uses them to generate an utterance that explains an input image. Furthermore, the system generates image concepts corresponding to input speech by using the acquired concepts. Our system acquired 11 out of 12 types of concepts from 100 pairs of utterances and images. Using a left-to-right HMM for grammar acquisition, the rate at which our system generates correct sentences for input images is about 50%. We have thus realized the first stage of the human concept acquisition process on a computer system.
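As an illustration of the left-to-right HMM topology mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes the simplest variant, in which each state may only stay where it is or advance to the next state; the abstract does not specify the number of states, the transition probabilities, or whether skips are allowed. The names `left_to_right_transitions`, `is_left_to_right`, and `stay_prob` are hypothetical.

```python
def left_to_right_transitions(n_states, stay_prob=0.5):
    """Build a transition matrix for a simple left-to-right HMM.

    Assumption (not from the paper): each state either stays put with
    probability stay_prob or advances to the next state; the final
    state is absorbing.
    """
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    if not 0.0 <= stay_prob <= 1.0:
        raise ValueError("stay_prob must lie in [0, 1]")

    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i == n_states - 1:
            # Last state: no later state to move to, so it loops on itself.
            matrix[i][i] = 1.0
        else:
            matrix[i][i] = stay_prob            # self-loop
            matrix[i][i + 1] = 1.0 - stay_prob  # advance one state
    return matrix


def is_left_to_right(matrix):
    """Return True if no transition goes back to an earlier state."""
    return all(
        matrix[i][j] == 0.0
        for i in range(len(matrix))
        for j in range(i)
    )


if __name__ == "__main__":
    for row in left_to_right_transitions(3, stay_prob=0.25):
        print(row)
```

Because every entry below the diagonal is zero, a state sequence drawn from such a model can never revisit an earlier state, which is what makes this topology a natural fit for modeling the order of concepts in an utterance.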
Bibliographic reference. Masukata, Mikio / Nakagawa, Seiichi (1994): "Concept and grammar acquisition based on combining with visual and auditory information", In ICSLP-1994, 1163-1166.