ISCA Archive AVSP 2003

Evolution of language from action understanding

Leonardo Fogassi

In my presentation I will try to provide hypotheses and neurophysiological evidence about the possible evolution of language from a neural motor system involved in action recognition.

There are different theories about language evolution. Some authors claim that language evolved from monkey vocalization, others from gestural communication. Still others deny any direct evolution of language from a monkey precursor, considering language a completely new acquisition of the human species, with characteristics that are completely different from any other animal cognitive function.

A number of comparative anatomical and neurophysiological data seem to support a gradual evolution of language from gestural communication.

Vocalization could at first glance appear to be a very good candidate for the evolution of verbal communication. However, it must be noted that vocalization in monkeys is, in general, a function that an individual uses for sending important messages to its group (for example, about the presence of a specific predator). Furthermore, vocalization is under the neural control of the primitive mesial cingulate circuit, which is mainly involved in emotional behavior.

Unlike these forms of communication, communication between two individuals is less linked to emotional messages and often conveys a relational content. This content is often expressed with gestures.

Where in the cortex are the neural circuits involved in gesture production located? Gestures can be considered as belonging to the field of goal-directed actions because, although often devoid of an object target, they are associated with a meaning. Actions are coded in several areas of premotor cortex and, more extensively, in the parieto-frontal circuits formed by specific interconnected premotor and parietal areas. Coding of goal-directed actions at the single-neuron level has been studied in most detail in the ventral premotor cortex of the macaque monkey, that is, in areas F4 and F5. Area F4 is mainly involved in axial and proximal actions toward spatial targets. Area F5 codes different types of hand and mouth actions, such as grasping, manipulating, holding and tearing. In area F5 there is also a class of visuomotor neurons (mirror neurons) that become active not only when the monkey performs hand actions, but also when it observes another individual making similar actions. These neurons are of interest because, by matching action observation with action execution, they allow understanding of actions made by others. This capacity is not limited to the recognition of motor patterns, but extends also to the goal of the observed action. The observation/execution matching system represented by mirror neurons could be crucial for an inter-individual communication system originally based on gestures. A communicative gesture made by an actor (the sender) retrieves in the observer (the receiver) the neural circuit encoding the motor representation of the same gesture. This allows the receiver to understand the message of the sender and, perhaps, to begin a response (see Rizzolatti and Arbib, TINS 21: 188-194, 1998).

Recently we discovered in the monkey some new categories of mirror neurons that could explain the transition from a basic neural system for action understanding to a system endowed with features typical of language.

First of all, we found that in area F5, in addition to hand mirror neurons, there are also mouth mirror neurons (Ferrari et al. Eur. J. Neurosci. 17: 1703-1714, 2003). Most of these neurons are activated during observation and execution of mouth ingestive actions, such as biting, tearing with the teeth, sucking, etc. A small percentage of mouth mirror neurons also respond to the observation of mouth communicative actions belonging to the monkey repertoire. All of these actions have an affiliative meaning. Interestingly, neurons responding to the observation of mouth communicative actions are also active during the execution of mouth ingestive actions that are motorically similar to the observed ones. These results suggest that in ventral premotor cortex the action understanding system starts to evolve into an oro-facial communicative system that is no longer linked to emotional behavior, like the one represented in the mesial cortex, but to affiliative behavior. This evolution occurs in a region endowed with the apparatus controlling the execution of ingestive behavior. The visuo-motor properties of mouth mirror neurons suggest that the structures of ingestive behavior could have been taken over for use in communicative behavior. This transformation is in line with the proposal by MacNeilage (Behav. Brain Sci. 21: 499-546, 1998) of a derivation, in the lateral cortical system, of the syllabic frame from the cyclic mandibular open-close alternation.

Second, another category of hand-related mirror neurons has been described that become active not only when monkeys observe an action, but also when they hear its sound (audio-visual mirror neurons) (Kohler et al. Science 297: 846-848, 2002). The responses of these neurons are specific to the type of action. For example, they respond to peanut breaking when the action is only observed, only heard, or both heard and observed, and do not respond to the sight and sound of another action, for example paper tearing. Thus it appears that in these neurons action understanding can also occur through the acoustic channel. This result has two important implications: a) the acoustic input to a motor area allows the individual to retrieve the action representation present in this area, thus accessing the action meaning. Note that this is probably the process occurring during listening to spoken language; b) mirror neurons responding to the observation and to the sound of actions belong to the category of hand mirror neurons. This could be an important element for language evolution, because the association between a gesture and a sound could have occurred, in phylogenesis, earlier in hand-related neurons than in those related to mouth actions. This would be explained by the larger variety of gestures present in the monkey brachio-manual motor repertoire compared with mouth gestures (see Rizzolatti and Arbib, TINS 21: 188-194, 1998). The coupling between oro-facial gestures and sounds (not vocalization) could have been a subsequent acquisition.

The discovery in area F5 of mouth communicative mirror neurons and of audio-visual mirror neurons is in good agreement with the proposed homology between F5 and Broca’s area. Area 44 (part of Broca’s area) and area F5 are both dysgranular. They both have a mouth and a hand representation (many brain imaging experiments demonstrate involvement of Broca’s area in hand movements, beyond its classical role in speech). They both respond to the observation of hand and mouth actions (see Rizzolatti et al. Nat. Rev. Neurosci. 2: 661-670, 2001). Recently it was demonstrated that Broca’s area is activated both when subjects observe a biting action and when they observe other individuals performing silent speech. Further support for the derivation of Broca’s area from F5 is the recently described left asymmetry present in the ventral premotor cortex of apes (Cantalupo and Hopkins, Nature 414: 505, 2001). Thus Broca’s area could be the result of the evolution of a neural system initially able to produce and understand hand and mouth actions. Part of this system would have generated one population of neurons with communicative properties and another sensitive to meaningful acoustic stimuli. The process through which the oro-facial system acquired the ability to emit meaningful sounds remains to be investigated.


Rizzolatti G. and Arbib M.A. (1998) Language within our grasp. TINS 21: 188-194.
Ferrari P.F., Gallese V., Rizzolatti G. and Fogassi L. (2003) Mirror neurons responding to the observation of ingestive and communicative mouth actions. Eur. J. Neurosci. 17: 1703-1714.
MacNeilage P.F. (1998) The frame/content theory of evolution of speech production. Behav. Brain Sci. 21: 499-546.
Kohler E., Keysers C., Umiltà M.A., Fogassi L., Gallese V. and Rizzolatti G. (2002) Hearing sounds, understanding actions: action representation in mirror neurons. Science 297: 846-848.
Rizzolatti G., Fogassi L. and Gallese V. (2001) Neurophysiological mechanisms underlying the understanding and imitation of action. Nat. Rev. Neurosci. 2: 661-670.
Cantalupo C. and Hopkins W.D. (2001) Asymmetric Broca's area in great apes. Nature 414: 505.

Cite as: Fogassi, L. (2003) Evolution of language from action understanding. Proc. Auditory-Visual Speech Processing, 1-2
