Notes on TAMD Vol. 3, No. 2

Grounding language in action

by Katharina J. Rohlfing and Jun Tani

The main topic of this issue is that action and language are interwoven. Contrary to the intuitive understanding of language as a symbolic system that connects entities in the world with their conceptions in our mind, cognitive development research suggests that language develops in parallel with the interactions and actions we generate. In other words, the perception of an action and its label in our mind co-develop with language, and they influence each other through the interaction interface.

For an AI system to develop intelligence on its own, various mechanisms/modules are required that interact with each other as well as with the physical world. Learning from experience is an important mechanism that provides invaluable information for improving already known/learned concepts.

There are various studies on letting robots learn the meaning of their environment, for instance by learning the affordances of entities in the world through interaction. This enables robots to associate sensory-motor patterns with the change made in the world by an interaction episode. Using this information, robots can learn higher-level object categories [1], which may lead to the development of concepts and even language [2].
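As a toy illustration of this pipeline (all object names, features, and numbers below are invented for illustration, not taken from [1] or [2]): a robot pushes objects, records the observed effect, clusters the effects into discrete effect categories, and then groups objects by the effect they afford.

```python
# Toy affordance-learning sketch with hypothetical data: a robot applies a
# "push" action to objects, observes the effect (how far each object rolls),
# clusters the effects into effect categories, and groups objects by the
# effect category they produce.

def cluster_effects(effects, threshold=5.0):
    """1-D single-link clustering: effects closer than `threshold`
    end up in the same effect category."""
    order = sorted(range(len(effects)), key=lambda i: effects[i])
    labels = [0] * len(effects)
    cat = 0
    for prev, cur in zip(order, order[1:]):
        if effects[cur] - effects[prev] > threshold:
            cat += 1
        labels[cur] = cat
    return labels

# (object name, roundness feature, observed displacement after a push)
observations = [
    ("ball",   0.95, 42.0),   # round things roll far
    ("orange", 0.90, 38.0),
    ("box",    0.10, 2.0),    # flat-sided things barely move
    ("book",   0.05, 1.0),
]

effects = [d for _, _, d in observations]
labels = cluster_effects(effects)

# Objects sharing an effect category share the "rollable" affordance,
# a first step toward an object category grounded in interaction.
categories = {}
for (name, _, _), lab in zip(observations, labels):
    categories.setdefault(lab, []).append(name)

print(categories)
```

The point of the sketch is only the structure of the computation: categories emerge from the effects of the robot's own actions, not from externally given class labels.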

In a similar study, Roy et al. [3] developed a system that can translate spoken commands into situated actions. Adjectives, which describe object properties, are associated with sensory expectations relative to specific actions; verbs, on the other hand, are grounded in sensory-motor control programs.

However, these studies do not explain how language and sensory-motor abilities co-develop through sub-symbolic activity in the systems.

Connectionist approaches try to close this gap by focusing on self-organization, mostly at the sub-symbolic level. For instance, Sugita and Tani [4] proposed a connectionist model consisting of coupled RNNs that are trained to learn a structural mapping between a simple linguistic representation and a sensory-motor system. They claim that situated compositional semantics can be achieved simply through the self-organization of dynamical structures, in contrast to the symbol-grounding studies described earlier.
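The coupling idea can be sketched in a few lines. This is not the actual model of [4]: the weights are random, there is no training, and only the forward pass is shown. It merely illustrates how two recurrent networks, one over word sequences and one over motor sequences, can be tied together by a shared binding vector fed to both as an extra input (in the real model this parametric-bias vector is adapted so that a sentence and the behavior it describes converge on the same value).

```python
import math, random

random.seed(0)

def rnn_step(x, h, W_in, W_rec):
    """One Elman-style step: h' = tanh(W_in @ x + W_rec @ h)."""
    return [math.tanh(sum(wi * xi for wi, xi in zip(W_in[j], x)) +
                      sum(wr * hi for wr, hi in zip(W_rec[j], h)))
            for j in range(len(h))]

def make_weights(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)]
            for _ in range(rows)]

HIDDEN, BIND = 4, 2
binding = [0.3, -0.7]          # shared binding vector (hand-picked here)

# Linguistic RNN: input = one-hot word + binding vector.
W_in_L, W_rec_L = make_weights(HIDDEN, 3 + BIND), make_weights(HIDDEN, HIDDEN)
# Behavioral RNN: input = 2-D motor state + binding vector.
W_in_B, W_rec_B = make_weights(HIDDEN, 2 + BIND), make_weights(HIDDEN, HIDDEN)

h_L = [0.0] * HIDDEN
for word in [[1, 0, 0], [0, 1, 0], [0, 0, 1]]:     # a 3-word sentence
    h_L = rnn_step(word + binding, h_L, W_in_L, W_rec_L)

h_B = [0.0] * HIDDEN
for motor in [[0.1, 0.0], [0.4, 0.2], [0.8, 0.2]]:  # a motor trajectory
    h_B = rnn_step(motor + binding, h_B, W_in_B, W_rec_B)

print("linguistic state:", h_L)
print("behavioral state:", h_B)
```

Because both trajectories are conditioned on the same binding vector, learning can shape that vector into the shared, sub-symbolic "meaning" of a sentence–behavior pair.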

Similar to Tani's argument, mirror neuron theory also supports the view that language and action do not develop independently but overlap notably. For instance, it has been shown that when a person hears or reads text describing an action, the brain region responsible for generating the corresponding motor signals is also activated; this is where the referential semantic content maps to [5], [6].

The mutualistic relationship between the development of language and action competency, and the resulting mental development, has been heavily investigated in developmental studies. These studies mostly show that labels or words ease object categorization and the extraction of commonalities between objects and situations. They also show that such labels, or encoded representations, facilitate knowledge transfer.

To recapitulate, connectionist approaches focus on the self-organization of sub-symbolic structures, and they deal with small-scale problems and representations. Computational approaches, on the other hand, mostly perform better than connectionist models, but it is not clear how closely they simulate human cognitive architectures.

Language Does Something

by Iris Nomikou and Katharina J. Rohlfing

This paper focuses on the idea of "acoustic packaging" and on whether actions co-occur with those packages. Several experiments were conducted with German mothers and their three-month-old infants during a routine activity: diaper changing.

The authors show that German mothers adjust their vocal interaction cues and their actions so that they happen concurrently, which makes the vocal signal both perceivable and tangible to the infants.
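This kind of speech–action co-occurrence can be quantified very simply. The sketch below uses made-up timing data (not from the paper) and a made-up criterion: each utterance and each action is a (start, end) interval in seconds, and two events count as one multimodal package when their overlap covers at least half of the shorter event.

```python
# Toy measure of "acoustic packaging": pair utterance intervals with the
# action intervals they temporally overlap. All timings are hypothetical.

def overlap(a, b):
    """Length of the intersection of two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def packages(utterances, actions, min_ratio=0.5):
    """Pair each utterance with every action whose overlap covers at
    least `min_ratio` of the shorter of the two events."""
    found = []
    for u in utterances:
        for act in actions:
            shorter = min(u[1] - u[0], act[1] - act[0])
            if shorter > 0 and overlap(u, act) / shorter >= min_ratio:
                found.append((u, act))
    return found

# Invented diaper-changing episode: speech mostly coincides with action,
# but the last utterance and the last action stand alone.
utterances = [(0.0, 1.2), (2.0, 3.5), (6.0, 7.0)]
actions    = [(0.1, 1.0), (2.2, 3.4), (4.0, 5.0)]

print(packages(utterances, actions))
```

On this data, the first two utterances form packages with the first two actions, while the third utterance and third action remain unpaired.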

The paper's main assumption is that young infants' perception is educated through social interaction, starting with the education of the attention mechanism.

The role of communication and the way infants participate in social interactions are crucial. Many studies show that coordinated eye gaze with infants lies at the root of this ability. Parents also make use of attention-directing gestures, which have been shown to be effective in the acquisition of new words [7]. Furthermore, a great number of studies are cited as current evidence supporting the theory that action and language are closely inter-coupled.


1. N. Dag, I. Atil, S. Kalkan, and E. Sahin, "Learning affordances for categorizing objects and their properties," in Proc. Int. Conf. on Pattern Recognition, 2010.

2. I. Atil, N. Dag, S. Kalkan, and E. Sahin, "Affordances and emergence of concepts," in Proc. Tenth Int. Conf. on Epigenetic Robotics, 2010, pp. 11–18.

3. D. Roy, K. Y. Hsiao, and N. Mavridis, "Mental imagery for a conversational robot," IEEE Trans. Syst. Man Cybern. B, Cybern., vol. 34, pp. 1374–1383, 2004.

4. Y. Sugita and J. Tani, "Learning semantic combinatoriality from the interaction between linguistic and behavioral processes," Adapt. Behav., vol. 13, no. 1, pp. 33–52, 2005.

5. M. Tettamanti, G. Buccino, M. C. Saccuman, V. Gallese, M. Danna, P. Scifo, F. Fazio, G. Rizzolatti, S. F. Cappa, and D. Perani, "Listening to action-related sentences activates fronto-parietal motor circuits," J. Cogn. Neurosci., vol. 17, no. 2, pp. 273–281, 2005.

6. D. Kemmerer, J. G. Castillo, T. Talavage, S. Patterson, and C. Wiley, "Neuroanatomical distribution of five semantic components of verbs: Evidence from fMRI," Brain Lang., vol. 107, pp. 16–43, 2008.

7. A. E. Booth, K. K. McGregor, and K. J. Rohlfing, "Socio-pragmatics and attention: Contributions to gesturally guided word learning in toddlers," Lang. Learn. Develop., vol. 4, pp. 179–202, 2008.