Notes on Theory of Mind by Astington and Dack (2008)


False-belief task:

Measures the ability to attribute to another person a belief that differs from one’s own and that is false from one’s own point of view.

Intentional Causation

Desires may be fulfilled however the outcome is achieved but, importantly, intentions are fulfilled only if the person’s intention causes the action that brings about the outcome.

Interpretive Diversity

The understanding that two people may make different interpretations of the same external stimulus and that both interpretations may be legitimate.

Metarepresentational Understanding

The ability to represent one’s own and another person’s different relationships to the same situation.

Modularity Theory

ToM depends on maturation of a particular brain structure — an innate cognitive ToM module. Experience might be required as a trigger.

Simulation Theory

Mental-state concepts are derived from children’s own direct experience of such states. Children can imagine having the beliefs and desires that the other person has, and imagine what they themselves would do if they possessed those imagined beliefs and desires.


Theory of Mind

People’s understanding of themselves and others as psychological beings, whose beliefs, desires, intentions, and emotions differ.


Theory-Theory

Children’s ToM develops via a process of theory construction and change, analogous to construction and change in scientific theorizing. On this view, children construct a theory about the mind, whereby their concepts of mental states are abstract and unobservable theoretical postulates used to explain and predict observable human behavior.


Children’s understanding of mental life was first investigated by Jean Piaget early in the last century, and it has been of interest to psychologists ever since, for example in studies of perspective taking and metacognition.

What is a Theory of Mind (ToM)?

Intentional states are always ‘about’ something. One does not just have a belief, but rather a belief about something; this is the content, or propositional content, of the intentional state. Such states are often described as attitudes to propositions. That is, a person has a certain attitude toward the propositional content, such as holding it to be true or wanting it to happen, and this attitude denotes what type of mental state it is.

A person can hold different attitudes to the same propositional content, resulting in different mental states. For example, the boy can ‘believe’ the chocolate is in the cupboard, ‘hope’ the chocolate is in the cupboard, ‘want’ the chocolate to be in the cupboard, and so on.

Belief-type states are true or false, whereas desire-type states are fulfilled or unfulfilled. If the propositional content of a desire does not correspond to the way things actually are in the world, then the desire is unfulfilled. However, it cannot be fulfilled by changing the desire. In order to fulfill the desire, things in the world have to change to fit the representation that is held in mind. A false belief, by contrast, is corrected by changing the belief to fit the world. That is, desires and intentions have a world-to-mind direction of fit whereas beliefs have a mind-to-world direction of fit.
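The contrast in direction of fit can be made concrete with a small sketch. The function names and the "drawer" location are assumptions made for this illustration and do not come from Astington and Dack. The same content ("the chocolate is in the cupboard") is repaired in opposite ways depending on whether it is held as a belief or as a desire.

```python
# Hypothetical sketch: a "world" is a dict mapping propositions to values.

def matches(world, proposition, content):
    """True when the world agrees with the content held in mind."""
    return world.get(proposition) == content

def revise_belief(world, proposition):
    """Mind-to-world fit: a false belief is repaired by changing the mind."""
    return world[proposition]

def fulfil_desire(world, proposition, desired):
    """World-to-mind fit: an unfulfilled desire is repaired by changing the world."""
    changed = dict(world)
    changed[proposition] = desired
    return changed

world = {"chocolate": "drawer"}   # assumed true state of affairs
content = "cupboard"              # content held in mind

belief_true = matches(world, "chocolate", content)       # the belief is false
new_belief = revise_belief(world, "chocolate")           # the mind changes
new_world = fulfil_desire(world, "chocolate", content)   # the world changes
desire_fulfilled = matches(new_world, "chocolate", content)
```

Note that the check in `matches` is identical for both attitudes; only the repair differs, which is what the direction-of-fit distinction captures.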

ToM is used to explain and predict human behavior. The basic premise is that actions are produced by desire and belief in combination. That is, people act to fulfill their desires in light of their beliefs.
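A minimal sketch of belief-desire prediction, set up as a false-belief scenario built on the chocolate example. The `Agent` class, the function names, and the "drawer" location are hypothetical and used only for illustration. The point is that a prediction based on ToM follows the agent's belief, while the more salient reality-based answer follows the actual state of the world.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    desire: str             # object the agent wants
    believed_location: str  # where the agent thinks the object is

def predict_search(agent: Agent) -> str:
    """Predict the action from desire plus belief, ignoring the true world state."""
    return agent.believed_location

def reality_based_answer(world: dict, agent: Agent) -> str:
    """The salient but incorrect answer: report where the object really is."""
    return world[agent.desire]

# The boy saw the chocolate put in the cupboard, then left the room.
boy = Agent(name="boy", desire="chocolate", believed_location="cupboard")
# While he was away the chocolate was moved (assumed), so his belief is now false.
world = {"chocolate": "drawer"}

tom_prediction = predict_search(boy)
reality_prediction = reality_based_answer(world, boy)
print(tom_prediction, reality_prediction)
```

The two predictions diverge only when the belief is false, which is why false-belief tasks are used to separate reasoning about beliefs from reasoning about reality.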

Development of ToM

Development of theory of mind

Infants can match their own actions to those of another individual (e.g., tongue protrusion). This ability, referred to as ‘cross-modal matching’, shows that infants can make a connection between self and other, at least at some primitive level. This is important because the similarity between self and other is at the heart of ToM.

Dyadic interactions are present at around 2 months of age. Triadic interactions appear at around 9 months of age, with infant and adult continuously switching focus between each other and a toy. This is more than just looking at the same thing; it involves mutual awareness (at some level) that both are engaged with the same object.

ToM in infancy

Between 9 and 12 months of age, infants develop the ability to follow an adult’s eye gaze or an adult’s point, even to objects not in their line of sight. (If there is no object, the infant looks back at the adult to check.) Likewise, when infants point, they will look toward the adult as well as toward the object to monitor the other’s attention.

ToM in toddler and early preschool period

ToM in older preschool period

ToM in school-age children

Differences in Development

Individual Differences

A number of factors, such as executive functioning, language ability, and social competence, are correlated with the understanding of false belief — both contemporaneously and across time in longitudinal studies (e.g. false-belief understanding 3-5 years).

Executive functioning. Executive functions are self-regulatory cognitive processes, such as inhibition, planning, resistance to interference, and control of attention and motor responses.

Executive function tasks require suppression of a habitual response in favor of a new response and, likewise, in standard false-belief tasks children must resist making the more salient (incorrect) response. This suggests that there are executive functioning demands embedded within false-belief tasks.

Fantasy and pretense. There is evidence that acting out roles in pretend play precedes and supports false-belief understanding, whereas explicit assignment of roles and plans for joint action in pretend play follow and result from false-belief understanding.

Language ability. Its role is complex, reflecting the multifaceted nature of language, which includes pragmatics, semantics, and syntax.

Family environment. Children whose parents explain and discuss, rather than only punishing unacceptable behavior, score more highly on false-belief tasks. Mothers’ ‘mind-mindedness’, that is, their propensity to treat their infants as individuals with minds, is an important factor in determining attachment security, as well as underlying their children’s developing awareness of other minds.

Social competence. ToM can be put to prosocial use (‘nice ToM’) or antisocial use (‘nasty ToM’).

Atypical Development

Autism. The difficulty that children with autism have in passing ToM tasks is not due to a lack of intelligence. Evidence for this comes from the fact that children with Down syndrome tend to be successful on false-belief tasks, despite the fact that their intelligence scores are, on average, significantly lower than those of individuals with autism.

Sensory impairments. Deaf children with hearing parents are delayed in their false-belief understanding, whereas deaf children with deaf parents are not. The reason is that the children with hearing parents are delayed in acquisition of sign language, which again shows the important role of language in ToM development.

Blind children cannot see facial expressions and gestures and tend to have delayed language development. These children also show delays in ToM development, particularly in understanding false belief.

Explanations of ToM Development

Theoretical explanations of ToM development

Notes on TAMD Vol. 3-2

Grounding language in action

by Katharina J. Rohlfing and Jun Tani

The main topic of this issue is that action and language are interwoven. Contrary to our intuitive understanding of language as a symbolic system that connects entities in the world with their conception in our mind, cognitive development research suggests that language develops in parallel with the interactions and actions that we generate. In other words, the perception of an action and its label in our mind co-develop with language, and they influence each other through interaction.

For an AI system to self-develop intelligence, various mechanisms/modules are required to interact with each other, as well as interacting with the physical world. Learning from experience is an important learning mechanism which provides invaluable information to improve already known/learned concepts.

There are various studies on allowing robots to learn the meaning of their environment, for instance by learning the affordances of entities in the world through interaction. This enables robots to associate sensory-motor patterns with the change produced in the world by an interaction episode. Using this information, robots can learn higher-level object categories [1], which may lead to the development of concepts and even language [2].
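The idea of grouping objects by the effects of interaction can be sketched as follows. This is a toy illustration under strong assumptions (symbolic effect labels and a hand-written interaction log) and is not the method used in [1]; real systems work with continuous sensory-motor features and learned models. All object names, actions, and effects below are hypothetical.

```python
from collections import defaultdict

# Hypothetical interaction log: (object, action, observed effect).
episodes = [
    ("ball", "push", "rolls"),
    ("ball", "lift", "lifted"),
    ("can", "push", "rolls"),
    ("can", "lift", "lifted"),
    ("box", "push", "slides"),
    ("box", "lift", "lifted"),
    ("table", "push", "no_change"),
    ("table", "lift", "no_change"),
]

def effect_profiles(episodes):
    """Associate each object with the effect each action produced on it."""
    profiles = defaultdict(dict)
    for obj, action, effect in episodes:
        profiles[obj][action] = effect
    return profiles

def categorize(profiles):
    """Group objects that afford the same effects under the same actions."""
    categories = defaultdict(list)
    for obj, profile in profiles.items():
        key = tuple(sorted(profile.items()))
        categories[key].append(obj)
    return sorted(sorted(members) for members in categories.values())

categories = categorize(effect_profiles(episodes))
print(categories)
```

Objects end up in the same category because they respond alike to the robot's actions, not because they look alike, which is the sense in which such categories are grounded in interaction.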

In a similar study [3], Roy et al. developed a system that can translate spoken commands into situated actions. Adjectives, which describe object properties, are associated with sensory expectations relative to specific actions. Verbs, on the other hand, are grounded in sensory-motor control programs.

However, these studies do not explain how language and sensory-motor abilities co-develop through sub-symbolic activity within the system.

Connectionist approaches try to close this gap by focusing on self-organization, mostly at the sub-symbolic level. For instance, Sugita and Tani [4] proposed a connectionist model with coupled RNNs that are trained to learn a structural mapping between a simple linguistic representation and a sensory-motor system. They claim that situated compositional semantics can be achieved simply through the self-organizing processes of dynamical structures, in contrast to the symbol grounding studies described earlier.

Similar to Tani’s argument, mirror neuron theory also holds that language and action do not develop independently but overlap notably. For instance, it has been shown that when a person hears or reads text involving an action, the region responsible for generating the corresponding action signals is also activated; this is where the referential semantic content maps to [5], [6].

The mutual relationship between the development of language and action competency, and the resulting mental development, has been heavily investigated in developmental studies. These studies mostly show that labels or words ease object categorization and the extraction of commonalities between objects and situations. They also show that these labels, or encoded representations, facilitate knowledge transfer.

To recapitulate, connectionist approaches focus on the self-organization of sub-symbolic structures and deal with small-scale problems and representations. Computational approaches, on the other hand, mostly perform better than connectionist models, but it is not clear how closely they simulate human cognitive architectures.

Language Does Something

by Iris Nomikou and Katharina J. Rohlfing

This paper focuses on the idea of “acoustic packaging” and on whether actions co-occur with those packages. Several experiments were conducted with German mothers and their three-month-old infants during a routine activity, diaper changing.

The authors show that German mothers adjust their vocal interaction cues and their actions so that they occur concurrently, which makes the vocal signal both perceivable and tangible to the infants.

The main assumption of the paper is that the perception of young infants is educated through social interactions, starting with the education of the attention mechanism.

The role of communication and the way infants participate in social interactions are crucial. Many studies show that infant eye gaze coordination is at the root of this ability. In addition, parents make use of attention-directing gestures, which have been shown to be effective in the acquisition of new words [7]. Furthermore, the paper cites a large number of studies as evidence that action and language are closely coupled.


1. N. Dag, I. Atil, S. Kalkan, E. Sahin (2010), “Learning Affordances for Categorizing Objects and Their Properties”, Int. Conference on Pattern Recognition, 2010

2. I. Atil, N. Dag, S. Kalkan, E. Sahin (2010), “Affordances and Emergence of Concepts”, Proceedings of the Tenth Intl. Conf. on Epigenetic Robotics, pp. 11-18.

3. D. Roy, K. Y. Hsiao, and N. Mavridis, “Mental imagery for a conversational robot,” IEEE Trans. Syst. Man Cybern., B Cybern., vol. 34, pp. 1374–1383, 2004.

4. Y. Sugita and J. Tani, “Learning semantic combinatoriality from the interaction between linguistic and behavioral processes,” Adapt. Behav., vol. 13, no. 1, pp. 33–52, 2005.

5. M. Tettamanti, G. Buccino, M. C. Saccuman, V. Gallese, M. Danna, P. Scifo, F. Fazio, G. Rizzolatti, S. F. Cappa, and D. Perani, “Listening to action-related sentences activates fronto-parietal motor circuits,” J. Cogn. Neurosci., vol. 17, no. 2, pp. 273–281, 2005.

6. D. Kemmerer, J. G. Castillo, T. Talavage, S. Patterson, and C. Wiley, “Neuroanatomical distribution of five semantic components of verbs: Evidence from fMRI,” Brain Lang., vol. 107, pp. 16–43, 2008.

7. A. E. Booth, K. K. McGregor, and K. J. Rohlfing, “Socio-pragmatics and attention: Contributions to gesturally guided word learning in toddlers,” J. Lang. Learn. Develop., vol. 4, pp. 179–202, 2008.