Sally–Anne test

The Sally–Anne test is a psychological test of a person's ability to attribute false beliefs to others.^[1] The flagship implementation of the Sally–Anne test was by Simon Baron-Cohen, Alan M. Leslie, and Uta Frith (1985);^[2] in 1988, Leslie and Frith repeated the experiment with human actors (rather than dolls) and found similar results.^[3]

Test description

To develop an efficacious test, Baron-Cohen et al. modified the puppet play paradigm of Wimmer and Perner (1983), in which puppets represent tangible characters in a story, rather than hypothetical characters of pure storytelling.

In the test, after the dolls are introduced, the child is asked a control question: to recall the dolls' names (the Naming Question). A short skit is then enacted: Sally takes a marble and hides it in her basket. She then "leaves" the room and goes for a walk. While she is away, Anne takes the marble out of Sally's basket and puts it in her own box. Sally is then reintroduced and the child is asked the key question, the Belief Question: "Where will Sally look for her marble?"^[2]
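The logic of the skit can be illustrated with a minimal sketch that keeps the real location of the marble separate from each character's belief about it. The class and function names here (`Agent`, `Scene`, `run_sally_anne`) are illustrative assumptions, not part of the original study; the only rule modelled is that a character updates a belief when present to witness a move.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    present: bool = True
    # Maps an object to the location this agent believes it is in.
    beliefs: dict = field(default_factory=dict)


class Scene:
    """Tracks where objects really are and what each agent believes."""

    def __init__(self, agents):
        self.agents = {a.name: a for a in agents}
        self.reality = {}  # object -> actual location

    def move(self, obj, location):
        # Reality changes for everyone, but only agents who are present
        # witness the move and update their beliefs.
        self.reality[obj] = location
        for agent in self.agents.values():
            if agent.present:
                agent.beliefs[obj] = location

    def leave(self, name):
        self.agents[name].present = False

    def enter(self, name):
        self.agents[name].present = True


def run_sally_anne():
    scene = Scene([Agent("Sally"), Agent("Anne")])
    scene.move("marble", "Sally's basket")  # Sally hides the marble
    scene.leave("Sally")                    # Sally goes for a walk
    scene.move("marble", "Anne's box")      # Anne moves it, unseen by Sally
    scene.enter("Sally")                    # Sally is reintroduced
    return scene


scene = run_sally_anne()
belief_answer = scene.agents["Sally"].beliefs["marble"]
reality_answer = scene.reality["marble"]
print("Where will Sally look for her marble?", belief_answer)
print("Where is the marble really?", reality_answer)
```

In this sketch the Belief Question is answered from Sally's stored belief rather than from `reality`, which is the distinction the test is designed to probe.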

In the Baron-Cohen, Leslie, and Frith study, 61 children (20 autistic, 14 with Down syndrome, and 27 clinically unimpaired) were tested with "Sally" and "Anne".^[2]

Outcomes

For a participant to pass this test, they must answer the Belief Question correctly by indicating that Sally believes that the marble is in her own basket. This answer is consistent with Sally's perspective, but not with the participant's own. If the participant cannot take an alternative perspective, they will indicate that Sally has cause to believe, as the participant does, that the marble has moved. Passing the test is thus seen as evidence that the participant understands that Sally has her own beliefs, which may not correspond to reality; this is the core requirement of theory of mind.^[4]

In the Baron-Cohen et al. (1985) study, 23 of the 27 clinically unimpaired children (85%) and 12 of the 14 children with Down syndrome (86%) answered the Belief Question correctly. However, only four of the 20 children with autism (20%) answered correctly. Overall, children under the age of four, along with most autistic children (of older ages), answered the Belief Question with "Anne's box", seemingly unaware that Sally does not know her marble has been moved.^[2]
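The percentages quoted above follow directly from the reported counts. As a small worked check (the variable and function names are illustrative only), the pass rates can be recomputed as follows:

```python
# Belief Question counts as reported above: (passed, total) per group.
results = {
    "clinically unimpaired": (23, 27),
    "Down syndrome": (12, 14),
    "autistic": (4, 20),
}


def pass_rate(passed, total):
    """Return the pass rate as a whole-number percentage."""
    return round(100 * passed / total)


rates = {group: pass_rate(p, n) for group, (p, n) in results.items()}
for group, rate in rates.items():
    passed, total = results[group]
    print(f"{group}: {passed}/{total} = {rate}%")
```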

Criticism

While Baron-Cohen et al.'s data have been purported to indicate a lack of theory of mind in autistic children, other factors may have affected the results. For instance, autistic individuals may pass the cognitively simpler recall task, but language issues in both autistic children and deaf controls tend to confound results.^[5]

Ruffman, Garnham, and Rideout (2001) further investigated links between the Sally–Anne test and autism in terms of eye gaze as a social communicative function. They added a third possible location for the marble: the pocket of the investigator. When autistic children and children with moderate learning disabilities were tested in this format, they found that both groups answered the Belief Question equally well; however, participants with moderate learning disabilities reliably looked at the correct location of the marble, while autistic participants did not, even if the autistic participant answered the question correctly.^[6] These results may be an expression of the social deficits relevant to autism.

Tager-Flusberg (2007) states that in spite of the empirical findings with the Sally–Anne task, there is a growing uncertainty among scientists about the importance of the underlying theory-of-mind hypothesis of autism. In all studies that have been done, some children with autism pass false-belief tasks such as Sally–Anne.^[7]

In other hominids

Eye tracking of chimpanzees, bonobos, and orangutans suggests that all three anticipate the false beliefs of a subject in a King Kong suit, and pass the Sally–Anne test.^[8]^[9]

Artificial intelligence

On March 22, 2023, a research team from Microsoft released a paper showing that the LLM-based AI system GPT-4 could pass an instance of the Sally–Anne test, which the authors interpret as "suggest[ing] that GPT-4 has a very advanced level of theory of mind."^[13] However, the generality of this finding has been disputed by several other papers, which indicate that GPT-4's ability to reason about the beliefs of other agents remains limited (59% accuracy on the ToMi benchmark),^[14] and is not robust to "adversarial" changes to the Sally–Anne test that humans flexibly handle.^[15]^[16] While some authors argue that the performance of GPT-4 on Sally–Anne-like tasks can be increased to 100% via improved prompting strategies,^[17] this approach appears to improve accuracy to only 73% on the larger ToMi dataset.^[15] In related work, researchers have found that LLMs do not exhibit human-like intuitions about the goals that other agents reach for,^[18] and that they do not reliably produce graded inferences about the goals of other agents from observed actions.^[19] The degree to which LLMs such as GPT-4 can perform social reasoning thus remains an active area of research.
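Accuracy figures of the kind quoted above come from scoring a system's answers against the expected answer for each story. The following is a minimal sketch of such scoring, not the method of any cited paper: the two items are hand-written for illustration and are not drawn from ToMi or any other benchmark, `answer_fn` is a hypothetical stand-in for whatever system is being evaluated, and the substring match is a deliberately crude assumption.

```python
from typing import Callable, NamedTuple


class Item(NamedTuple):
    story: str
    question: str
    expected: str  # location the character should believe


# Two hand-written items for illustration; not taken from any benchmark.
ITEMS = [
    Item(  # false-belief item: Sally does not see the move
        story=("Sally puts her marble in the basket. Sally leaves. "
               "Anne moves the marble to the box. Sally returns."),
        question="Where will Sally look for her marble?",
        expected="basket",
    ),
    Item(  # true-belief control: Sally sees the move
        story=("Sally puts her marble in the basket. "
               "Anne moves the marble to the box while Sally watches."),
        question="Where will Sally look for her marble?",
        expected="box",
    ),
]


def accuracy(answer_fn: Callable[[str, str], str], items=ITEMS) -> float:
    """Fraction of items whose answer contains the expected location."""
    correct = sum(
        item.expected in answer_fn(item.story, item.question).lower()
        for item in items
    )
    return correct / len(items)


def reality_baseline(story: str, question: str) -> str:
    # Hypothetical stand-in for a model: always names the marble's actual
    # final location, ignoring what Sally witnessed.
    return "box"


print(accuracy(reality_baseline))
```

A baseline that always reports the marble's real location passes the true-belief control but fails the false-belief item, which is why evaluations typically pair the two kinds of story.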

References

[1] S2CID 17014009.

[2] S2CID 14955234.

[3] doi:10.1111/j.2044-835X.1988.tb01104.x.

[4] doi:10.1017/S0140525X00076512.

[5] "Autism and Theory of Mind: A Theory in Transition". www.jeramyt.org. Retrieved 9 October 2016.

[6] PMID 11806690.

[7] S2CID 16474678.

[8] PMID 27846501.

[9] ISSN 0261-3077. Retrieved 2016-10-09.

[10] Rabinowitz, Neil; Perbet, Frank; Song, Francis; Zhang, Chiyuan; Eslami, S. M. Ali; Botvinick, Matthew (2018-07-03). "Machine Theory of Mind". Proceedings of the 35th International Conference on Machine Learning. PMLR: 4218–4227.

[11] ISBN 978-1-4503-7518-4.

[12] S2CID 3338320.

[13] arXiv:2303.12712v5 [cs.CL].

[14] arXiv:2210.13312 [cs.CL].

[15] arXiv:2305.14763 [cs.CL].

[16] arXiv:2302.08399 [cs.AI].

[17] arXiv:2304.11490 [cs.AI].

[18] Ruis, Laura; Findeis, Arduin; Bradley, Herbie; Rahmani, Hossein A.; Choe, Kyoung Whan; Grefenstette, Edward; Rocktäschel, Tim (2023-06-29). "Do LLMs selectively encode the goal of an agent's reach?".

[19] arXiv:2306.14325 [cs.AI].
