Journal Home > Volume 1, Issue 2

Artificial Social Intelligence: A Comparative and Holistic View

Lifeng Fan 1, Manjie Xu 1,2, Zhihao Cao 1,3, Yixin Zhu 4 (corresponding author), Song-Chun Zhu 1,3,4 (corresponding author)

1 National Key Laboratory of General Artificial Intelligence, Beijing Institute for General Artificial Intelligence (BIGAI), Beijing 100080, China
2 School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
3 Department of Automation, Tsinghua University, Beijing 100084, China
4 Institute for Artificial Intelligence, Peking University, Beijing 100871, China

Abstract

In addition to a physical comprehension of the world, humans possess high social intelligence—the intelligence that senses social events, infers the goals and intents of others, and facilitates social interaction. Notably, humans are distinguished from their closest primate cousins by their social, rather than physical, cognitive skills. We believe that artificial social intelligence (ASI) will play a crucial role in shaping the future of artificial intelligence (AI). This article begins with a review of ASI from a cognitive science standpoint, covering social perception, theory of mind (ToM), and social interaction. Next, we examine the recently emerged computational counterpart in the AI community. Finally, we provide an in-depth discussion of topics related to ASI.

Keywords: communication, social intelligence, theory of mind (ToM), human-machine teaming

References (193)

[1]
J. McCarthy, What is AI?, http://www-formal.stanford.edu/jmc/whatisai.html, 2004.
[2]

T. Ziemke and S. Thellman, Do we really want AI to be human-like? Sci. Robot., vol. 7, no. 68, p. eadd0641, 2022.

[3]

A. M. Turing, Computing machinery and intelligence, Mind, vol. 59, no. 236, pp. 433–460, 1950.

[4]
M. I. Posner, Foundations of Cognitive Science. Cambridge, MA, USA: MIT Press, 1989.
[5]

V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al., Human-level control through deep reinforcement learning, Nature, vol. 518, no. 7540, pp. 529–533, 2015.

[6]

J. B. Tenenbaum, C. Kemp, T. L. Griffiths, and N. D. Goodman, How to grow a mind: Statistics, structure, and abstraction, Science, vol. 331, no. 6022, pp. 1279–1285, 2011.

[7]

B. M. Lake, R. Salakhutdinov, and J. B. Tenenbaum, Human-level concept learning through probabilistic program induction, Science, vol. 350, no. 6266, pp. 1332–1338, 2015.

[8]

B. M. Lake, T. D. Ullman, J. B. Tenenbaum, and S. J. Gershman, Building machines that learn and think like people, Behav. Brain Sci., vol. 40, p. e253, 2017.

[9]

Y. Zhu, T. Gao, L. Fan, S. Huang, M. Edmonds, H. Liu, F. Gao, C. Zhang, S. Qi, Y. N. Wu, et al., Dark, beyond deep: A paradigm shift to cognitive AI with humanlike common sense, Engineering, vol. 6, no. 3, pp. 310–345, 2020.

[10]

T. Shu, Y. Peng, S. C. Zhu, and H. Lu, A unified psychological space for human perception of physical and social events, Cogn. Psychol., vol. 128, p. 101398, 2021.

[11]
T. Gerstenberg and J. B. Tenenbaum, Intuitive theories, in Oxford Handbook of Causal Reasoning, M. R. Waldmann, Ed. Oxford, UK: Oxford University Press, 2017, pp. 515–547.
[12]
E. S. Spelke, What Babies Know: Core Knowledge and Composition Volume 1. New York, NY, USA: Oxford University Press, 2022.
[13]

E. S. Spelke and K. D. Kinzler, Core knowledge, Dev. Sci., vol. 10, no. 1, pp. 89–96, 2007.

[14]

J. R. Kubricht, K. J. Holyoak, and H. Lu, Intuitive physics: Current research and controversies, Trends Cogn. Sci., vol. 21, no. 10, pp. 749–759, 2017.

[15]

A. Newell, Physical symbol systems, Cogn. Sci., vol. 4, no. 2, pp. 135–183, 1980.

[16]

I. Biederman, R. J. Mezzanotte, and J. C. Rabinowitz, Scene perception: Detecting and judging objects undergoing relational violations, Cogn. Psychol., vol. 14, no. 2, pp. 143–177, 1982.

[17]

P. W. Battaglia, J. B. Hamrick, and J. B. Tenenbaum, Simulation as an engine of physical scene understanding, Proc. Natl. Acad. Sci., vol. 110, no. 45, pp. 18327–18332, 2013.

[18]
C. Bates, P. W. Battaglia, I. Yildirim, and J. B. Tenenbaum, Humans predict liquid dynamics using probabilistic simulation, in Proc. 37th Annu. Meeting of the Cognitive Science Society, Pasadena, CA, USA, 2015, pp. 172–176.
[19]
W. Liang, Y. Zhao, Y. Zhu, and S. C. Zhu, Evaluating human cognition of containing relations with physical simulation, in Proc. 37th Annu. Meeting of the Cognitive Science Society, Pasadena, CA, USA, 2015, pp. 1356–1361.
[20]
J. Kubricht, C. Jiang, Y. Zhu, S. C. Zhu, D. Terzopoulos, and H. Lu, Probabilistic simulation predicts human performance on viscous fluid-pouring problem, in Proc. 38th Annu. Meeting of the Cognitive Science Society, Philadelphia, PA, USA, 2016, pp. 1805–1810.
[21]
J. Kubricht, Y. Zhu, C. Jiang, D. Terzopoulos, S. C. Zhu, and H. Lu, Consistent probabilistic simulation underlying human judgment in substance dynamics, in Proc. 39th Annu. Meeting of the Cognitive Science Society, London, UK, 2017, pp. 700–705.
[22]

T. D. Ullman, E. Spelke, P. Battaglia, and J. B. Tenenbaum, Mind games: Game engines as an architecture for intuitive physics, Trends Cogn. Sci., vol. 21, no. 9, pp. 649–665, 2017.

[23]

T. Ye, S. Qi, J. Kubricht, Y. Zhu, H. Lu, and S. C. Zhu, The Martian: Examining human physical judgments across virtual gravity fields, IEEE Trans. Vis. Comput. Graph., vol. 23, no. 4, pp. 1399–1408, 2017.

[24]
K. Smith, L. Mei, S. Yao, J. Wu, E. S. Spelke, J. Tenenbaum, and T. D. Ullman, The fine structure of surprise in intuitive physics: When, why, and how much?, in Proc. 42nd Annu. Meeting of the Cognitive Science Society, virtual, 2020, pp. 3048–3054.
[25]

L. S. Piloto, A. Weinstein, P. Battaglia, and M. Botvinick, Intuitive physics learning in a deep-learning model inspired by developmental psychology, Nat. Hum. Behav., vol. 6, no. 9, pp. 1257–1267, 2022.

[26]
S. Li, K. Wu, C. Zhang, and Y. Zhu, On the learning mechanisms in physical reasoning, presented at the 36th Conf. Neural Information Processing Systems, New Orleans, LA, USA, 2022.
[27]
J. Wu, I. Yildirim, J. J. Lim, W. T. Freeman, and J. B. Tenenbaum, Galileo: Perceiving physical object properties by integrating a physics engine with deep learning, in Proc. 28th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2015, pp. 127–135.
[28]
Y. Zhu, C. Jiang, Y. Zhao, D. Terzopoulos, and S. C. Zhu, Inferring forces and learning human utilities from videos, in Proc. 2016 Conf. Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 3823–3833.
[29]
J. Wu, E. Lu, P. Kohli, W. T. Freeman, and J. B. Tenenbaum, Learning to see physics via visual de-animation, in Proc. 31st Int. Conf. Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 152–163.
[30]
S. Qi, Y. Zhu, S. Huang, C. Jiang, and S. C. Zhu, Human-centric indoor scene synthesis using stochastic grammar, in Proc. 2018 IEEE/CVF Conf. Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 5899–5908.
[31]

C. Jiang, S. Qi, Y. Zhu, S. Huang, J. Lin, L. F. Yu, D. Terzopoulos, and S. C. Zhu, Configurable 3D scene synthesis and 2D image rendering with per-pixel ground truth using stochastic grammars, Int. J. Comput. Vis., vol. 126, no. 9, pp. 920–941, 2018.

[32]
W. Liang, Y. Zhu, and S. C. Zhu, Tracking occluded objects and recovering incomplete trajectories by reasoning about containment relations and human actions, in Proc. 32nd AAAI Conf. Artificial Intelligence and 30th Innovative Applications of Artificial Intelligence Conf. and 8th AAAI Symp. Educational Advances in Artificial Intelligence, New Orleans, LA, USA, 2018, pp. 7106–7113.
[33]
S. Huang, S. Qi, Y. Zhu, Y. Xiao, Y. Xu, and S. C. Zhu, Holistic 3D scene parsing and reconstruction from a single RGB image, in Proc. 15th European Conf. Computer Vision (ECCV), Munich, Germany, 2018, pp. 194–211.
[34]
S. Huang, S. Qi, Y. Xiao, Y. Zhu, Y. N. Wu, and S. C. Zhu, Cooperative holistic scene understanding: Unifying 3D object, layout, and camera pose estimation, in Proc. 32nd Conf. Neural Information Processing Systems, Montréal, Canada, 2018, pp. 206–217.
[35]
Y. Chen, S. Huang, T. Yuan, Y. Zhu, S. Qi, and S. C. Zhu, Holistic++ scene understanding: Single-view 3D holistic scene parsing and human pose estimation with human-object interaction and physical commonsense, in Proc. 2019 IEEE/CVF Int. Conf. Computer Vision (ICCV), Seoul, Republic of Korea, 2019, pp. 8647–8656.
[36]
B. Zheng, Y. Zhao, J. C. Yu, K. Ikeuchi, and S. C. Zhu, Beyond point clouds: Scene understanding by reasoning geometry and physics, in Proc. Conf. Computer Vision and Pattern Recognition, Portland, OR, USA, 2013, pp. 3127–3134.
[37]

B. Zheng, Y. Zhao, J. Yu, K. Ikeuchi, and S. C. Zhu, Scene understanding by reasoning stability and safety, Int. J. Comput. Vis., vol. 112, no. 2, pp. 221–238, 2015.

[38]
Y. Zhu, Y. Zhao, and S. C. Zhu, Understanding tools: Task-oriented object modeling, learning and recognition, in Proc. 2015 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 2015, pp. 2855–2864.
[39]
M. Han, Z. Zhang, Z. Jiao, X. Xie, Y. Zhu, S. C. Zhu, and H. Liu, Reconstructing interactive 3D scenes by panoptic mapping and CAD model alignments, in Proc. 2021 IEEE Int’l Conf. Robotics and Automation (ICRA), Xi'an, China, 2021, pp. 12199–12206.
[40]

Z. Zhang, Z. Jiao, W. Wang, Y. Zhu, S. C. Zhu, and H. Liu, Understanding physical effects for effective tool-use, IEEE Robot. Automat. Lett., vol. 7, no. 4, pp. 9469–9476, 2022.

[41]

M. Han, Z. Zhang, Z. Jiao, X. Xie, Y. Zhu, S. C. Zhu, and H. Liu, Scene reconstruction with functional objects for robot autonomy, Int. J. Comput. Vis., vol. 130, no. 12, pp. 2940–2961, 2022.

[42]

T. L. Griffiths and J. B. Tenenbaum, Theory-based causal induction, Psychol. Rev., vol. 116, no. 4, pp. 661–716, 2009.

[43]
M. Edmonds, J. Kubricht, C. Summers, Y. Zhu, B. Rothrock, S. C. Zhu, and H. Lu, Human causal transfer: Challenges for deep reinforcement learning, in Proc. 40th Annu. Meeting of the Cognitive Science Society, Madison, WI, USA, 2018, pp. 324–329.
[44]
M. Edmonds, S. Qi, Y. Zhu, J. Kubricht, S. C. Zhu, and H. Lu, Decomposing human causal learning: Bottom-up associative learning and top-down schema reasoning, in Proc. 41st Annu. Meeting of the Cognitive Science Society, Montreal, Canada, 2019, pp. 1696–1702.
[45]

M. Edmonds, F. Gao, H. Liu, X. Xie, S. Qi, B. Rothrock, Y. Zhu, Y. N. Wu, H. Lu, and S. C. Zhu, A tale of two explanations: Enhancing human trust by explaining robot behavior, Sci. Robot., vol. 4, no. 37, p. eaay4663, 2019.

[46]
M. Edmonds, X. Ma, S. Qi, Y. Zhu, H. Lu, and S. C. Zhu, Theory-based causal transfer: Integrating instance-level induction and abstract-level structure learning, in Proc. AAAI Conference on Artificial Intelligence, New York, NY, USA, 2020, pp. 1283–1291.
[47]
C. Zhang, B. Jia, M. Edmonds, S. C. Zhu, and Y. Zhu, Acre: Abstract causal reasoning beyond covariation, in Proc. 2021 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, pp. 10638–10648.
[48]
M. Xu, G. Jiang, C. Zhang, S. C. Zhu, and Y. Zhu, EST: Evaluating scientific thinking in artificial agents, arXiv preprint arXiv: 2206.09203, 2022.
[49]
C. Zhang, S. Xie, B. Jia, Y. N. Wu, S. C. Zhu, and Y. Zhu, Learning algebraic representation for systematic generalization in abstract reasoning, in Proc. 17th European Conf. Computer Vision, Tel Aviv, Israel, 2022, pp. 692–709.
[50]

B. Falkenhainer, K. D. Forbus, and D. Gentner, The structure-mapping engine: Algorithm and examples, Artif. Intell., vol. 41, no. 1, pp. 1–63, 1989.

[51]

P. N. Johnson-Laird, Mental models and human reasoning, Proc. Natl. Acad. Sci., vol. 107, no. 43, pp. 18243–18250, 2010.

[52]
C. Zhang, F. Gao, B. Jia, Y. Zhu, and S. C. Zhu, Raven: A dataset for relational and analogical visual reasoning, in Proc. 2019 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 5312–5322.
[53]
C. Zhang, B. Jia, F. Gao, Y. Zhu, H. Lu, and S. C. Zhu, Learning perceptual inference by contrasting, in Proc. 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, pp. 1075–1087.
[54]
W. Zhang, C. Zhang, Y. Zhu, and S. C. Zhu, Machine number sense: A dataset of visual arithmetic problems for abstract and relational reasoning, in Proc. AAAI Conf. Artificial Intelligence, New York, NY, USA, 2020, pp. 1332–1340.
[55]
C. Zhang, B. Jia, S. C. Zhu, and Y. Zhu, Abstract spatial-temporal reasoning via probabilistic abduction and execution, in Proc. 2021 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, pp. 9731–9741.
[56]

A. Hafri and C. Firestone, The perception of relations, Trends Cogn. Sci., vol. 25, no. 6, pp. 475–492, 2021.

[57]
M. Tomasello, Do apes ape? in Social Learning in Animals: The Roots of Culture, C. M. Heyes and B. G. Galef, Jr., Eds. San Diego, CA, USA: Academic Press, 1996, pp. 319–346.
[58]
M. Tomasello, Origins of Human Communication. Cambridge, MA, USA: MIT Press, 2010.
[59]
S. Kita, Pointing: Where Language, Culture, and Cognition Meet. New York, NY, USA: Psychology Press, 2003.
[60]
R. M. Scott, E. Roby, and R. Baillargeon, How sophisticated is infants’ theory of mind?, in The Cambridge Handbook of Cognitive Development, O. Houdé and G. Borst, Eds. Cambridge, UK: Cambridge University Press, 2022, pp. 242–268.
[61]

E. Herrmann, J. Call, M. V. Hernández-Lloreda, B. Hare, and M. Tomasello, Humans have evolved specialized skills of social cognition: The cultural intelligence hypothesis, Science, vol. 317, no. 5843, pp. 1360–1366, 2007.

[62]

E. L. Thorndike, Intelligence and its uses, Harper’s Magazine, vol. 140, pp. 227–235, 1920.

[63]

J. Williams, S. M. Fiore, and F. Jentsch, Supporting artificial social intelligence with theory of mind, Front. Artif. Intell., vol. 5, p. 750763, 2022.

[64]

D. Silvera, M. Martinussen, and T. I. Dahl, The Tromsø Social Intelligence Scale, a self-report measure of social intelligence, Scand. J. Psychol., vol. 42, no. 4, pp. 313–319, 2001.

[65]
J. Launchbury, A DARPA perspective on artificial intelligence, https://www.darpa.mil/about-us/darpa-perspective-on-ai, 2017.
[66]
K. Jiang, S. Stacy, A. Chan, C. Wei, F. Rossano, Y. Zhu, and T. Gao, Individual vs. joint perception: A pragmatic model of pointing as smithian helping, in Proc. 43rd Annu. Meeting of the Cognitive Science Society, Vienna, Austria, 2021, pp. 1781–1787.
[67]
K. Jiang, A. Dahmani, S. Stacy, B. Jiang, F. Rossano, Y. Zhu, and T. Gao, What is the point? A theory of mind model of relevance, in Proc. 44th Annu. Meeting of the Cognitive Science Society, Toronto, Ontario, Canada, 2022, pp. 3369–3375.
[68]
L. Wittgenstein, The Big Typescript: TS 213. Hoboken, NJ, USA: Wiley-Blackwell, 2012.
[69]
J. Aru, A. Labash, O. Corcoll, and R. Vicente, Mind the gap: Challenges of deep learning approaches to theory of mind, Artif. Intell. Rev., doi: 10.1007/s10462-023-10401-x.
[70]

K. Frankish, Dual-process and dual-system theories of reasoning, Philosophy Compass, vol. 5, no. 10, pp. 914–926, 2010.

[71]

B. J. Scholl and P. D. Tremoulet, Perceptual causality and animacy, Trends Cogn. Sci., vol. 4, no. 8, pp. 299–309, 2000.

[72]
A. Michotte, The emotions regarded as functional connections, in Michotte's Experimental Phenomenology of Perception, G. Thinés, A. Costall, and G. Butterworth, Eds. Abingdon, UK: Routledge, 1991, pp. 103–116.
[73]

F. Heider and M. Simmel, An experimental study of apparent behavior, Amer. J. Psychol., vol. 57, no. 2, pp. 243–259, 1944.

[74]

D. H. Rakison and D. Poulin-Dubois, Developmental origin of the animate–inanimate distinction, Psychol. Bull., vol. 127, no. 2, pp. 209–228, 2001.

[75]

H. M. Wellman and D. Estes, Early understanding of mental entities: A reexamination of childhood realism, Child Dev., vol. 57, no. 4, pp. 910–923, 1986.

[76]
A. Michotte, The Perception of Causality. London, UK: Routledge, 2017.
[77]

D. S. Berry, S. J. Misovich, K. J. Kean, and R. M. Baron, Effects of disruption of structure and motion on perceptions of social causality, Pers. Soc. Psychol. Bull., vol. 18, no. 2, pp. 237–244, 1992.

[78]

Y. Luo, L. Kaufman, and R. Baillargeon, Young infants’ reasoning about physical events involving inert and self-propelled objects, Cogn. Psychol., vol. 58, no. 4, pp. 441–486, 2009.

[79]

G. Gergely, Z. Nádasdy, G. Csibra, and S. Bíró, Taking the intentional stance at 12 months of age, Cognition, vol. 56, no. 2, pp. 165–193, 1995.

[80]

G. Csibra, G. Gergely, S. Bíró, O. Koós, and M. Brockbank, Goal attribution without agency cues: The perception of ‘pure reason’ in infancy, Cognition, vol. 72, no. 3, pp. 237–267, 1999.

[81]

T. Gao, G. E. Newman, and B. J. Scholl, The psychophysics of chasing: A case study in the perception of animacy, Cogn. Psychol., vol. 59, no. 2, pp. 154–179, 2009.

[82]

T. Gao, G. McCarthy, and B. J. Scholl, The wolfpack effect: Perception of animacy irresistibly influences interactive behavior, Psychol. Sci., vol. 21, no. 12, pp. 1845–1853, 2010.

[83]

T. Gao and B. J. Scholl, Chasing vs. stalking: Interrupting the perception of animacy, J. Exp. Psychol.: Hum. Percept. Perform., vol. 37, no. 3, pp. 669–684, 2011.

[84]

B. van Buren, T. Gao, and B. J. Scholl, What are the underlying units of perceived animacy? Chasing detection is intrinsically object-based, Psychon. Bull. Rev., vol. 24, no. 5, pp. 1604–1610, 2017.

[85]

D. Premack and G. Woodruff, Does the chimpanzee have a theory of mind? Behav. Brain Sci., vol. 1, no. 4, pp. 515–526, 1978.

[86]

T. Rusch, S. Steixner-Kumar, P. Doshi, M. Spezio, and J. Gläscher, Theory of mind and decision science: Towards a typology of tasks and computational models, Neuropsychologia, vol. 146, p. 107488, 2020.

[87]
N. Chevalier and A. Blaye, False-belief representation and attribution in preschoolers: Testing a graded-representation hypothesis, Curr. Psychol. Lett., vol. 18, no. 1, 2006.
[88]

Y. Barnes-Holmes, L. McHugh, and D. Barnes-Holmes, Perspective-taking and theory of mind: A relational frame account, Behav. Anal. Today, vol. 5, no. 1, pp. 15–25, 2004.

[89]

R. Fjelland, Why general artificial intelligence will not be realized, Humanit. Soc. Sci. Commun., vol. 7, p. 10, 2020.

[90]

H. Wimmer and J. Perner, Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children’s understanding of deception, Cognition, vol. 13, no. 1, pp. 103–128, 1983.

[91]

S. Baron-Cohen, A. M. Leslie, and U. Frith, Does the autistic child have a “theory of mind”? Cognition, vol. 21, no. 1, pp. 37–46, 1985.

[92]
H. M. Wellman, Developing a theory of mind, in The Wiley-Blackwell Handbook of Childhood Cognitive Development, U. Goswami, Ed. Chichester, West Sussex: John Wiley & Sons, 2011, pp. 258–284.
[93]

R. Saxe, S. Carey, and N. Kanwisher, Understanding other minds: Linking developmental psychology and functional neuroimaging, Annu. Rev. Psychol., vol. 55, pp. 87–124, 2004.

[94]

C. Langley, B. I. Cirstea, F. Cuzzolin, and B. J. Sahakian, Theory of mind and preference learning at the interface of cognitive science, neuroscience, and AI: A review, Front. Artif. Intell., vol. 5, p. 778852, 2022.

[95]

C. Westby and L. Robinson, A developmental perspective for promoting theory of mind, Top. Lang. Disord., vol. 34, no. 4, pp. 362–382, 2014.

[96]

J. Perner and B. Lang, Development of theory of mind and executive control, Trends Cogn. Sci., vol. 3, no. 9, pp. 337–344, 1999.

[97]

D. A. Baldwin and J. A. Baird, Discerning intentions in dynamic human action, Trends Cogn. Sci., vol. 5, no. 4, pp. 171–178, 2001.

[98]

A. L. Woodward, Infants selectively encode the goal object of an actor’s reach, Cognition, vol. 69, no. 1, pp. 1–34, 1998.

[99]
A. N. Meltzoff and R. Brooks, “Like me” as a building block for understanding other minds: Bodily acts, attention, and intention, in Intentions and Intentionality: Foundations of Social Cognition, B. F. Malle, L. J. Moses, and D. A. Baldwin, Eds. Cambridge, MA, USA: The MIT Press, 2001, pp. 171–191.
[100]

D. A. Baldwin, J. A. Baird, M. M. Saylor, and M. A. Clark, Infants parse dynamic action, Child Dev., vol. 72, no. 3, pp. 708–717, 2001.

[101]

M. Tomasello, M. Carpenter, J. Call, T. Behne, and H. Moll, Understanding and sharing intentions: The origins of cultural cognition, Behav. Brain Sci., vol. 28, no. 5, pp. 675–691, 2005.

[102]

A. N. Meltzoff, Understanding the intentions of others: Re-enactment of intended acts by 18-month-old children, Dev. Psychol., vol. 31, no. 5, pp. 838–850, 1995.

[103]

G. Gergely, H. Bekkering, and I. Király, Rational imitation in preverbal infants, Nature, vol. 415, no. 6873, p. 755, 2002.

[104]

A. L. Woodward, J. A. Sommerville, S. Gerson, A. M. E. Henderson, and J. Buresh, The emergence of intention attribution in infancy, Psychol. Learn. Motiv., vol. 51, pp. 187–222, 2009.

[105]
Z. X. Tan, J. L. Mann, T. Silver, J. B. Tenenbaum, and V. K. Mansinghka, Online Bayesian goal inference for boundedly-rational planning agents, in Proc. 34th Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2020, pp. 19238–19250.
[106]

F. Warneken and M. Tomasello, Altruistic helping in human infants and young chimpanzees, Science, vol. 311, no. 5765, pp. 1301–1303, 2006.

[107]

J. P. Roiser and B. J. Sahakian, Hot and cold cognition in depression, CNS Spectr., vol. 18, no. 3, pp. 139–149, 2013.

[108]

A. Gopnik and H. M. Wellman, Why the child’s theory of mind really is a theory, Mind & Language, vol. 7, no. 1-2, pp. 145–171, 1992.

[109]

R. M. Gordon, Folk psychology as simulation, Mind & Language, vol. 1, no. 2, pp. 158–171, 1986.

[110]
R. M. Gordon, ‘Radical’ simulationism, in Theories of Theories of Mind, P. Carruthers and P. K. Smith, Eds. Cambridge, UK: Cambridge University Press, 1996, pp. 11–21.
[111]

N. J. Emery and N. S. Clayton, Comparative social cognition, Annu. Rev. Psychol., vol. 60, pp. 87–113, 2009.

[112]

S. M. Schaafsma, D. W. Pfaff, R. P. Spunt, and R. Adolphs, Deconstructing and reconstructing theory of mind, Trends Cogn. Sci., vol. 19, no. 2, pp. 65–72, 2015.

[113]
T. J. Wiltshire, E. J. Lobato, J. Velez, F. Jentsch, and S. M. Fiore, An interdisciplinary taxonomy of social cues and signals in the service of engineering robotic social intelligence, in Proc. SPIE 9084, Unmanned Systems Technology XVI, Baltimore, MD, USA, 2014, p. 90840F.
[114]

N. J. Emery, The eyes have it: The neuroethology, function and evolution of social gaze, Neurosci. Biobehav. Rev., vol. 24, no. 6, pp. 581–604, 2000.

[115]

L. Fan, W. Wang, S. C. Zhu, X. Tang, and S. Huang, Understanding human gaze communication by spatio-temporal graph reasoning, in Proc. 2019 IEEE/CVF Int. Conf. Computer Vision (ICCV), Seoul, Republic of Korea, 2019, pp. 5723–5732.

[116]

H. Admoni and B. Scassellati, Social eye gaze in human-robot interaction: A review, J. Hum.-Robot Interact., vol. 6, no. 1, pp. 25–63, 2017.

[117]
C. Moore, P. J. Dunham, and P. Dunham, Joint Attention: Its Origins and Role in Development. London, UK: Psychology Press, 2014.
[118]

C. Moore and V. Corkum, Social understanding at the end of the first year of life, Dev. Rev., vol. 14, no. 4, pp. 349–372, 1994.

[119]
Y. Nagai, Understanding the development of joint attention from a viewpoint of cognitive developmental robotics, Ph.D. dissertation, Osaka University, Osaka, Japan, 2004.
[120]
L. Fan, Y. Chen, P. Wei, W. Wang, and S. C. Zhu, Inferring shared attention in social scene videos, in Proc. 2018 IEEE/CVF Conf. Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 6460–6468.
[121]

I. Brinck, The pragmatics of imperative and declarative pointing, Cogn. Sci. Quart., vol. 3, no. 4, pp. 429–446, 2004.

[122]

E. Bates, L. Camaioni, and V. Volterra, The acquisition of performatives prior to speech, Merrill-Palmer Quart., vol. 21, no. 3, pp. 205–226, 1975.

[123]
S. C. Levinson, On the human “interaction engine”, in Roots of Human Sociality, S. C. Levinson and N. J. Enfield, Eds. London: Routledge, 2020, pp. 39–69.
DOI
[124]

F. Rossano, J. Terwilliger, A. Bangerter, E. Genty, R. Heesen, and K. Zuberbühler, How 2- and 4-year-old children coordinate social interactions with peers, Phil. Trans. Roy. Soc. B, vol. 377, no. 1859, p. 20210100, 2022.

[125]

G. Pezzulo, The “interaction engine”: A common pragmatic competence across linguistic and nonlinguistic interactions, IEEE Trans. Autonom. Mental Dev., vol. 4, no. 2, pp. 105–123, 2012.

[126]

A. Cichocki and A. P. Kuleshov, Future trends for human-AI collaboration: A comprehensive taxonomy of AI/AGI using multiple intelligences and learning styles, Comput. Intell. Neurosci., vol. 2021, p. 8893795, 2021.

[127]
M. Tomasello, Why We Cooperate. Cambridge, MA, USA: MIT Press, 2009.
[128]
N. Tang, S. Stacy, M. Zhao, G. Marquez, and T. Gao, Bootstrapping an imagined we for cooperation, in Proc. 42nd Annu. Meeting of the Cognitive Science Society, virtual, 2020, pp. 2453–2456.
[129]
S. Stacy, Q. Zhao, M. Zhao, M. Kleiman-Weiner, and T. Gao, Intuitive signaling through an “imagined we”, in Proc. 42nd Annu. Meeting of the Cognitive Science Society, virtual, 2020, p. 1880.
[130]
S. E. T. Stacy, The imagined we: Shared Bayesian theory of mind for modeling communication, Ph.D. dissertation, University of California, Los Angeles, CA, USA, 2022.
[131]
N. Tang, S. Gong, Z. Liao, H. Xu, J. Zhou, M. Shen, and T. Gao, Jointly perceiving physics and mind: Motion, force and intention, in Proc. 43rd Annu. Meeting of the Cognitive Science Society, Vienna, Austria, 2021, pp. 735–741.
[132]
T. Shu, M. Kryven, T. D. Ullman, and J. Tenenbaum, Adventures in flatland: Perceiving social interactions under physical dynamics, in Proc. 42nd Annu. Meeting of the Cognitive Science Society, virtual, 2020, pp. 2901–2907.
[133]

T. Shu, Y. Peng, L. Fan, H. Lu, and S. C. Zhu, Perception of human interaction based on motion trajectories: From aerial videos to decontextualized animations, Top. Cogn. Sci., vol. 10, no. 1, pp. 225–241, 2018.

[134]

T. Gao, C. L. Baker, N. Tang, H. Xu, and J. B. Tenenbaum, The cognitive architecture of perceived animacy: Intention, attention, and memory, Cogn. Sci., vol. 43, no. 8, p. e12775, 2019.

[135]
P. Wei, Y. Liu, T. Shu, N. Zheng, and S. C. Zhu, Where and why are they looking? Jointly inferring human attention and intentions in complex tasks, in Proc. 2018 IEEE/CVF Conf. Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 6801–6809.
[136]

D. Xie, T. Shu, S. Todorovic, and S. C. Zhu, Learning and inferring “dark matter” and predicting human intents and trajectories in videos, IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, no. 7, pp. 1639–1652, 2018.

[137]
S. Holtzen, Y. Zhao, T. Gao, J. B. Tenenbaum, and S. C. Zhu, Inferring human intent from video by sampling hierarchical plans, in Proc. 2016 IEEE/RSJ Int. Conf. Intelligent Robots and Systems (IROS), Daejeon, Republic of Korea, 2016, pp. 1489–1496.
[138]
B. González and L. J. Chang, Computational models of mentalizing, in The Neural Basis of Mentalizing, M. Gilead and K. N. Ochsner, Eds. Cham, Switzerland: Springer, 2021, pp. 299–315.
[139]

W. Yoshida, R. J. Dolan, and K. J. Friston, Game theory of mind, PLoS Comput. Biol., vol. 4, no. 12, p. e1000254, 2008.

[140]

S. V. Albrecht and P. Stone, Autonomous agents modelling other agents: A comprehensive survey and open problems, Artif. Intell., vol. 258, pp. 66–95, 2018.

[141]

S. Arora and P. Doshi, A survey of inverse reinforcement learning: Challenges, methods and progress, Artif. Intell., vol. 297, p. 103500, 2021.

[142]
C. L. Baker, R. Saxe, and J. B. Tenenbaum, Bayesian theory of mind: Modeling joint belief-desire attribution, in Proc. 33rd Annu. Meeting of the Cognitive Science Society, Boston, MA, USA, 2011, pp. 2469–2474.
[143]
T. Yuan, H. Liu, L. Fan, Z. Zheng, T. Gao, Y. Zhu, and S. C. Zhu, Joint inference of states, robot knowledge, and human (false-) beliefs, in Proc. 2020 IEEE Int. Conf. Robotics and Automation (ICRA), Paris, France, 2020, pp. 5972–5978.
[144]
L. Fan, S. Qiu, Z. Zheng, T. Gao, S. C. Zhu, and Y. Zhu, Learning triadic belief dynamics in nonverbal communication from videos, in Proc. 2021 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2021, pp. 7308–7317.
[145]

B. Arslan, N. A. Taatgen, and R. Verbrugge, Five-year-olds’ systematic errors in second-order false belief tasks are due to first-order theory of mind strategy selection: A computational modeling study, Front. Psychol., vol. 8, p. 275, 2017.

[146]
I. Oguntola, D. Hughes, and K. Sycara, Deep interpretable models of theory of mind, in Proc. 2021 30th Int. Conf. Robot and Human Interactive Communication (RO-MAN), Vancouver, Canada, 2021, pp. 657–664.
[147]

Y. Zeng, Y. Zhao, T. Zhang, D. Zhao, F. Zhao, and E. Lu, A brain-inspired model of theory of mind, Front. Neurorobot., vol. 14, p. 60, 2020.

[148]

C. L. Baker, J. Jara-Ettinger, R. Saxe, and J. B. Tenenbaum, Rational quantitative attribution of beliefs, desires and percepts in human mentalizing, Nat. Hum. Behav., vol. 1, no. 4, p. 0064, 2017.

[149]
Y. Wen, Y. Yang, R. Luo, J. Wang, and W. Pan, Probabilistic recursive reasoning for multi-agent reinforcement learning, arXiv preprint arXiv: 1901.09207, 2019.
[150]
P. Moreno, E. Hughes, K. R. McKee, B. A. Pires, and T. Weber, Neural recursive belief states in multi-agent reinforcement learning, arXiv preprint arXiv: 2102.02274, 2021.
[151]
A. Hakimzadeh, Y. Xue, and P. Setoodeh, Interpretable reinforcement learning inspired by Piaget's theory of cognitive development, arXiv preprint arXiv: 2102.00572, 2021.
[152]

J. Jara-Ettinger, Theory of mind as inverse reinforcement learning, Curr. Opin. Behav. Sci., vol. 29, pp. 105–110, 2019.

[153]

L. Yuan, X. Gao, Z. Zheng, M. Edmonds, Y. N. Wu, F. Rossano, H. Lu, Y. Zhu, and S. C. Zhu, In situ bidirectional human-robot value alignment, Sci. Robot., vol. 7, no. 68, p. eabm4183, 2022.

[154]

H. de Weerd, R. Verbrugge, and B. Verheij, Negotiating with other minds: The role of recursive theory of mind in negotiation with incomplete information, Auton. Agents Multi-Agent Syst., vol. 31, no. 2, pp. 250–287, 2017.

[155]

H. de Weerd, D. Diepgrond, and R. Verbrugge, Estimating the use of higher-order theory of mind using computational agents, B. E. J. Theor. Econ., vol. 18, no. 2, p. 20160184, 2018.

[156]

A. Kanwal, W. M. Qazi, M. A. Altaf, A. Athar, M. Hussain, S. T. S. Bukhari, and A. T. Apasiba, A step towards the development of socio-cognitive agent, Lahore Garrison Univ. Res. J. Comput. Sci. Informat. Technol., vol. 4, no. 3, pp. 23–38, 2020.

[157]
R. Tejwani, Y. L. Kuo, T. Shu, B. Stankovits, D. Gutfreund, J. B. Tenenbaum, B. Katz, and A. Barbu, Incorporating rich social interactions into MDPs, in Proc. 2022 Int. Conf. Robotics and Automation (ICRA), Philadelphia, PA, USA, 2022, pp. 7395–7401.
DOI
[158]
R. Tejwani, Y. L. Kuo, T. Shu, B. Katz, and A. Barbu, Social interactions as recursive MDPs, in Proc. 5th Conf. Robot Learning, London, UK, 2022, pp. 949–958.
[159]

A. Panella and P. Gmytrasiewicz, Interactive POMDPs with finite-state models of other agents, Auton. Agents Multi-Agent Syst., vol. 31, no. 4, pp. 861–904, 2017.

[160]

M. Zhao, N. Tang, A. L. Dahmani, Y. Zhu, F. Rossano, and T. Gao, Sharing rewards undermines coordinated hunting, J. Comput. Biol., vol. 29, no. 9, pp. 1022–1030, 2022.

[161]
S. Stacy, C. Li, M. Zhao, Y. Yun, Q. Zhao, M. Kleiman-Weiner, and T. Gao, Modeling communication to coordinate perspectives in cooperation, in Proc. 43rd Annu. Meeting of the Cognitive Science Society, Vienna, Austria, 2021, pp. 1851–1857.
[162]
X. Gao, R. Gong, Y. Zhao, S. Wang, T. Shu, and S. Chun. Zhu, Joint mind modeling for explanation generation in complex human-robot collaborative tasks, in Proc. 2020 29th IEEE Int. Symp. Robot and Human Interactive Communication (RO-MAN), Naples, Italy, 2020, pp. 1119–1126.
DOI
[163]
M. C. Buehler, J. Adamy, and T. H. Weisswange, Theory of mind based assistive communication in complex human robot cooperation, arXiv preprint arXiv: 2109.01355, 2021.
[164]
Y. Wang, F. Zhong, J. Xu, and Y. Wang, ToM2C: Target-oriented multi-agent communication and cooperation with theory of mind, arXiv: 2111.09189, 2022.
[165]

J. Pöppel, S. Kahl, and S. Kopp, Resonating minds-emergent collaboration through hierarchical active inference, Cogn. Comput., vol. 14, no. 2, pp. 581–601, 2022.

[166]
V. Chidambaram, Y. H. Chiang, and B. Mutlu, Designing persuasive robots: How robots might persuade people using vocal and nonverbal cues, in Proc. 2012 7th ACM/IEEE Int. Conf. Human-Robot Interaction (HRI), Boston, MA, USA, 2012, pp. 293–300.
DOI
[167]

O. Nocentini, L. Fiorini, G. Acerbi, A. Sorrentino, G. Mancioppi, and F. Cavallo, A survey of behavioral models for social robots, Robotics, vol. 8, no. 3, p. 54, 2019.

[168]

T. J. Wiltshire, S. F. Warta, D. Barber, and S. M. Fiore, Enabling robotic social intelligence by engineering human social-cognitive mechanisms, Cogn. Syst. Res., vol. 43, pp. 190–207, 2017.

[169]
J. E. Laird, Introduction to soar, arXiv preprint arXiv: 2205.03854, 2022.
[170]

A. Lieto, M. Bhatt, A. Oltramari, and D. Vernon, The role of cognitive architectures in general artificial intelligence, Cogn. Syst. Res., vol. 48, pp. 1–3, 2018.

[171]
M. Vircikova, G. Magyar, and P. Sincak, The affective loop: A tool for autonomous and adaptive emotional human-robot interaction, in Robot Intelligence Technology and Applications 3, J. H. Kim, W. Yang, J. Jo, P. Sincak, and H. Myung, Eds. Cham, Switzerland: Springer, 2015, pp. 247–254.
DOI
[172]
J. Snaider, R. McCall, and S. Franklin, The LIDA framework as a general tool for AGI, in Proc. 4th Int. Conf. Artificial General Intelligence, Mountain View, CA, USA, 2011, pp. 133–142.
DOI
[173]
J. E. Laird, The Soar Cognitive Architecture, Cambridge, MA, USA: MIT Press, 2019.
[174]

J. R. Anderson, C. Lebiere, M. Lovett, and L. Reder, ACT-R: A higher-level account of processing capacity, Behav. Brain Sci., vol. 21, no. 6, pp. 831–832, 1998.

[175]

J. R. Anderson, D. Bothell, M. D. Byrne, S. Douglass, C. Lebiere, and Y. Qin, An integrated theory of the mind, Psychol. Rev., vol. 111, no. 4, pp. 1036–1060, 2004.

[176]

C. Breazeal, J. Gray, and M. Berlin, An embodied cognition approach to mindreading skills for socially intelligent robots, Int. J. Robot. Res., vol. 28, no. 5, pp. 656–680, 2009.

[177]

W. G. Kennedy, M. D. Bugajska, A. M. Harrison, and J. G. Trafton, “Like-me” simulation as an effective and cognitively plausible basis for social robotics, Int. J. Soc. Robot., vol. 1, no. 2, pp. 181–194, 2009.

[178]

C. Moulin-Frier, T. Fischer, M. Petit, G. Pointeau, J. Y. Puigbo, U. Pattacini, S. C. Low, D. Camilleri, P. Nguyen, M. Hoffmann, et al., Dac-H3: A proactive robot cognitive architecture to acquire and express knowledge about the world and the self, IEEE Trans. Cogn. Dev. Syst., vol. 10, no. 4, pp. 1005–1022, 2018.

[179]

A. M. Franchi, F. Mutti, and G. Gini, From learning to new goal generation in a bioinspired robotic setup, Adv. Robot., vol. 30, no. 11-12, pp. 795–805, 2016.

[180]
A. Netanyahu, T. Shu, B. Katz, A. Barbu, and J. B. Tenenbaum, PHASE: Physically-grounded abstract social events for machine social perception, arXiv: 2103.01933, 2021.
DOI
[181]
T. Shu, A. Bhandwaldar, C. Gan, K. A. Smith, S. Liu, D. Gutfreund, E. Spelke, J. B. Tenenbaum, and T. D. Ullman, AGENT: A benchmark for core psychological reasoning, arXiv: 2102.12321, 2021.
[182]
X. Puig, T. Shu, S. Li, Z. Wang, Y. H. Liao, J. B. Tenenbaum, S. Fidler, and A. Torralba, Watch-and-help: A challenge for social perception and human-AI collaboration, arXiv: 2010.09890, 2021.
[183]
M. Sap, H. Rashkin, D. Chen, R. LeBras, and Y. Choi, SocialiQA: Commonsense reasoning about social interactions, arXiv preprint arXiv: 1904.09728, 2019.
DOI
[184]

N. Bard, J. N. Foerster, S. Chandar, N. Burch, M. Lanctot, H. F. Song, E. Parisotto, V. Dumoulin, S. Moitra, E. Hughes, et al., The Hanabi challenge: A new frontier for AI research, Artif. Intell., vol. 280, p. 103216, 2020.

[185]

H. Shevlin, K. Vold, M. Crosby, and M. Halina, The limits of machine intelligence, EMBO Rep., vol. 20, no. 10, p. e49177, 2019.

[186]
F. Lievens and D. Chan, Practical intelligence, emotional intelligence, and social intelligence, in Handbook of Employee Selection, J. L. Farr and N. T. Tippins, Eds. New York, NY, USA: Routledge, 2017, pp. 342–364.
DOI
[187]

M. Crosby, B. Beyret, and M. Halina, The animal-AI Olympics, Nat. Mach. Intell., vol. 1, no. 5, pp. 257–257, 2019.

[188]
M. Schurz, J. Radua, M. Aichhorn, F. Richlan, and J. Perner, Fractionating theory of mind: A meta-analysis of functional brain imaging studies, Neurosci. Biobehav. Rev., vol. 42, pp. 9–34, 2014.
DOI
[189]
L. Smith and M. Gasser, The development of embodied cognition: Six lessons from babies, Artif. Life, vol. 11, nos. 1–2, pp. 13–29, 2005.
DOI
[190]
G. M. van de Ven and A. S. Tolias, Three scenarios for continual learning, arXiv preprint arXiv: 1904.07734, 2019.
[191]
R. Caruana, Multitask learning, Mach. Learn., vol. 28, no. 1, pp. 41–75, 1997.
DOI
[192]
Y. Zhang and Q. Yang, A survey on multi-task learning, IEEE Trans. Knowl. Data Eng., vol. 34, no. 12, pp. 5586–5609, 2021.
DOI
[193]
J. Vanschoren, Meta-learning: A survey. arXiv preprint arXiv: 1810.03548, 2018.

Publication history

Received: 21 October 2022
Revised: 03 January 2023
Accepted: 07 January 2023
Published: 10 March 2023
Issue date: December 2022

Copyright

© The author(s) 2022

Acknowledgements

The authors would like to thank Prof. Tao Gao (UCLA) for brainstorming discussions during the authors' time at UCLA, Ms. Zhen Chen (BIGAI) and Ms. Qing Lei (PKU) for preparing the figures, and the two anonymous reviewers for their constructive feedback. This work was supported in part by the National Key R&D Program of China (No. 2022ZD0114900) and the Beijing Nova Program.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).