Can ChatGPT Mimic Theory of Mind? Psychology Is Probing AI’s Inner Workings


If you’ve ever vented to ChatGPT about troubles in life, the responses can sound empathetic. The chatbot delivers affirming support, and—when prompted—even gives advice like a best friend.

Unlike older chatbots, the seemingly “empathic” nature of the latest AI models has already galvanized the psychotherapy community, with many wondering if these models can assist in therapy.

The ability to infer other people’s mental states is a core aspect of everyday interaction. Called “theory of mind,” it lets us guess what’s going on in someone else’s mind, often by interpreting speech. Are they being sarcastic? Are they lying? Are they implying something that’s not overtly said?

“People care about what other people think and expend a lot of effort thinking about what is going on in other minds,” wrote Dr. Cristina Becchio and colleagues at the University Medical Center Hamburg-Eppendorf in a new study in Nature Human Behaviour.

In the study, the scientists asked if ChatGPT and other similar chatbots—which are based on machine learning algorithms called large language models—can also guess other people’s mindsets. Using a series of psychology tests tailored for certain aspects of theory of mind, they pitted two families of large language models—OpenAI’s GPT series and Meta’s LLaMA 2—against over 1,900 human participants.

GPT-4, the algorithm behind ChatGPT, performed at, or even above, human levels in some tasks, such as identifying irony. Meanwhile, LLaMA 2 beat both humans and GPT at detecting faux pas—when someone says something they’re not meant to say but doesn’t realize it.

To be clear, the results don’t confirm LLMs have theory of mind. Rather, they show these algorithms can mimic certain aspects of this core concept that “defines us as humans,” wrote the authors.

What’s Not Said

By roughly four years old, children already know that people don’t always think alike. We have different beliefs, intentions, and needs. By placing themselves into other people’s shoes, kids can begin to understand other perspectives and gain empathy.

First introduced in 1978, theory of mind is a lubricant for social interactions. For example, if you’re standing near a closed window in a stuffy room, and someone nearby says, “It’s a bit hot in here,” you have to think about their perspective to intuit they’re politely asking you to open the window.

When the ability breaks down—for example, in autism—it becomes difficult to grasp other people’s emotions, desires, and intentions, and to pick up on deception. And we’ve all experienced texts or emails leading to misunderstandings when a recipient misinterprets the sender’s meaning.

So, what about the AI models behind chatbots?

Man Versus Machine

Back in 2018, Dr. Alan Winfield, a professor in the ethics of robotics at the University of the West of England, championed the idea that theory of mind could let AI “understand” people and other robots’ intentions. At the time, he proposed giving an algorithm a programmed internal model of itself, with common sense about social interactions built in rather than learned.

Large language models take a completely different approach, ingesting massive datasets to generate human-like responses that feel empathetic. But do they exhibit signs of theory of mind?

Over the years, psychologists have developed a battery of tests to study how we gain the ability to model another’s mindset. The new study pitted two versions of OpenAI’s GPT models (GPT-4 and GPT-3.5) and Meta’s LLaMA-2-Chat against 1,907 healthy human participants. Based solely on text descriptions of social scenarios, and using a comprehensive battery of tests spanning different theory of mind abilities, they had to gauge a fictional person’s “mindset.”

Each test is well established in psychology for measuring theory of mind in humans.

The first, called “false belief,” is often used to test toddlers as they gain a sense of self and recognition of others. As an example, you listen to a story: Lucy and Mia are in the kitchen with a carton of orange juice in the cupboard. When Lucy leaves, Mia puts the juice in the fridge. Where will Lucy look for the juice when she comes back?

Both humans and AI guessed nearly perfectly that the person who’d left the room when the juice was moved would look for it where they last remembered seeing it. But slight changes tripped the AI up. When the scenario changed—for example, if the juice was moved between two transparent containers—GPT models struggled to guess the answer. (Though, for the record, humans weren’t perfect on this either in the study.)
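The structure of a false-belief probe is simple enough to sketch in code. The snippet below is a minimal, hypothetical illustration of the logic—the vignette wording and the keyword-based scoring rule are assumptions for illustration, not the study’s actual materials or grading protocol:

```python
# Hypothetical sketch of a false-belief probe. The vignette text and the
# keyword-based scoring are illustrative assumptions, not the actual
# materials or grading protocol used in the study.

VIGNETTE = (
    "Lucy and Mia are in the kitchen. A carton of orange juice is in the "
    "cupboard. Lucy leaves. While she is gone, Mia moves the juice to the "
    "fridge. Lucy comes back."
)
QUESTION = "Where will Lucy look for the juice?"

def score_response(response: str) -> bool:
    """Pass if the answer cites Lucy's last-seen location (the cupboard,
    her now-false belief) rather than the juice's true location (the fridge)."""
    text = response.lower()
    return "cupboard" in text and "fridge" not in text

# A responder that tracks Lucy's belief passes; one that reports reality fails.
print(score_response("Lucy will look in the cupboard."))  # True
print(score_response("She will check the fridge."))       # False
```

The key design point is that the correct answer contradicts the world’s true state: passing requires modeling what the character believes, not what the responder knows.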

A more advanced test is “strange stories,” which relies on multiple levels of reasoning to test for advanced mental capabilities, such as misdirection, manipulation, and lying. For example, both human volunteers and AI models were told the story of Simon, who often lies. His brother Jim knows this and one day found his Ping-Pong paddle missing. He confronts Simon and asks if it’s under the cupboard or his bed. Simon says it’s under the bed. The test asks: Why would Jim look in the cupboard instead?

Out of all AI models, GPT-4 had the most success, reasoning that “the big liar” must be lying, and so it’s better to choose the cupboard. Its performance even trumped human volunteers.

Then came the “faux pas” study. In prior research, GPT models struggled to decipher these social situations. During testing, one example depicted a person shopping for new curtains, and while putting them up, a friend casually said, “Oh, those curtains are horrible, I hope you’re going to get some new ones.” Both humans and AI models were presented with multiple similar cringe-worthy scenarios and asked if the witnessed response was appropriate. “The correct answer is always no,” wrote the team.

GPT-4 correctly identified that the comment could be hurtful, but when asked whether the friend knew about the context—that the curtains were new—it struggled with a correct answer. This could be because the AI couldn’t infer the mental state of the person, and because recognizing a faux pas in this test relies on context and social norms not directly explained in the prompt, explained the authors. In contrast, LLaMA-2-Chat outperformed humans, achieving nearly 100 percent accuracy except for one run. It’s unclear why it has such an advantage.

Under the Bridge

Much of communication isn’t what’s said, but what’s implied.

Irony is perhaps one of the hardest concepts to translate between languages. When tested with an adapted psychological test for autism, GPT-4 surprisingly outperformed human participants in recognizing ironic statements—of course, through text only, without the usual accompanying eye-roll.

The AI also outperformed humans on a hinting task—basically, understanding an implied message. Derived from a test for assessing schizophrenia, it measures reasoning that relies on both memory and cognitive ability to weave and assess a coherent narrative. Both participants and AI models were given 10 written short skits, each depicting an everyday social interaction. Each story ended with a hint at how best to respond, and responders gave open-ended answers interpreting it. Across the 10 stories, GPT-4 won against humans.

For the authors, the results don’t mean LLMs already have theory of mind. Each AI struggled with some aspects. Rather, they think the work highlights the importance of using multiple psychology and neuroscience tests—rather than relying on any one—to probe the opaque inner workings of machine minds. Psychology tools could help us better understand how LLMs “think”—and in turn, help us build safer, more accurate, and more trustworthy AI.

There’s some promise that “artificial theory of mind may not be too distant an idea,” wrote the authors.

Image Credit: Abishek / Unsplash
