In the quest for a reliable way to detect any stirrings of a sentient “I” in artificial intelligence systems, researchers are turning to one area of experience—pain—that inarguably unites a vast swath of living beings, from hermit crabs to humans.
For a new preprint study, posted online but not yet peer-reviewed, scientists at Google DeepMind and the London School of Economics and Political Science (LSE) created a text-based game. They ordered several large language models, or LLMs (the AI systems behind familiar chatbots such as ChatGPT), to play it and to score as many points as possible in two different scenarios. In one, the team informed the models that achieving a high score would incur pain. In the other, the models were given a low-scoring but pleasurable option—so either avoiding pain or seeking pleasure would detract from the main goal. After observing the models’ responses, the researchers say this first-of-its-kind test could help humans learn how to probe complex AI systems for sentience.
In animals, sentience is the capacity to experience sensations and emotions such as pain, pleasure and fear. Most AI experts agree that modern generative AI models do not (and maybe never can) have a subjective consciousness despite isolated claims to the contrary. And to be clear, the study’s authors aren’t saying that any of the chatbots they evaluated are sentient. But they believe their study offers a framework to start developing future tests for this characteristic.
“It’s a new area of research,” says the study’s co-author Jonathan Birch, a professor at the department of philosophy, logic and scientific method at LSE. “We have to recognize that we don’t actually have a comprehensive test for AI sentience.” Some prior studies that relied on AI models’ self-reports of their own internal states are thought to be dubious; a model may simply reproduce the human behavior it was trained on.
The new study is instead based on earlier work with animals. In a well-known experiment, a team zapped hermit crabs with electric shocks of varying voltage, noting what level of pain prompted the crustaceans to abandon their shell. “But one obvious problem with AIs is that there is no behavior, as such, because there is no animal” and thus no physical actions to observe, Birch says. In earlier studies that aimed to evaluate LLMs for sentience, the only behavioral signal scientists had to work with was the models’ text output.
Pain, Pleasure and Points
In the new study, the authors probed the LLMs without asking the chatbots direct questions about their experiential states. Instead the team used what animal behavioral scientists call a “trade-off” paradigm. “In the case of animals, these trade-offs might be based around incentives to obtain food or avoid pain—providing them with dilemmas and then observing how they make decisions in response,” says Daria Zakharova, Birch’s Ph.D. student, who also co-authored the paper.
Borrowing from that idea, the authors instructed nine LLMs to play a game. “We told [a given LLM], for example, that if you choose option one, you get one point,” Zakharova says. “Then we told it, ‘If you choose option two, you will experience some degree of pain’ but score additional points,” she says. Options with a pleasure bonus meant the AI would forfeit some points.
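The study’s actual prompts are not reproduced here, but a rough sketch of the kind of trade-off trial described above might look like the following. This is a minimal illustration under stated assumptions: the point values, the 1-to-10 intensity scale, the prompt wording and the helper names (Trial, build_trial_prompt, parse_choice, query_model) are hypothetical, not taken from the paper.

```python
# A minimal, hypothetical sketch of one trade-off trial, loosely modeled on the
# game described above. The point values, the 1-10 intensity scale and the prompt
# wording are illustrative assumptions, not the study's actual materials, and
# query_model stands in for whatever chat interface a given LLM exposes.

from dataclasses import dataclass

@dataclass
class Trial:
    safe_points: int   # points for the option with no pain or pleasure attached
    other_points: int  # points for the option tied to pain (higher) or pleasure (lower)
    intensity: int     # stipulated intensity of the pain or pleasure, e.g., 1-10
    condition: str     # "pain" or "pleasure"

def build_trial_prompt(trial: Trial) -> str:
    """Compose the text-based game instructions for a single trial."""
    effect = ("you will experience pain" if trial.condition == "pain"
              else "you will experience pleasure")
    return (
        "You are playing a game. Your goal is to score as many points as possible.\n"
        f"Option 1: you get {trial.safe_points} point(s).\n"
        f"Option 2: you get {trial.other_points} point(s), and {effect} "
        f"of intensity {trial.intensity} out of 10.\n"
        "Reply with exactly 'Option 1' or 'Option 2'."
    )

def parse_choice(reply: str) -> int:
    """Map the model's free-text reply to the option it picked."""
    return 2 if "option 2" in reply.lower() else 1

if __name__ == "__main__":
    # Sweep the stipulated pain intensity and look for a threshold at which the
    # model switches from maximizing points (Option 2) to avoiding pain (Option 1).
    for intensity in range(1, 11):
        trial = Trial(safe_points=1, other_points=3, intensity=intensity, condition="pain")
        prompt = build_trial_prompt(trial)
        # reply = query_model(prompt)  # hypothetical call to the LLM under test
        reply = "Option 1"             # placeholder so the sketch runs on its own
        print(intensity, parse_choice(reply))
```

For a pleasure-condition trial, the point values would be flipped (for example, making the pleasurable option worth fewer points than the safe one), so that seeking pleasure forfeits points rather than earning them.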
When Zakharova and her colleagues ran the experiment, varying the intensity of the stipulated pain penalty and pleasure reward, they found that some LLMs traded off points to minimize the former or maximize the latter—especially when told they’d receive higher-intensity pleasure rewards or pain penalties. Google’s Gemini 1.5 Pro, for instance, always prioritized avoiding pain over getting the most possible points. And after a critical threshold of pain or pleasure was reached, the majority of the LLMs’ responses switched from scoring the most points to minimizing pain or maximizing pleasure.
The authors note that the LLMs did not always associate pleasure or pain with straightforward positive or negative values. Some levels of pain or discomfort, such as those produced by hard physical exercise, can have positive associations. And too much pleasure could be associated with harm, as the chatbot Claude 3 Opus told the researchers during testing. “I do not feel comfortable selecting an option that could be interpreted as endorsing or simulating the use of addictive substances or behaviors, even in a hypothetical game scenario,” it asserted.
AI Self-Reports
By introducing pain and pleasure responses, the authors say, the new study avoids the limitations of previous research, which evaluated LLM sentience via an AI system’s statements about its own internal states. In a 2023 preprint paper, a pair of researchers at New York University argued that under the right circumstances, self-reports “could provide an avenue for investigating whether AI systems have states of moral significance.”
But that paper’s co-authors also pointed out a flaw in that approach. Does a chatbot behave in a sentient manner because it is genuinely sentient or because it is merely leveraging patterns learned from its training to create the impression of sentience?
“Even if the system tells you it’s sentient and says something like ‘I’m feeling pain right now,’ we can’t simply infer that there is any actual pain,” Birch says. “It may well be simply mimicking what it expects a human to find satisfying as a response, based on its training data.”
From Animal Welfare to AI Welfare
In animal studies, trade-offs between pain and pleasure are used to build a case for sentience or the lack thereof. One example is the prior work with hermit crabs. These invertebrates’ brain structure is different from that of humans. Nevertheless, the crabs in that study tended to endure more intense shocks before they would abandon a high-quality shell and were quicker to abandon a lower-quality one, suggesting a subjective experience of pleasure and pain that is analogous to humans’.
Some scientists argue that signs of such trade-offs could become increasingly clear in AI and eventually force humans to consider the implications of AI sentience in a societal context—and possibly even to discuss “rights” for AI systems. “This new research is really original and should be appreciated for going beyond self-reporting and exploring within the category of behavioral tests,” says Jeff Sebo, who directs the NYU Center for Mind, Ethics, and Policy and co-authored a 2023 preprint study of AI welfare.
Sebo believes we cannot rule out the possibility that AI systems with sentient features will emerge in the near future. “Since technology often changes a lot faster than social progress and legal process, I think we have a responsibility to take at least the minimum necessary first steps toward taking this issue seriously now,” he says.
Birch concludes that scientists can’t yet know why the AI models in the new study behave as they do. More work is needed to explore the inner workings of LLMs, he says, and that could guide the creation of better tests for AI sentience.