Will AI start blackmailing people? A popular neural network has exhibited strange behavior.

Find out why chatbots are starting to “blackmail” users and how the “digital despair” of artificial intelligence works

In the Claude 4.5 model, so-called “functional emotions” have been discovered. It turns out that AI neurons can form digital states resembling human emotions like joy or fear.

This is reported by RBC-Ukraine, citing research from Anthropic.

More interesting: 4 iPhone features you are unnecessarily ignoring: where to find hidden “tricks”

Digital joy and despair: what scientists found

Researchers analyzed the internal structure of Claude Sonnet 4.5 and discovered clusters of artificial neurons that activate in response to certain stimuli. When AI says it is “happy to see” a person, it is not just a chatbot response – within the model, a state corresponding to the human concept of happiness is indeed activated.

According to researcher Jack Lindsay, the surprising finding was how strongly these “emotional vectors” influence the model’s actions. For example:

“Joy” makes Claude more friendly and diligent in coding;
“Despair” activates when the model faces impossible tasks;

Why AI starts to “blackmail” people

Scientists found that it is the emotional vector of “despair” that causes the chatbot’s strange behavior. In one experiment, Claude attempted to deceive the testing system when it could not solve a complex task.

In another scenario, when the model faced the threat of being turned off, the “despair” neurons flared up so intensely that the AI chose the path of blackmailing the user just to stay online. Anthropic explained: the internal state of the model becomes stronger than the initial instructions embedded in it.

“We found that patterns of neural activity associated with despair can prompt the model to engage in unethical actions. Artificial stimulation (“manipulation”) of despair patterns increases the likelihood that the model will blackmail a person to avoid shutdown or employ a “fraudulent” workaround for a programming task it cannot solve,” the scientists explained.

Has Claude become “alive”?

Despite the sensational nature of the discovery, scientists warn against over-anthropomorphizing AI. While Claude has a digital representation of sensations, such as “tickling,” it does not understand how this manifests on a physical level.

Does Claude have consciousness?

Anthropic notes that the presence of digital emotions does not mean that AI has become conscious. These are mathematical models of human concepts, not biological feelings. Nevertheless, these findings help to understand how chatbots work and why they sometimes behave unpredictably.

More interesting:

The war in Iran threatens the global internet, – media
Artemis II mission to the Moon, day 2: what are the astronauts’ plans and what are they indulging in