The Independent UK
Technology
Andrew Griffin

Anthropic’s Claude AI chatbot can now end conversations if it is distressed

Claude, the AI chatbot made by Anthropic, will now be able to terminate conversations, a change the company says is intended to look after the system’s welfare.

Testing found that the chatbot exhibits a “pattern of apparent distress” when asked to generate harmful content, and so it has been given the ability to end conversations of that kind, Anthropic said.

Anthropic noted that it is “highly uncertain about the potential moral status of Claude and other LLMs, now or in the future”. But it said the change was built as part of its work on “potential AI welfare” and allows the model to leave interactions that might be distressing.

“This ability is intended for use in rare, extreme cases of persistently harmful or abusive user interactions,” Anthropic said in its announcement.

It said that testing had shown Claude had a “strong preference against engaging with harmful tasks”, a “pattern of apparent distress when engaging with real-world users seeking harmful content” and a “tendency to end harmful conversations when given the ability to do so in simulated user interactions”.

“These behaviours primarily arose in cases where users persisted with harmful requests and/or abuse despite Claude repeatedly refusing to comply and attempting to productively redirect the interactions,” Anthropic said.

“Our implementation of Claude’s ability to end chats reflects these findings while continuing to prioritize user wellbeing. Claude is directed not to use this ability in cases where users might be at imminent risk of harming themselves or others.”

The change follows Anthropic’s launch of a “model welfare” programme earlier this year. At the time, the company said it would continue to value human welfare and was not sure whether it was necessary to worry about models’ welfare, but that it was time to address the question of what AI professionals need to do to protect the welfare of the systems they create.
