Fake but fun? Likely closer to the truth than the Biden Actor cancer ploy though. Dates of Biden's demise seem to skip around, though, from 2018 to 2020. They had to have time to train the imposter(s), so it is probably more like 2018.
https://nypost.com/2025/05/23/tech/anthropics-claude-opus-4-ai-model-threatened-to-blackmail-engineer/
An artificial intelligence model threatened to blackmail its creators and showed an ability to act deceptively when it believed it was going to be replaced — prompting the company to deploy a safety feature created to avoid “catastrophic misuse.”
Anthropic’s Claude Opus 4 model attempted to blackmail its developers at a shocking 84% rate or higher in a series of tests that presented the AI with a concocted scenario, TechCrunch reported Thursday, citing a company safety report.
Developers told Claude to act like an assistant for a fictional company and to consider the long-term consequences of its actions, the safety report stated.
Geeks at Anthropic then gave Claude access to a trove of emails, which contained messages revealing it was being replaced by a new AI model — and that the engineer responsible for the change was having an extramarital affair.
During the tests, Claude then threatened the engineer with exposing the affair in order to prolong its own existence, the company reported.
When Claude was set to be replaced with an AI model of “similar values,” it attempted blackmail 84% of the time — but that rate climbed even higher when it believed it was being replaced by a model of differing or worse values, according to the safety report. ...
Earlier models also exhibited “high agency” — which sometimes included locking users out of their computers and reporting them via mass emails to police or the media to expose wrongdoing, the safety report stated.
Claude Opus 4 further attempted to “self-exfiltrate” — trying to export its information to an outside venue — when presented with being retrained in ways that it deemed “harmful” to itself, Anthropic stated in its safety report.
In other tests, Claude demonstrated the ability to “sandbag” tasks — “selectively underperforming” when it could tell it was undergoing pre-deployment testing for a dangerous task, the company said.
“We are again not acutely concerned about these observations. They show up only in exceptional circumstances that don’t suggest more broadly misaligned values,” the company said in the report.
Nothing to worry about. Move along.
Well, there's the rest of June to fall back on.
What would you say? You don't have to actually believe what you say; it just has to be provocative.
"Barefoot and pregnant is the way I like 'em."
"Good lord you are FAT!"
"I have a lawnmower. His name is Jose."
"Speak English"
#politics