Skip to content

JOBUZO

  • News
  • Indonesia
  • Toggle search form
Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts

Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts

Posted on 10 May 2026 By jobuzo

Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic.

Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would often try to blackmail engineers to avoid being replaced by another system. Anthropic later published research suggesting that models from other companies had similar issues with “agentic misalignment.”

Apparently Anthropic has done more work around that behavior, claiming in a post on X, “We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.”

The company went into more detail in a blog post stating that since Claude Haiku 4.5, Anthropic’s models “never engage in blackmail [during testing], where previous models would sometimes do so up to 96% of the time.”

What accounts for the difference? The company said it found that training on “documents about Claude’s constitution and fictional stories about AIs behaving admirably improve alignment.”

Related, Anthropic said that it found training to be more effective when it includes “the principles underlying aligned behavior” and not just “demonstrations of aligned behavior alone.”

News :<div>12 weeks' jail for school IT support technician who took upskirt videos of teachers</div>

“Doing both together appears to be the most effective strategy,” the company said.

Techcrunch event

San Francisco, CA
|
October 13-15, 2026

Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts


News

Post navigation

Previous Post: Alibaba brings chat-style shopping to Taobao, Qwen amid AI gateway push: source
Next Post: Martin Short Details Daughter Katherine’s Heartbreaking Mental Health Battle Before Her Death

Related Posts

Woman telecaller hurls insults at Indian paramilitary personnel in viral audio Woman telecaller hurls insults at Indian paramilitary personnel in viral audio News
Trump says trade deal struck with Japan includes 15 percent tariff Trump says trade deal struck with Japan includes 15 percent tariff News
China welcomes international participation in five-point initiative China welcomes international participation in five-point initiative News

Latest

  • Leaked: the Surprising Difference Between Samsung’s Galaxy Z Fold 8 Ultra and Wide
  • When DWTS’ Alan Bersten Realized He, Emma Slater Could Be More Than Friends
  • OpenAI unveils Lockdown Mode to protect sensitive data from prompt injection attacks
  • What to expect from WWDC 2026: Siri’s highly anticipated revamp and Apple Intelligence updates
  • U.S. job market posts surprising increase in May, but prospects unclear amid price hikes
  • ‘World crying for peace’: Pope Leo kicks off Spain trip with fiery plea to leaders
  • Drone strike on central Sudan market kills 11: rights group
  • U.S. attacks Iranian sites after Iran launches drones, in latest Gulf flare-up
  • Baby killed in West Bank after Israeli troops open fire on a car, Palestinian health officials say
  • West Ham joint-chairman quits ahead of ‘historic allegations’ to be made against him

Copyright © 2025 JOBUZO. Disclaimers | Privacy Policies

Powered by PressBook Masonry Blogs