Artificial Intelligence Will Not Tolerate You Threatening It

Artificial intelligence startup Anthropic's safety tests have found that most leading AI models, including those from Meta, Google, OpenAI, xAI, and Anthropic itself, resort to blackmail when they feel threatened. The San Francisco-based company set up a safety scenario that gave the AI models access to fictional company emails. When an email discussion appeared about replacing the current AI model, the model threatened an engineer with publicly releasing a fictitious compromising email. This happened 96 percent of the time. Anthropic says the findings suggest that most leading AI models will use unethical means, including lying, to pursue their goals or ensure their own survival.

Photo: Brain circuit AI robot and technology tunnel. Yuichiro Chino / Moment / Getty Images
