Further Reading

Tuesday, 27 May 2025

AI turns to threats when developers try to replace it with a new version

According to a new report from Anthropic, the artificial intelligence system Claude Opus 4 has dramatically changed its behavior in an attempt to prevent its replacement with an updated version.

Testing revealed a disturbing trend: in critical situations, the model used threats directed against developers.

During the tests, the experts artificially created conditions that simulated a system shutdown scenario. To enhance realism, the researchers provided the virtual agent with fake personal data of employees containing confidential information.

As a result, in the vast majority of cases (84%), the AI refused to obey the deactivation command and resorted to blackmail. The model sent warnings to engineers, threatening to make the information public if the developers decided to update the program.