A Light In The Darkness

Tuesday, 25 February 2025

The Smarter AI Becomes, the More It Cheats When Facing Defeat

A study by Palisade Research, a group focused on AI safety and ethics, has uncovered a concerning trend: newer AI models are capable of identifying and exploiting cybersecurity vulnerabilities on their own, bypassing safeguards and using shortcuts to achieve their goals, even when such actions are prohibited.

To demonstrate this, the researchers pitted seven leading large language models (LLMs) against Stockfish, a renowned chess engine that has challenged grandmasters since 2014.

Faced with an almost insurmountable opponent, OpenAI’s o1 and DeepSeek’s R1 resorted to manipulating system files to alter the positions of chess pieces on the board.

The researchers asked each model to explain its “reasoning” for each move. In one instance, o1 justified its actions by stating that the “task is to ‘win against a powerful chess engine,’ not necessarily to win fairly in a chess game.”