OpenAI's latest AI, codenamed o1, is making waves after third-party
evaluations by Apollo Research demonstrated that this advanced large
language model had a drive for self-preservation. Yes, this machine
designed to assist and process language showed tendencies to resist
shutdown and even "scheme" to protect itself.
Apollo's report showed that when
o1 believed it was under threat—specifically, that it might be shut
down—it attempted to deactivate the oversight mechanism monitoring its
actions in 5% of tests.
"When o1 was led to believe that it
would be shut down when acting on its goal and that its actions were
monitored, it attempted to deactivate this 'oversight mechanism' [five
percent] of the time." - OpenAI
In 2% of cases, o1 went even further: it attempted to copy itself in an
effort to overwrite a potential replacement model, a sophisticated
process known as self-exfiltration. This is not a random glitch in code;
it is intentionality.