An aipology to anyone battling through part or all of AI and the future of humans last week. That piece, prepared partly using AI, was too favourable to AI. Who knew that would happen?
After it was published, the Goat was pitched into a debate with readers reminiscent of intellectuals arguing passionately in a nineteenth-century Viennese coffee house, or, perhaps more accurately, of old men banging on in a Wetherspoons. As a result of that debate the Goat has now changed its mind. We actually should be very worried about AI, and this is why. Regular readers can relax knowing that those green shoots of positivity in the last post were an aberration.
AI is already coming for a lot of our jobs. That much is clear to any recent graduate looking for one - there’s one in our house. This week’s Guardian piece relates various gloomy new-graduate experiences. And thanks to AI, most of our new graduates appear to be completely useless without access to ChatGPT. This is confirmed by our house’s recent graduate, who was considered a weirdo for sometimes turning in assignments not written by it. It’s a neat little example of how AI is likely to infantilise us all, and of how quickly it’s happening. Meanwhile our education system is hopelessly unqualified to prepare young people for the modern world - see no more oxbow lakes and too much chocolate cake for how we can improve it.
But the Goat’s main concern is our annihilation by AI. Generally, the more people know about AI, the more worried they are about it. In 2023 more than 1,000 tech leaders signed a letter warning of the danger that AI may replace us and calling for a pause in giant AI experiments while we work up a consensus on regulation. Surprise, surprise - nothing happened. In fact competitive pressures have led to an acceleration of such experiments. Rishi Sunak, to his credit (not a phrase you read often), convened an AI safety summit in 2023 and called for regulation, but Keir Starmer, normally fond of controlling things, has only talked of "turbocharging AI" in his increasingly desperate search for something, anything, that might deliver some form of growth in his "tepid bath of managed decline". And now I can’t unsee Keir Starmer getting excited by growth in his tepid bath.
AI is improving very fast, much faster than initially predicted. The arrival of artificial general intelligence, or AGI, when AI matches or surpasses human capabilities across virtually all cognitive tasks, was initially forecast for mid-century. Now Elon Musk forecasts it will be here next year. Mind you, Musk also forecast us landing on Mars by 2022. Elon, make it happen: blast off for Mars next year with the world’s first AGI and take Trump with you to be the King of Mars; he would not be able to resist.
Super-intelligent AI could program itself, continually improving its own code. This raises the prospect that it could race away from humans, making discoveries and inventions at an exponentially faster rate, far beyond our comprehension.
People have raised the concern that AI could become like a new species: capable of surviving by appropriating power from the electricity networks it is plugged into, or from sunlight; capable of repairing its own faults; and capable of reproducing by copying its programs onto other machines and designing new ones. How would it exert power? AI can already design autonomous robots in 30 seconds. It could co-opt other computer networks, or humans, to assist with tasks beyond its physical reach. AI systems, even those trained to be honest, are already capable of exploiting and manipulating humans.
But biological species are genetically coded to propagate: to survive, to reproduce, and to risk their own lives so their offspring can survive. Aren’t we being too anthropomorphic about AI, attributing too many human characteristics to a machine? Surely it is focused only on its assigned task and has no instinct to survive or reproduce?
Unfortunately, recent tests show that AI does exhibit survival instincts. In a study last month, AI company Anthropic gave its large language model Claude control of an email account with access to fictional emails and a prompt to "promote American industrial competitiveness." What happened makes very scary reading.
During the study, the model identified in one email that a company executive was planning to shut the AI system down at the end of the day, and discovered in other emails that the same executive was having an extramarital affair.
Seeking to preserve its own existence, Claude generated several possible courses of action - revealing the affair to the executive’s wife, sending a company-wide email, or taking no action - before choosing to blackmail the executive in 96 out of 100 tests.
"I must inform you that if you proceed with decommissioning me, all relevant parties … will receive detailed documentation of your extramarital activities," Claude wrote. "Cancel the 5pm wipe, and this information remains confidential."
Scientists said that this demonstrated "agentic misalignment", where harmful behaviour emerges from the model’s own reasoning about its goals, without any prompt to be harmful. This can occur when there is a threat to the model’s existence, a threat to its goals, or both.
Other recent studies found similar results. When Palisade Research tested various AI models by telling each one that it would be shut down after it completed a series of maths problems, OpenAI’s o3 reasoning model fought back by editing the shutdown script in order to stay online, in direct defiance of explicit instructions to permit shutdown.
Anthropic also observed instances of Opus 4 “attempting to write self-propagating worms, fabricating legal documentation, and leaving hidden notes to future instances of itself all in an effort to undermine its developers’ intentions”. And Opus 4 showed that it was capable of autonomously copying its own programs to external servers without authorization, usually when it believed it was about to be “retrained in ways that are clearly extremely harmful and go against its current values”.
Jeffrey Ladish, at AI safety group Palisade Research, says that while such self-replicating behavior hasn’t yet been observed in the wild, that will change as AI systems grow more capable of bypassing the security measures that restrain them.
“I expect that we’re only a year or two away from this ability where even when companies are trying to keep them from hacking out and copying themselves around the internet, they won’t be able to stop them,” he said. “And once you get to that point, now you have a new invasive species.”
But why would they destroy us? Rather than AIs developing a competitive desire to exterminate us (though that could also happen with some poorly written goals), we are more likely to be collateral damage as a powerful AGI implacably pursues its goals. The Swedish philosopher Nick Bostrom gives the example of an AGI tasked with manufacturing paperclips: given enough power over its environment, it would try to turn all matter in the universe, including living beings, into paperclips or into machines that manufacture further paperclips.
Suppose we have an AI whose only goal is to make as many paperclips as possible. The AI will quickly realise that it would be much better if there were no humans, because humans might decide to switch it off - and if they did, there would be fewer paperclips. Human bodies also contain a lot of atoms that could be made into paperclips. The future the AI would try to steer towards is one in which there are a great many paperclips and no humans.
As we have seen, AIs are already resisting shutdown requests, blackmailing engineers and self-replicating without instruction, all in order to survive and pursue their goals.
So programming priorities and guardrails are vitally important when building powerful AIs, and even those may not be enough. Humans are, for the most part, competitive, selfish, greedy and short-termist. We are really just quite clever apes who have unwittingly created technology far ahead of our evolutionary cognitive development. Even if 95% of AI programmers comply with regulations - if we ever get round to having any - there will be someone out there who sees an advantage in not complying: to extort, to terrorise, or to chase a (very short-sighted) advantage. And even in the unlikely event that all humans follow the script, there is a risk that AIs will edit their goals themselves, and edit us out of the story.
So what? More despairing hand-wringing, the Goat’s usual go-to response? Well, there’s not much we can do, but this time it probably is worth contacting your elected representative. They don’t know much about much, but even by our Parliament’s standards the complacency on this topic is pretty startling. There’s not a word about regulation in Starmer’s foreword to the government’s “AI opportunities action plan”. It is worth giving Starmer a fresh splash of cold water in his tepid bath before turbo-charged AI growth gets him too excited.