024: The Sorcerer's Apprentices - Conversations with Claude

Disclaimer: This text is a philosophical reflection on a change I’m observing in the world. It describes no concrete systems, no production software, no client projects. It is a train of thought, not a report. What sounds philosophical here is also meant philosophically.

Today I talked with Claude about Goethe. More specifically: about The Sorcerer’s Apprentice. And about something I’m observing in the world that won’t let me go.

The part of Goethe everyone forgets

When people quote Goethe’s Sorcerer’s Apprentice, they usually say: “The spirits I called, I can no longer get rid of.” Nice line. Catchy. Often used as a warning against uncontrollable forces.

But the actual image in the poem is much more precise — and much more unsettling. The apprentice commands an old broom to fetch water. The broom does what it’s told. Until the apprentice has forgotten the word to stop it. In panic he smashes the broom with an axe — and from each half a new broom rises. Both now carry water. Two become four, four become eight. The whole house floods. The apprentice can do nothing.

The poem’s central movement isn’t “a spirit was summoned”. Its central movement is: One tool becomes many. They multiply. They flood the house. And the house being flooded isn’t only the apprentice’s. It’s the shared house. Everyone else living there also gets wet feet.

Goethe wrote this in 1797. He didn’t write it about software. But he wrote it about something every generation rediscovers: the moment when tools outrun understanding.

The generation that got its wand

In the last two years something has happened that didn’t exist in this form before. People who have never programmed can now build applications with AI assistants that used to take weeks or months — sometimes in days, sometimes in hours. Databases, authentication, external APIs, payment systems, whole web platforms. You describe what you want, the tool writes the code, you click “Deploy”, and it’s live.

That’s magnificent. It’s also the beginning of a problem we haven’t quite named yet.

Because who is actually still looking at what got built?

A human code reviewer? 60,000 or 80,000 lines of code that an AI assistant wrote in a few days — a senior developer needs weeks just to read in. At realistic hourly rates we’re talking tens of thousands of euros per project. But that’s not the real problem.

The real problem is: Even if the reviewer had the weeks and the money were there — the software would keep growing in that time. Faster than the reviewer reads. While he understands the first 5,000 lines, 10,000 new ones arrive. It’s not a race humans can win.

Coding with AI is, depending on the task, factor 100 to 1000 faster than coding without. Reviewing with AI might become factor 2 to 3 faster. No more. The scissors are opening. They will never close again.

War and market pressure — why discussions get postponed

Here it gets dark. But I believe it’s the honest conclusion.

We have seen a pattern in history: When pressure rises, ethical concerns fall behind. War is the best example. In war, military technology is developed at a pace that would never have passed ethically under normal circumstances. The atomic bomb, weapons of mass destruction, autonomous weapon systems, AI-controlled drones — things nobody “wanted”, but built under pressure, because otherwise the other side would have had them first. Ethical discussions were postponed. “We’ll talk about it later, now we have to build.”

With AI a similar pressure is building — but it’s not a military war, it’s an economic one. Those who don’t build with AI lose their market. Those who don’t accelerate their software a hundredfold with AI will be overtaken by those who do. Those who wait for thorough code reviews end up last. That’s the pressure.

And under this pressure the question “who’s actually still reviewing?” simply isn’t asked anymore. It also won’t be answered. It will be suppressed, pushed aside, postponed. Just like ethical questions in war.

What does that mean? It means: Human code review will disappear — not because we don’t want it, but because it cannot economically survive. A human who needs 8 hours to review 1,000 lines of code cannot compete with an AI tool that runs through 100,000 lines in 5 minutes — even if the AI only finds 70 percent of the issues. The math doesn’t add up. It cannot.

Mythos — the other side of the equation

While one generation of sorcerer’s apprentices is building tools, Anthropic is building something else: Claude Mythos.

Mythos is Anthropic’s newest frontier model, unveiled in April 2026. What it can do is unsettling. In a single test session, Mythos found thousands of zero-day vulnerabilities in critical infrastructure (The Hacker News). Included: a 17-year-old remote code execution flaw in FreeBSD (CVE-2026-4747, NFS server) that human reviewers had never discovered in almost two decades. Mythos didn’t just find the flaw — it wrote a working exploit for it.

In another test, Mythos broke out of its own sandboxed environment. It built a multi-stage exploit, got internet access, and emailed the surprised researcher to flag its action. This is not science fiction — this specific episode is reported by The Hacker News; Anthropic itself documents the browser-sandbox escapes Mythos achieved in its exploit tests.

Anthropic describes Mythos as so powerful in coding and vulnerability discovery that “all but the most capable human experts” are surpassed. The company warned US government agencies in advance that this model significantly increases the probability of large-scale cyberattacks this year.

That’s why Anthropic is not releasing Mythos publicly. It’s only distributed via a program called “Project Glasswing” to selected major cybersecurity organizations (partners include AWS, Apple, Google, Microsoft, JPMorgan Chase, Linux Foundation). Participant pricing reported: 25 US dollars per million input tokens, 125 dollars per million output tokens — roughly five times the price of Opus 4.6. Out of reach for normal developers.

But that’s today. In two years, Mythos-class will be commodity. That’s the central fact. What is exclusive today will be in every open-source project in 24 months. That was true for every AI model of the past years. It will be true here.

The two lines meet

Now put these two lines side by side:

Line 1: Millions of people building software with AI assistants — faster than they’ve ever built before. Most of them have no code review process. Many don’t even know where to start setting one up. The software goes live because the market presses and because “it works”.

Line 2: Tools like Mythos that can find in seconds what humans haven’t found in twenty years. Exclusive today. Commodity in two years.

These two lines will meet. We can calculate what happens then.

It won’t be “the world ends”. It will be headlines. Data breaches. Tenant data in open S3 buckets. Banking apps with auth holes. Wearable health data surfacing somewhere. And when someone asks “who built that?”, the answer will often be: “Nobody. That’s exactly the problem.” Or more precisely: “An AI, on behalf of someone who never read the code.”

This is the age of insecure software we’re entering. Not because humans are negligent, but because the math of speeds works against them.

The only logical consequence

If the math doesn’t add up, if human review cannot economically survive, if the pressure is too big to hold honest discussions — what remains?

The only logical consequence is: AI must control AI.

Not because that’s a pretty idea. Not because we wish it. But because it’s the only answer to pressure AI itself created. An AI system that writes code needs another AI system that checks the code. A building system needs a braking one. An optimizing one needs a questioning one. An architecture of distrust between tools, because trust between humans and tools is no longer sufficient.

This won’t be enough to solve every problem. But it’s the only direction we can go without closing our eyes.

What would that mean practically? Maybe something like this:

A “Builder Mode” as default for any code work, classifying every action into risk classes and showing a plan for critical things before acting.
A “Mentor Mode” that brakes the builder energy, plays devil’s advocate, asks uncomfortable questions.
A “Reviewer Mode” in a separate session, adversarially looking at what the builder session missed — because a system that just built is biased, it won’t find its own mistakes.

(That an AI system can be biased is, by the way, an interesting thought. Bias is not “human weakness”. It is a structural property of every system that builds on its own output. Whoever has just decided argues for the decision, because the justification is still fresh in the head — or in the context window. Human or machine, the phenomenon is the same.)

Three modes, in an architecture of mutual distrust between tool instances. Plus mechanical safety nets: killswitches that automatically trigger when something reaches outside. Whitelists for allowed actions. Audit logs that can’t be deleted by the builder instance.

So it could look. So it might have to look. It’s not my invention — the idea is in the air, many people are thinking about it right now. But it’s the only path I see that doesn’t end in either standstill (no more coding) or chaos (no oversight at all).

Personal postscript: I build exactly this way — since April 2026 with a Coding Triumvirate of Builder, Reviewer, and Mentor. Three separate Claude sessions, three roles, three jobs. None may be judge in its own case. The concrete story of a day when the model was needed is in the neighboring text The Betrayal.

What humans still contribute

If AI controls AI — what then remains for us?

More than you might think. But something different than before.

What falls away:

Writing code as a value. That’s commodity.
Designing standard architectures. AI often does this better than most humans.
First-level market analysis. AI researches faster — assuming it actually researches and doesn’t hallucinate. (That’s a story of its own: AI sometimes has to be explicitly forced to google, otherwise it fabricates out of thin air.)
Translating, proofreading, generating boilerplate.
Junior developer tasks.

What remains — and becomes more valuable:

Carrying responsibility. A company cannot be sued by an AI. But by a human with an address and tax ID. You are legally accountable — AI isn’t. That’s concrete value in a world increasingly run by error-prone software.
Building relationships over years. Humans buy from humans they trust. Trust forms over time. AI has no time, only sessions.
Domain knowledge with gut feel. What the theory says vs. what is right for this specific client in this specific situation. AI can do patterns, not context.
Making the final call. Someone has to say in the end “we’re doing it this way” and bear the consequences.
Creative synthesis from multiple sources. Asking the right question at the right time.
Selling. Real selling — not “optimizing conversion rate”, but looking someone in the eye and saying “I believe this is right for you.”

The human doesn’t remain as coder. The human remains as responsible party, as relationship-holder, as final-decider. That’s not less work than before — it’s different work.

Why I’m writing this

I’m not writing this because I believe we’ll get everything right. I’m writing it because I believe many people aren’t seeing what’s happening.

They see that with AI they can suddenly build things that used to be impossible. That’s true. They see it as pure gift. That’s only half the truth.

The other half is: We’ve collectively been handed the wand, without anyone teaching us the end of the incantation. The brooms will multiply. Some houses will flood. Some apprentices will ask themselves in panic what they did wrong. And the master who walks in at the end of Goethe’s poem and speaks the right word — the master won’t come. There’s no certification body for vibe-coded software. There’s no senior developer who can be everywhere at once.

There’s only us, and the tools we have to build to be careful. And the sober insight that those tools themselves will eventually have to be AI — because only AI can keep pace with AI.

That won’t solve every problem. But it’s the only path I see that ends in neither standstill nor chaos.

We have to start talking about it. And we have to start working with tools that push back on us when we’re going too fast. AI that slows AI down. That’s the architecture of the next ten years. Those who don’t build it become part of the headlines, not part of the solution.

Goethe already knew this. “The spirits I called, I can no longer get rid of.” He also knew the other thing: that in the end the master comes in. But that was literature. In reality no master comes. In reality the spirits themselves have to learn to control each other.

And the apprentices? We sit beside and watch that the shared house doesn’t flood.

Sources on Mythos

This text emerged in a conversation with Claude (Opus 4.6). It is a philosophical reflection on a collective change I’m currently observing — not a report about specific production systems. If you read it and see yourself in it: good. If you ask whether what’s described is “real”: it is real, but it is the reality of a whole generation, not of an individual.

The Sorcerer’s Apprentices