In March 2026, Andrej Karpathy released a project with roughly 30 lines of code. No research paper. No benchmarks. Just a simple agent that edits its own training code, runs a five-minute experiment, checks whether the model improved, and then does it again. He let it run for two days. It ran 700 experiments and found around 20 improvements that stack on top of each other. Training time for a GPT-2-sized model dropped from about two hours to roughly 1.8 hours. An 11% speedup. From a script that fits in a text message.
Two months later, he joined Anthropic to do this at scale.
What "AI training itself" actually means
The phrase sounds futuristic. It is not. The mechanism is almost boring in its simplicity.
An AI agent edits a piece of training code. It makes a small hypothesis: what if we change this part? Then it runs a training experiment, measuring some objective metric at the end. If the metric improved, the change is committed. If it got worse, the change is reverted. Then the loop starts again. No human in the middle. No weekly meeting to review results. Just iteration.
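The commit-or-revert cycle described above can be sketched in a few lines of Python. This is a toy illustration, not Karpathy's code: `run_experiment`, `propose_change`, and the `lr` setting are hypothetical stand-ins. In the real project the agent edits training code and the experiment is an actual training run.

```python
import random


def run_experiment(config):
    """Stand-in for a short training run; returns a loss (lower is better).

    Hypothetical toy objective: the loss is smallest when lr is near 0.3.
    """
    return (config["lr"] - 0.3) ** 2 + 1.0


def propose_change(config, rng):
    """Stand-in for the agent's edit: nudge one setting by a small amount."""
    candidate = dict(config)
    candidate["lr"] = config["lr"] + rng.uniform(-0.05, 0.05)
    return candidate


def research_loop(config, n_experiments, seed=0):
    """Propose, measure, then commit or revert, n_experiments times."""
    rng = random.Random(seed)
    best_loss = run_experiment(config)
    history = []
    for _ in range(n_experiments):
        candidate = propose_change(config, rng)
        loss = run_experiment(candidate)
        if loss < best_loss:
            # Metric improved: commit the change and build on it.
            config, best_loss = candidate, loss
            history.append(("commit", loss))
        else:
            # Metric got worse or stayed flat: revert by keeping the old config.
            history.append(("revert", loss))
    return config, best_loss, history


if __name__ == "__main__":
    final, best, history = research_loop({"lr": 0.1}, n_experiments=200)
    commits = sum(1 for kind, _ in history if kind == "commit")
    print(f"{commits} of {len(history)} changes committed, best loss {best:.4f}")
```

Because every accepted change becomes the new baseline, improvements stack: each later experiment is judged against the best result so far, not against the starting point.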
This is what machine learning researchers already do manually. They propose a change, run an experiment, evaluate the result, move on. Karpathy's insight was that the loop itself can be automated. The agent does not need to understand what it's doing in any deep sense. It just needs a reliable metric to judge progress against.
Karpathy called his own first result mildly surprising. For someone who co-founded OpenAI and directed AI at Tesla, "mildly surprising" carries some weight.
The Karpathy Loop: how Auto Research works
Auto Research was intentionally stripped down. It was built on Karpathy's own nanoGPT training stack, something you could run on a home computer. The design choices are worth paying attention to because they reveal the philosophy.
The experiment time limit was set by wall clock, not by compute budget. Five minutes, then the run is cut off. This matters because it forces the agent to work with the constraint that every human ML researcher actually faces: time. You do not have infinite patience for an experiment that might be going nowhere.
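A wall-clock limit of this kind might look like the sketch below. The function name, the `train_step` callback, and the injectable `clock` parameter are assumptions made for illustration, not details from the project.

```python
import time


def train_with_wall_clock_budget(train_step, budget_seconds, clock=time.monotonic):
    """Call train_step repeatedly until the wall-clock budget is spent.

    Returns the number of completed steps. The budget is elapsed time, not a
    step count or a compute count, so a change that makes each step slower
    simply gets fewer steps before the cutoff.
    """
    deadline = clock() + budget_seconds
    steps = 0
    # The deadline is checked between steps, so a step already in progress
    # is allowed to finish and may overshoot the budget slightly.
    while clock() < deadline:
        train_step()
        steps += 1
    return steps


if __name__ == "__main__":
    # Tiny demo with a 0.05-second budget; a five-minute run would pass 5 * 60.
    done = train_with_wall_clock_budget(lambda: time.sleep(0.01), 0.05)
    print(f"completed {done} steps before the cutoff")
```

Using a monotonic clock rather than the system time keeps the cutoff stable if the machine's clock is adjusted mid-run.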
The metric was validation loss, or another concrete objective. Not a qualitative judgment. Not a vibe check. A number that is either smaller or larger than it was before the change.
Karpathy described a potential extension where many people run small versions of this agent distributed across home computers, all feeding results into one central pool. An agentic swarm doing science. But that is the extension. The core is simpler: one agent, one loop, one metric, running continuously.
Fortune magazine described it as the Karpathy Loop, which is a useful phrase because it captures the recursive structure. The loop is not just a process. It's the point.
What 700 experiments in 2 days actually produced
The numbers from Karpathy's initial run are concrete and worth keeping in focus.
700 experiments over approximately two days. About 20 of those produced improvements that could be stacked, meaning each one added value on top of the others rather than being mutually exclusive. The combined effect was an 11% speedup in training a GPT-2-scale model, from around 2 hours down to roughly 1.8 hours.
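The rounded figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above. The one added assumption, which is not from the source, is that the 20 improvements each shrink training time by the same factor and that those factors multiply.

```python
baseline_hours = 2.0   # "around 2 hours"
improved_hours = 1.8   # "roughly 1.8 hours"
n_improvements = 20    # "about 20" stackable improvements

# Two different ways to express the same change.
time_reduction = 1 - improved_hours / baseline_hours  # share of time saved
speedup = baseline_hours / improved_hours - 1         # extra work per hour

# Assumption (not from the source): equal, multiplicative improvements.
per_step_factor = (improved_hours / baseline_hours) ** (1 / n_improvements)
per_step_reduction = 1 - per_step_factor

print(f"time reduction: {time_reduction:.1%}")
print(f"speedup: {speedup:.1%}")
print(f"average reduction per improvement: {per_step_reduction:.2%}")
```

With these rounded inputs, the time saved is about 10% and the speedup is about 11%, so the quoted 11% matches the speedup reading. Under the equal-factor assumption each individual improvement is worth only about half a percent, which shows how small the individual wins are compared with the stacked result.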
11% does not sound like a revolution. In this context it is not supposed to. The point is not the magnitude of the single result. The point is what produced it. A human researcher running experiments one at a time, taking notes, iterating across days or weeks, might find a handful of improvements over months. This agent found 20 in 48 hours on a home computer. The question is what the same loop looks like on the compute infrastructure of a frontier AI lab, running on models the size of Claude.
For context: Google DeepMind's AlphaEvolve, a different kind of AI-driven optimization system, produced improvements to Gemini's training and to Google's data center orchestration that were sometimes below 1%. Those improvements are still running. At the scale of Google's infrastructure, fractions of a percent become millions of dollars saved.
A 5% or 10% gain in pre-training efficiency at Anthropic's current spending levels is not academic.
Why pre-training is where this bet pays off
Karpathy joined Anthropic's pre-training team. He reports to Nick Joseph, Anthropic's head of pre-training. He was not placed in a general research role or a public-facing position. He went straight into the part of the process where the most money is spent and where the most consequential decisions get made.
Pre-training is the stage where a model is trained on vast amounts of data before any fine-tuning or alignment work begins. The decisions made in pre-training, about architecture, data mix, synthetic data, evaluation signals, and dozens of other variables, cascade through everything that follows. Small choices made early have large downstream effects on the final model.
This is also where AI labs spend enormous amounts of money. Runs cost tens to hundreds of millions of dollars. There is no cheap way to verify that a pre-training approach is working until the run is substantially complete. Anything that lets a lab find better approaches faster, or confirm bad approaches sooner, has compounding value.
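To make the scale concrete, here is a rough back-of-the-envelope sketch. The run costs are illustrative points picked from the "tens to hundreds of millions" range, not reported figures, and the model is deliberately simple: it treats an efficiency gain as the share of compute spend that is no longer needed. The function name `compute_savings` is hypothetical.

```python
def compute_savings(run_cost_usd, efficiency_gain):
    """Dollars saved if the same result needs `efficiency_gain` less compute.

    Simplifying assumption: savings scale linearly with the gain.
    """
    return run_cost_usd * efficiency_gain


if __name__ == "__main__":
    # Illustrative run costs only; not figures from any lab.
    for run_cost in (10e6, 100e6):
        for gain in (0.01, 0.05, 0.10):
            saved = compute_savings(run_cost, gain)
            print(f"${run_cost / 1e6:.0f}M run, {gain:.0%} gain: ${saved / 1e6:.1f}M saved")
```

Even under this crude model, a single-digit percentage gain on a run at the upper end of the range is measured in millions of dollars per run.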
Anthropic is also scaling compute aggressively. They have existing commitments with Google Cloud and with Colossus (SpaceX/xAI infrastructure), and as of May 2026 appear to be in discussions with Microsoft as well. When you are preparing to spend significantly more on compute, the efficiency of how you use it matters proportionally more.
Jack Clark's 2028 forecast and what it implies
Jack Clark, one of Anthropic's co-founders, published a forecast placing the probability of fully automated AI research and development at over 60% by the end of 2028. Not by a distant future date. In approximately two and a half years.
He was not claiming this is guaranteed. He was saying it is the most likely outcome. That is a striking thing for a co-founder of one of the most prominent AI safety organizations to say publicly.
Recursive self-improvement, or RSI, is the version of this that gets the most attention and generates the most controversy. The idea that AI systems could meaningfully improve their own successors sits at the center of a lot of disagreements about what AI development looks like at the frontier. Some view it as the path forward. Others view it as something that should not be attempted at all. Anthropic has been actively working on agentic alignment problems in parallel — a sign they are not treating automation as a shortcut around safety.
Karpathy's role at Anthropic can be understood as the execution plan for Clark's forecast. Clark identified the trajectory. Karpathy, with a working prototype of the loop and deep expertise in pre-training, is building it out at scale.
Where Google disagrees, and why it matters
Not everyone inside the frontier AI world is aligned on this path.
From what has been made public, Demis Hassabis at Google DeepMind appears to be placing more emphasis on world models: systems that develop a comprehensive understanding of physical reality, video, audio, and causal structure, rather than betting primarily on LLM-based code generation as the engine of automated research. His framing suggests he sees one or two additional breakthroughs as necessary before the kind of recursive improvement others are chasing becomes viable.
Sergey Brin, Google's co-founder, reportedly returned to Google to build his own team specifically focused on large language models with strong coding ability, oriented toward automated research. This creates an unusual situation inside Google: its co-founder and the CEO of Google DeepMind appear to be pulling in somewhat different directions on this specific question.
On the Anthropic side, no one in a senior position is suggesting that additional foundational breakthroughs are needed before this kind of automated research becomes meaningful. They appear to believe the tools that exist now, applied recursively and at scale, are sufficient to begin compressing the research cycle significantly. This confidence shows up elsewhere too — Anthropic has been willing to hold firm on its principles even when major contracts are on the line.
Yann LeCun sits at a different extreme, arguing publicly that large language models will not lead to AGI and that this entire approach is fundamentally limited. He is in a smaller camp on this specific point.
My Take
What makes Karpathy's move interesting is the detail that is easy to miss. He was not brought in to talk about AI research automation. He was not given a broad mandate to explore promising directions. He was placed inside the specific team that controls pre-training at Anthropic, with a specific task: build a team that uses Claude to accelerate pre-training research.
That is not a visionary hire. That is an engineering hire. The vision is already settled. The implementation is the open problem.
These are the numbers that actually matter: 700 experiments in two days, on a home computer, with 30 lines of code.
Key takeaways
- Andrej Karpathy joined Anthropic's pre-training team on May 19, 2026, reporting to Nick Joseph, to use Claude to accelerate pre-training research.
- His Auto Research project (March 2026) ran 700 experiments in 2 days using a 30-line agent, finding ~20 stackable improvements and achieving an 11% GPT-2 training speedup.
- Pre-training is where AI labs spend the most money and where small efficiency gains compound enormously at scale.
- Jack Clark (Anthropic co-founder) gives a 60%+ probability to fully automated AI R&D by end of 2028.
- Google DeepMind and Anthropic appear to hold meaningfully different views on whether world models or LLM-based recursive loops are the more promising path forward.
Frequently asked questions
What is Andrej Karpathy doing at Anthropic?
He joined on May 19, 2026 to build a new team focused on using Claude to accelerate Anthropic's pre-training research. He reports to Nick Joseph, Anthropic's head of pre-training. The role is technical and operational, not public-facing.
What is the Karpathy Loop, and what is Auto Research?
Auto Research is an open-source project Karpathy released in March 2026. It is an AI agent that edits training code, runs short experiments (up to 5 minutes on wall clock time), evaluates a metric, and commits or reverts the change before repeating. The Karpathy Loop is the informal name for this recursive research cycle, coined by Fortune magazine.
What is recursive self-improvement in AI?
Recursive self-improvement (RSI) refers to the process by which an AI system contributes to improving future versions of itself, creating a feedback loop where each generation of AI helps produce a better next generation. In the context of Karpathy's work, this means using Claude to find better pre-training techniques that then get used to train future Claude models.
How much did Auto Research speed up GPT-2 training?
In Karpathy's initial 2-day run, the agent found approximately 20 stackable improvements that together reduced GPT-2 training time from roughly 2 hours to approximately 1.8 hours, an 11% speedup.
What did Jack Clark say about AI research automation?
In a public blog post, Anthropic co-founder Jack Clark stated that he assigns a greater than 60% probability to fully automated AI research and development, with no human involvement, arriving by the end of 2028. He framed this not as a certainty but as the most likely scenario that organizations in the AI space should be preparing for.
Whether the loop actually accelerates meaningfully at frontier scale is the open question. Within the next 6 to 12 months there should be enough signal, from published results or from conspicuous silence, to say something more definitive about whether this bet is paying off.
Source: Analysis based on the YouTube video "The REAL Reason Andrej Karpathy Joined Anthropic" by Wes Roth, published May 2026. Claims about specific figures and timelines are drawn from that transcript. Positions attributed to Demis Hassabis and internal Google dynamics are based on the speaker's interpretation of public interviews, not direct quotes.