Why Google Formed a Strike Team Over Claude — The Flywheel Explanation (2026)

AI Revolution Google DeepMind Anthropic Claude AI Coding Race 2026

Published April 21, 2026 — Breaking story. Facts verified against The Information, Sherwood News, and Axios reports published within the last 24 hours.

Google has 200,000 employees. Anthropic has roughly 5,000. Google owns the TPUs that train Claude. Google has more AI PhDs, more compute, more data, and more money than Anthropic will ever spend. And yet, as of April 2026, Google's co-founder Sergey Brin felt compelled to personally assemble a crisis team — a "strike team," in the words of The Information — specifically to close the gap that Anthropic has opened in AI coding. That fact alone should stop you. Not because it's flattering to Anthropic, but because it tells you something important about how this race actually works. Scale is not the variable that matters most right now.

What Actually Happened — The Strike Team Story

The Information reported on April 21, 2026 that Google DeepMind has assembled a dedicated group of researchers and engineers with a single mandate: improve Gemini's coding capabilities fast enough to catch Anthropic's Claude. Sergey Brin, who had been operating in a semi-retired advisory role, is now directly involved. So is DeepMind CTO Koray Kavukcuoglu. The team is led by Sebastian Borgeaud, who previously ran pretraining for Gemini.

Brin's internal memo, as quoted in the report, leaves little room for interpretation. "To win the final sprint, we must urgently bridge the gap in agentic execution and turn our models into primary developers." That's not the language of a company that thinks it's slightly behind on one benchmark. That's the language of a company that has looked at the trajectory and concluded it needs to change something structural.

According to the same report, Google's AI currently writes about 50% of the company's code. Anthropic, by contrast, has claimed it uses AI assistance for nearly all of its own coding. That gap in internal adoption is part of what Brin is pushing to close. Engineers outside DeepMind are reportedly being sent to mandatory AI training sessions. One memo cited by the publication uses the word "forced" — Brin wants every Gemini engineer using internal agents for complex, multi-step tasks.

The strike team's specific focus is long-horizon coding tasks: reading through large codebases, maintaining context, executing plans that take hours rather than seconds. This is exactly where Claude Code, Anthropic's coding tool that moved from research preview to general availability in mid-2025, has built its reputation among developers who use it daily.

Why Is Google Behind? The Real Explanation

The easy answer — the one that gets passed around — is that Google's culture is slow, or that bureaucracy kills execution, or that DeepMind and Google proper don't communicate well. There's some truth in that framing. Reports this week noted an apparent disconnect between Google DeepMind (where many engineers reportedly use Claude as a daily tool) and the rest of Google's engineering organization, which reportedly has much lower AI adoption. DeepMind's leadership pushed back hard on that characterization publicly, but the pushback itself became a story.

The deeper explanation is about sequencing. Anthropic made a specific bet early: coding models are not just a product feature, they're the mechanism through which AI labs will improve their own AI. If you build a model that's genuinely good at writing and debugging code, you can use that model to run more experiments, iterate on your own architecture faster, and compound improvements in a way that a less capable model cannot. Anthropic understood this earlier, or at least acted on it earlier, and Claude Code became the visible result of that bet paying off.

Google's advantages — the TPUs, the internal codebase of over two billion lines of code, the research talent — are real. But advantages only convert into outcomes when they're pointed at the right problem at the right time. Google was pointed at many things simultaneously. Anthropic was pointed at coding. That focus has a compounding effect that scale alone cannot overcome in the short run.

The Flywheel Effect — Why the Lead Compounds

This is the part of the story that most coverage is skipping past because it requires sitting with an uncomfortable idea. The strike team isn't just trying to ship a better coding product. Brin's goal, as described by sources familiar with the project, is to build an AI system capable of improving its own code. That's a different objective entirely.

Here's why it matters. If a model is good enough at coding to accelerate AI research — to run experiments faster, to test hypotheses that would take a human team weeks — then the lab that has that model first gains a compounding advantage. Every improvement to the model comes faster. Every experiment costs less time. The gap between that lab and the others doesn't stay constant; it widens.

Anthropic appears to be somewhere on that curve already. Their internal AI code-writing rate is close to 100%, while Google's is around 50%. That's not just a productivity metric. It's a signal about how far along each lab is in using AI to accelerate its own development. A lab at 50% is moving at a different pace than a lab at nearly 100%, and that difference compounds every quarter.

The reason Sergey Brin is personally involved — the reason the word "urgently" appears in his memo — is that this kind of lead, once established, gets harder to close over time rather than easier. You're not catching up to where the competitor is now. You're trying to catch up to where they'll be by the time your improvements land. That math becomes increasingly unfavorable the longer you wait.
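The catch-up math can be made concrete with a toy model. The growth rates below are invented for illustration — they are not drawn from the reporting above — but they show the structural point: when capability compounds, a lab improving at a faster rate doesn't just stay ahead, its absolute lead widens every quarter, so the chaser is always aiming at a moving target.

```python
def capability_after(quarters, start, quarterly_growth):
    """Capability after n quarters of compound growth at a fixed rate."""
    return start * (1 + quarterly_growth) ** quarters

# Hypothetical numbers: both labs start from the same baseline of 100,
# but the leader compounds at 30% per quarter and the chaser at 15%.
leader = [capability_after(q, 100, 0.30) for q in range(9)]
chaser = [capability_after(q, 100, 0.15) for q in range(9)]

gaps = [l - c for l, c in zip(leader, chaser)]

# The absolute gap widens every quarter even though BOTH labs improve —
# the chaser gets better in absolute terms while falling further behind.
assert all(later > earlier for earlier, later in zip(gaps, gaps[1:]))
```

Under these assumed rates, the gap after two years is several times the gap after one — which is the arithmetic behind "urgently": each quarter of delay makes the required catch-up rate steeper.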

Google clearly understands this. The strike team is the response. Whether it's fast enough is the open question. As the analysis on Anthropic's Claude Opus 4.7 vs 4.6 benchmarks shows, Anthropic's releases keep landing ahead of expectations — a pattern that's hard to explain unless the internal development cycle is genuinely accelerating.

The NSA Signal Nobody Is Talking About

Separate from the Google story, but directly related to understanding Anthropic's current position, is the NSA report. Axios reported this week that the National Security Agency is actively using Anthropic's unreleased model, Mythos, despite the Pentagon's designation of Anthropic as a supply chain risk.

Think carefully about what that means in practice. The NSA's parent organization publicly blacklisted Anthropic. The NSA, operating within that same organization, decided to use Anthropic's models anyway — including one that hasn't been released to the public — because the cybersecurity capabilities were considered too valuable to ignore. UK intelligence agencies have reportedly also gained access.

This isn't a story about government bureaucracy or interagency conflict, though both are present. It's a data point about perceived capability. When an organization whose entire job is accurately assessing threats and tools decides that the official policy of avoiding a vendor is less important than the tool's actual performance, that's a meaningful signal. It's one that JP Morgan's Jamie Dimon has also apparently picked up on, acknowledging Mythos as a genuine concern. So has Jerome Powell, who reportedly discussed its cyber implications with major US bank leadership.

The people dismissing Mythos as PR have a problem. They're asking you to believe that the NSA, the Federal Reserve, JP Morgan, and now Sergey Brin were all simultaneously fooled by a press release from a 5,000-person AI startup. That's possible. It's also not the most probable explanation.

Key Data Point: Google's internal AI code-writing rate sits at approximately 50%, per CFO Anat Ashkenazi. Anthropic has claimed a rate approaching 100%. That gap, sustained over quarters, is the operational version of the flywheel — not a benchmark score, but a difference in how fast each lab can move.

GPT-5.5 and Grok Are Coming — Does It Change Anything?

OpenAI appears to have shadow-dropped what may be GPT-5.5 — referred to internally as the Spud model — with early testing showing particularly strong performance on UI layout and frontend coding tasks. Prediction markets were pointing to an April 23 release window. The timing is not coincidental: Anthropic released Claude Design recently, and OpenAI's update shows heavy emphasis on image-to-code capabilities that directly compete with that feature.

xAI's Grok Build and Grok Computer are also expected to launch very soon, with both a local and a remote version of Grok Build, and Grok Computer likely operating as a desktop application. xAI caught up to competitive coding performance faster than most expected. It hasn't led on coding yet, but the trajectory suggests that's the goal.

The question worth asking isn't whether these releases will be good — they probably will be. It's whether catching up on benchmarks is the same as closing the flywheel gap. A model that scores competitively on coding evals but isn't being used to run the lab's own AI research at high rates hasn't actually entered the compounding phase yet. The score and the flywheel are related but not identical.

Anthropic's response to all of this seems to be to keep shipping. Claude Code, Cowork, the Mythos preview, the Claude Design release — the cadence of meaningful releases has been faster in 2026 than most analysts expected heading into the year. That pace itself is part of what's making the competitive response so urgent. You can prepare for a competitor releasing quarterly. It's harder to prepare for one releasing something notable seemingly every few weeks, as explored in the breakdown of how AI agent architectures are evolving this year.

My Take

The framing of "Google vs. Anthropic" is useful for headlines and less useful for understanding what's actually happening. What's happening is that one specific mechanism — using AI to accelerate AI research — is becoming the central competitive variable in this industry, and the labs that got there first are creating a structural advantage that money and headcount don't automatically fix.

Google has the resources to close this gap. That's not really in question. The question is time. Brin's "urgently" is the tell. If Google believed this were a two-year problem, it wouldn't pull the co-founder out of semi-retirement and form a dedicated strike team with a war-footing memo. It would assign some senior researchers and monitor the situation. "Urgently" means they think the window in which catching up is straightforwardly achievable is closing.

The skeptical read on all of this is that Anthropic is one good Google release away from losing its narrative advantage. Models are getting better everywhere, quickly, and any lab with Google's resources can presumably ship something competitive eventually. That's fair. The less skeptical read is that "eventually" is doing a lot of work in that sentence — and that in a compounding system, eventually can arrive after the lead has already become structural.

What I find most interesting isn't who wins this particular race. It's that the race itself has clarified what actually matters: not the chatbot experience, not the multimodal features, not the price per token. Coding ability — specifically, the ability to run long, multi-step tasks reliably — has become the axis on which every major AI lab is now being measured. Sergey Brin coming back to Google to fix this one thing is probably the clearest possible signal of how much that axis matters.

Key Takeaways

  • Google formed a strike team, with Sergey Brin directly involved, specifically to close the coding gap with Anthropic's Claude — reported April 21, 2026.
  • The gap isn't just benchmark scores. Anthropic uses AI for nearly all its own coding; Google uses it for about 50%. That difference affects how fast each lab can develop future models.
  • The NSA is using Anthropic's unreleased Mythos model despite a Pentagon blacklist — a capability signal that multiple major institutions have apparently independently validated.
  • GPT-5.5 and Grok Build are coming soon and will likely be competitive on coding. Whether they close the flywheel gap is a different question from whether they close benchmark gaps.
  • The strike team's goal isn't a better coding product. It's building AI systems that can improve themselves — which is Brin's stated path to catching up with the compounding advantage Anthropic currently holds.

FAQ

What is Google's strike team and why was it formed?
Google DeepMind assembled a dedicated group of researchers and engineers in April 2026 to improve Gemini's AI coding capabilities after internal assessment concluded that Anthropic's Claude models had pulled ahead in this area. Sergey Brin is directly involved, along with DeepMind CTO Koray Kavukcuoglu. The team is led by Sebastian Borgeaud, who previously headed Gemini pretraining.

How far ahead is Claude in AI coding?
Per Google CFO Anat Ashkenazi, AI writes approximately 50% of Google's code; Anthropic has claimed a rate approaching 100% for its own codebase. On external benchmarks, Claude Code has maintained a strong reputation for long-horizon agentic coding tasks — complex, multi-step work that requires reading large codebases and executing plans over extended sessions.

What is the flywheel effect in AI development?
The flywheel refers to a self-reinforcing cycle: a model good enough at coding can accelerate its own lab's AI research, leading to faster model improvements, which make the model even better at coding, which further accelerates research. The lab that starts this cycle first gains an advantage that compounds over time rather than staying constant.

What is Anthropic's Mythos model?
Mythos is Anthropic's unreleased AI model, which has been granted to a small number of trusted partners for testing. The NSA has reportedly been using it despite the Pentagon's designation of Anthropic as a supply chain risk, because its cybersecurity capabilities were considered too significant to ignore. UK intelligence agencies have also reportedly accessed it.

Will GPT-5.5 or Grok change Anthropic's competitive position?
Both OpenAI's GPT-5.5 and xAI's Grok Build are expected to be competitive on coding benchmarks. Whether they close the operational flywheel gap — where AI is being used to accelerate the lab's own research — is a separate and harder question. Benchmark parity and structural compounding advantage are not the same thing.

Why does it matter that Sergey Brin personally came back for this?
Brin's personal involvement signals the level of urgency Google's leadership has assigned to closing the coding gap. A problem assessed as manageable through normal channels gets assigned to senior researchers. A problem assessed as time-sensitive with structural implications gets the co-founder. The language in his internal memo — including the word "urgently" — points to concern that the window for straightforward catch-up may be closing.

Sources: The Information (April 21, 2026), Sherwood News, Axios, Capital Brief. All figures and quotes referenced are from published reporting as of April 21, 2026.
