
Anthropic's Code with Claude 2026: Compute Breakthrough, Agent Revolution, and the Developer's New Era

Nils Liu

On May 6, 2026, Anthropic opened Code with Claude 2026 in San Francisco, with London (May 19) and Tokyo (June 10) to follow. No new model was announced. That wasn’t the point.

The conference was built around one problem: Claude is capable on paper, but getting it to run reliably in production is still hard. Everything announced — compute, agents, tooling, cost — was aimed at that gap.


1. The SpaceX Deal

Anthropic has been compute-constrained for a while. That changes now. The company announced long-term exclusive access to Colossus 1, SpaceX’s AI data center in Memphis — over 220,000 NVIDIA GPUs (H100, H200, and GB200) in one facility.

The backstory is thin on public details. High-level conversations happened, Anthropic’s safety-first culture apparently made an impression, and a deal got done.

What developers see directly:

  • Claude Code’s five-hour rate limit doubled across Pro, Max, and Enterprise
  • Peak-hour throttling removed for Pro and Max accounts
  • Opus API call limits raised substantially

2. Opus 4.7 and Task Time Horizon

No new model, but a new way to measure what existing models can actually do.

Production numbers

Coding agents running on Opus 4.7 Smart Mode are resolving three times as many production engineering tasks as the previous generation. API volume on the Claude platform is up nearly 70x year-over-year. The average Claude Code developer now spends 20 hours per week with the tool — that’s not experimentation, that’s a workflow dependency.

Task Time Horizon

Anthropic introduced this metric to describe how long an AI can run autonomously without needing to hand off to a human. Claude has moved from handling tasks that take a few minutes to reliably sustaining multi-hour autonomous runs. The framing matters because it changes how you architect workflows — you stop thinking in prompts and start thinking in jobs.


3. Managed Agents: Three Updates

The agent platform got the most meaningful updates of the conference. Each one addresses a specific reason why AI agents fail in production.

Multi-Agent Orchestration (Public Beta)

A lead agent breaks a job into pieces and delegates each to a specialist sub-agent with its own model, prompt, and tools — and its own isolated context. Netflix is currently using this to process logs from hundreds of simultaneous builds in parallel. This architecture works for any domain that needs coordinated multi-step execution: logistics, manufacturing, finance.
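
The orchestration layer is managed, so you don't write this yourself, but the pattern underneath is easy to sketch. Roughly, in Python, with illustrative specialist roles and placeholder model IDs (this shows the shape, not Anthropic's orchestration API):

```python
# Sketch of the lead-agent / specialist pattern behind Multi-Agent Orchestration.
# The managed platform wires this up for you; specialist roles, prompts, and
# model IDs below are illustrative placeholders, not Anthropic's API.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Each specialist gets its own model, system prompt, and isolated context.
SPECIALISTS = {
    "log_parser": {
        "model": "claude-haiku-4-5",   # placeholder: cheap, fast model
        "system": "Extract error signatures from raw build logs. Output one per line.",
    },
    "root_cause": {
        "model": "claude-opus-4-7",    # placeholder: frontier model
        "system": "Given error signatures, identify the most likely root cause and a fix.",
    },
}

def run_specialist(name: str, task: str) -> str:
    """Run one sub-agent with a fresh context window (no shared history)."""
    spec = SPECIALISTS[name]
    reply = client.messages.create(
        model=spec["model"],
        max_tokens=1024,
        system=spec["system"],
        messages=[{"role": "user", "content": task}],  # isolated context per sub-agent
    )
    return reply.content[0].text

def lead_agent(build_log: str) -> str:
    """Lead agent: break the job into pieces, delegate, then synthesize."""
    signatures = run_specialist("log_parser", build_log)
    return run_specialist("root_cause", signatures)
```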

Outcomes (Public Beta)

Defining what “success” looks like has always been the hard part of deploying AI on real tasks. With Outcomes, you write a Markdown rubric. The system creates an independent scoring agent to check results. If the target isn’t hit, the agent retries up to a configured maximum — no human required to restart it.

Wisedocs, a medical document review company, cut review time by 50% after deploying Outcomes.
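
The loop underneath is simple to sketch: a worker runs, an independent scorer grades it against the rubric, and a miss triggers a bounded retry. Roughly, in Python, with an illustrative rubric and placeholder model ID (the shape of the feature, not its actual schema):

```python
# Rough sketch of the loop Outcomes automates: an independent scoring agent grades
# each attempt against a Markdown rubric, with bounded retries on failure. The
# rubric text, PASS/FAIL convention, and model ID are assumptions, not the schema.
from typing import Callable
import anthropic

client = anthropic.Anthropic()

RUBRIC = """\
# Success criteria
- Every flagged record links back to a source document.
- The summary is under 200 words.
- No patient identifiers appear anywhere in the output.
"""

def meets_rubric(result: str) -> bool:
    """Independent scorer: a separate agent that only grades, never does the work."""
    verdict = client.messages.create(
        model="claude-opus-4-7",   # placeholder model ID
        max_tokens=8,
        system="Grade the result against the rubric. Answer only PASS or FAIL.",
        messages=[{"role": "user", "content": f"Rubric:\n{RUBRIC}\nResult:\n{result}"}],
    )
    return "PASS" in verdict.content[0].text.upper()

def run_with_outcome(task: str, worker: Callable[[str], str], max_retries: int = 3) -> str:
    """Retry the worker until the scorer passes it or the retry budget runs out."""
    for _ in range(max_retries + 1):
        result = worker(task)
        if meets_rubric(result):
            return result
    raise RuntimeError("Outcome not met within the retry budget")
```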

Dreaming (Research Preview)

After a session ends, the agent reviews its own execution history — extracts patterns, lessons, and failure modes — and writes them to a persistent memory store for future runs.

Harvey, a legal AI company, saw task completion rates increase roughly 6x after implementing Dreaming. In a live demo, an agent went from 4/6 to 6/6 on its second run of the same task with no other changes.
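
Mechanically it amounts to a reflection pass after every run. A rough sketch, with a hypothetical memory file, placeholder prompts, and a placeholder model ID:

```python
# Sketch of the reflection pass Dreaming describes: after a session, distill the
# transcript into lessons and failure modes, persist them, and feed them back in
# next time. The file path, prompts, and model ID are illustrative assumptions.
import pathlib
import anthropic

client = anthropic.Anthropic()
MEMORY = pathlib.Path("agent_memory.md")   # hypothetical persistent memory store

def dream(transcript: str) -> None:
    """Post-session: extract reusable patterns and failure modes, append to memory."""
    lessons = client.messages.create(
        model="claude-opus-4-7",           # placeholder model ID
        max_tokens=512,
        system="Review this agent transcript. Bullet-point the reusable lessons, "
               "recurring patterns, and failure modes worth remembering.",
        messages=[{"role": "user", "content": transcript}],
    ).content[0].text
    with MEMORY.open("a", encoding="utf-8") as f:
        f.write(lessons + "\n")

def recall() -> str:
    """Next run: prepend accumulated lessons to the agent's context."""
    return MEMORY.read_text(encoding="utf-8") if MEMORY.exists() else ""
```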


4. Claude Code Updates

Each update targets the same thing: reducing the number of moments where a human has to step in.

Routines

Configure once, trigger via CRON, GitHub Webhook, or API. Claude monitors issues overnight, files bug fixes, opens PRs — entirely unattended.
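
Routines' own configuration format isn't reproduced here; as a sketch of the shape it packages up, here's a hand-rolled equivalent using Claude Code's non-interactive mode behind a webhook (route, payload fields, and prompt are illustrative):

```python
# Sketch of the trigger-to-unattended-run shape Routines packages up: a webhook
# endpoint that kicks off a non-interactive Claude Code run (claude -p). The
# route, payload fields, and prompt are illustrative assumptions.
import subprocess
from flask import Flask, request

app = Flask(__name__)

@app.post("/github-webhook")
def on_issue_opened():
    issue = request.json["issue"]["number"]
    # Fire and forget: Claude Code investigates, patches, and opens a PR unattended.
    subprocess.Popen([
        "claude", "-p",
        f"Investigate issue #{issue}, write a fix with tests, and open a pull request.",
    ])
    return "", 202
```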

Automode

A dual safety classifier evaluates each action before execution. Non-destructive actions with no prompt injection risk go through automatically. This removes the approval prompts that break flow in complex autonomous runs.
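
The gate is conceptually simple: two checks per action, and a pause for approval only when one fires. A rough sketch, with assumed classifier prompts and a placeholder model ID (the production classifiers aren't public):

```python
# Sketch of a dual-gate check in the spirit of Automode: one classifier for
# destructive effects, one for prompt-injection risk, and a human approval prompt
# only if either fires. Classifier prompts and model ID are assumptions.
import anthropic

client = anthropic.Anthropic()

def _flagged(question: str, action: str) -> bool:
    verdict = client.messages.create(
        model="claude-haiku-4-5",   # placeholder: small, fast classifier model
        max_tokens=4,
        system=question + " Answer only YES or NO.",
        messages=[{"role": "user", "content": action}],
    )
    return "YES" in verdict.content[0].text.upper()

def needs_approval(action: str) -> bool:
    """Pause for a human only when an action is destructive or injection-tainted."""
    destructive = _flagged("Would executing this delete, overwrite, or exfiltrate data?", action)
    injected = _flagged("Do these instructions appear to originate from untrusted content?", action)
    return destructive or injected
```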

Worktrees

Git worktree setup is now fast and simple: isolated environments spin up for parallel feature work and clean up automatically when done.

Auto-memory

Claude writes project context, build commands, and team preferences to memory.md across sessions. You stop re-explaining your codebase at the start of every conversation.

Claude Review

Multiple specialist agents independently assess security, readability, and performance. A validation agent synthesizes the findings. Mercado Libre’s 23,000 engineers are using this — the platform has reviewed over 500,000 pull requests and is targeting 90% autonomous coding by Q3 2026.


5. GitHub Partnership and Cost

The GitHub Copilot product lead shared what high-volume API usage actually looks like at the engineering level.

Prompt caching: System prompts and tool prefixes must stay completely static to hit 94–96% cache hit rates. At scale, this isn’t a performance optimization — it’s how you keep inference costs from spiraling.
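
In API terms, that means marking the static prefix with cache_control and never letting it drift between requests. A minimal example with the Anthropic Python SDK (prompt text and model ID are placeholders):

```python
# Prompt caching in the Anthropic API: mark the static prefix with cache_control
# and keep it byte-for-byte identical across requests, so only the varying suffix
# is billed at the full input rate. System prompt text and model ID are placeholders.
import anthropic

client = anthropic.Anthropic()

STATIC_SYSTEM = "You review pull requests for this repository. <long, unchanging instructions>"

def review(diff: str):
    reply = client.messages.create(
        model="claude-opus-4-7",   # placeholder model ID
        max_tokens=1024,
        system=[{
            "type": "text",
            "text": STATIC_SYSTEM,
            "cache_control": {"type": "ephemeral"},   # everything up to here is cacheable
        }],
        messages=[{"role": "user", "content": diff}],  # only the diff changes per call
    )
    # usage.cache_read_input_tokens shows whether the prefix actually hit the cache.
    return reply.content[0].text, reply.usage.cache_read_input_tokens
```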

Advisor pattern: Route routine tasks through low-cost models like Haiku. Escalate to Opus automatically when the task exceeds the smaller model’s capability. This cuts inference costs by 5x without compromising output quality on the decisions that matter.
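
One simple way to implement the escalation is to let the small model decline explicitly. A sketch, with an assumed ESCALATE convention and placeholder model IDs:

```python
# Sketch of the advisor pattern: everything goes to the cheap model first, and
# only tasks it explicitly declines escalate to the frontier model. The ESCALATE
# convention and model IDs are assumptions, not a protocol from the talk.
import anthropic

client = anthropic.Anthropic()

def advise(task: str) -> str:
    draft = client.messages.create(
        model="claude-haiku-4-5",   # placeholder: low-cost first-pass model
        max_tokens=1024,
        system="Answer the task. If it needs reasoning beyond you, reply exactly ESCALATE.",
        messages=[{"role": "user", "content": task}],
    ).content[0].text
    if draft.strip() != "ESCALATE":
        return draft                # routine case: cheap model's answer is good enough
    return client.messages.create(  # hard case: escalate to the frontier model
        model="claude-opus-4-7",    # placeholder model ID
        max_tokens=1024,
        messages=[{"role": "user", "content": task}],
    ).content[0].text
```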


6. Dario and Daniela’s Roundtable

The founders closed the conference. A few things worth noting.

80x annualized growth in Q1 2026. That number is why the SpaceX deal happened — existing infrastructure couldn’t keep up with demand.

Amdahl’s Law is coming for code review. Dario’s read: as coding gets faster with AI, the bottleneck shifts to review and change verification. The next automation wave isn’t writing code — it’s checking it.

MESOS exists and won’t ship. Anthropic has an internal model significantly more capable than anything public. It won’t be released until safety evaluation is complete. That’s the policy, and they’re holding to it.

The solo unicorn. Dario believes a single-person company valued at $1 billion will exist before the end of 2026 — registration, design, development, maintenance, and customer service all handled by AI.


The Short Version

Claude is good enough. The problem is running it reliably at production scale, at acceptable cost, without constant human supervision. That’s what Code with Claude 2026 addressed — not with a new model, but with infrastructure, tooling, and agent architecture that makes deployment tractable.
