Project Fetch Phase 2: Claude Opus 4.7 Wrote Robodog Code 37x Faster. The Ball Stayed on the Floor.

Nils Liu

TL;DR

Anthropic's Project Fetch Phase 2 shows that Claude Opus 4.7 autonomously wrote robodog control code about 37x faster than the best unaided human team, in roughly one-tenth the lines of code that the Claude-assisted human team produced. The robodog still did not fetch the ball. The result is both a milestone and an honest map of where the limits are.

Anthropic published Project Fetch Phase Two on June 18, 2026. Claude Opus 4.7, operating autonomously with Claude Code and minimal human oversight, completed sensor integration, control code writing, and ball detection for a quadruped robot in about 9 minutes 35 seconds. The best unaided human team from Phase One needed 361 minutes for the same tasks. The robodog still did not retrieve the ball. Both facts matter equally.

What the Experiment Was Testing

Phase One, run in 2025, measured how quickly human teams completed the same robodog programming challenge with and without AI assistance. Phase Two replaced the humans with Opus 4.7 operating autonomously. A researcher plugged a laptop running Claude Code into the robot, entered an initial prompt, and approved commands at key decision points. The model did the rest.

The task list: connect the robot’s camera and lidar sensors, write motion control code, detect a beach ball on the floor, drive the robot toward the ball, and retrieve it.

Across three trials, Opus 4.7 finished all sensor and software tasks in roughly the same time each run. The 37.7x speedup is measured against the best unaided human team; against the Claude-assisted human team it is 18.9x. Opus 4.7 finished every task that at least one human team completed in Phase One at least ten times faster.
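The headline ratio can be reproduced from the timings quoted above. The short sketch below does that arithmetic; the function and variable names are illustrative, and the assisted team's time is derived here from the 18.9x figure rather than reported in this article.

```python
# Timings quoted in the article.
CLAUDE_SECONDS = 9 * 60 + 35   # about 9 min 35 s for Opus 4.7
UNAIDED_HUMAN_MINUTES = 361    # best unaided human team, Phase One
ASSISTED_SPEEDUP = 18.9        # quoted speedup vs. the Claude-assisted team


def speedup(baseline_minutes: float, new_minutes: float) -> float:
    """Return how many times faster the new time is than the baseline."""
    return baseline_minutes / new_minutes


claude_minutes = CLAUDE_SECONDS / 60
unaided_speedup = speedup(UNAIDED_HUMAN_MINUTES, claude_minutes)

# Derived, not reported: the assisted team's time implied by the 18.9x figure.
implied_assisted_minutes = ASSISTED_SPEEDUP * claude_minutes

print(f"Speedup vs. unaided team: {unaided_speedup:.1f}x")
print(f"Implied assisted-team time: about {implied_assisted_minutes:.0f} min")
```

The first figure matches the 37.7x in the text. The second suggests the assisted team needed roughly three hours, which is an inference from the two published ratios and not a number taken from Anthropic.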

The code difference is striking in its own right. Opus 4.7 produced 1,045 lines. The Claude-assisted human team wrote 10,309 lines for comparable results.

The physical retrieval failed. The robot located the ball and moved toward it. The fine-grained closed-loop control needed to actually push or grip the ball did not work.

What the Numbers Actually Mean

The 37x speedup is real. It also needs context before anyone draws conclusions from it.

Anthropic ran the measurements and published the results itself. The numbers deserve independent replication before they are treated as settled benchmarks. That caveat is not a knock on the research; it is standard scientific practice.

The more structural question is what fraction of real robotics work is software. Code writing is probably 20-30% of a robot development cycle. The rest is hardware integration, sensor calibration, field testing, and safety verification. If those phases take as long as before, even a 37x speedup on the software phase compresses the total timeline by only about 1.2-1.4x. A 4-6x overall gain would require software to be closer to 80% of the cycle.
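One simple way to sanity-check an end-to-end estimate like this is Amdahl's law, which assumes that every non-software phase takes exactly as long as before. The sketch below applies that model; it is an illustration under that assumption, not Anthropic's method. The function name `overall_speedup` and the 80% share are hypothetical additions for comparison.

```python
def overall_speedup(software_share: float, software_speedup: float) -> float:
    """Amdahl's law: overall speedup when only one share of the work gets faster.

    software_share   -- fraction of the total timeline spent on software (0..1)
    software_speedup -- how many times faster the software phase becomes
    """
    if not 0.0 <= software_share <= 1.0:
        raise ValueError("software_share must be between 0 and 1")
    if software_speedup <= 0:
        raise ValueError("software_speedup must be positive")
    # New total time, as a fraction of the old total time.
    remaining = (1.0 - software_share) + software_share / software_speedup
    return 1.0 / remaining


# 20% and 30% are the article's estimate; 80% is an illustrative extreme.
for share in (0.2, 0.3, 0.8):
    result = overall_speedup(share, 37.7)
    print(f"software share {share:.0%}: {result:.2f}x overall")
```

The result is very sensitive to the software share. Estimates that assume AI also shortens integration or testing would come out higher than this model allows.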

The difference between 1,045 and 10,309 lines also has a plausible mechanical explanation. Language models tend to find the most direct path to a working solution. Human teams often explore multiple approaches before converging, leaving exploratory code in their submissions. The lower line count from Opus 4.7 likely reflects a cleaner solution path, not necessarily higher-quality code.

The ball that stayed on the floor marks the actual technical boundary with precision. Sensor integration and code generation are tasks with clear API documentation and abundant training examples, and LLMs are strong pattern matchers at this kind of work. Closed-loop real-time control, including motor feedback, physical contact sensing, and grip mechanics, requires tight feedback loops, typically on the order of milliseconds, that language models cannot currently handle natively. Anthropic describes this as “the early era of physical agentic AI.” The word “early” is doing real work in that description.

What to Watch Next

Anthropic has a science-focused announcement event scheduled for June 30. Given the April acquisition of biotech startup Coefficient Bio for $400 million, the May hire of Andrej Karpathy to the pretraining team, and Nobel laureate John Jumper joining from DeepMind, the company is building research infrastructure that spans robotics and life sciences. Project Fetch Phase 2 landing this week is likely a prelude.

The meaningful metrics for a Phase 3, if there is one: the percentage of trials in which the robodog physically retrieves the ball, and the latency of the full sense-to-action loop. Code generation speed has been demonstrated. Closed-loop control is the next problem to solve.
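To make those two metrics concrete, here is a minimal sketch of how they could be computed from per-trial records. The `Trial` structure, the function names, and every number in the example are hypothetical placeholders, not measurements from Project Fetch.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Trial:
    """One hypothetical retrieval attempt."""
    retrieved: bool                 # did the robot bring the ball back?
    loop_latencies_ms: list[float]  # sense-to-action latency samples


def retrieval_rate(trials: list[Trial]) -> float:
    """Fraction of trials in which the ball was physically retrieved."""
    if not trials:
        raise ValueError("need at least one trial")
    return sum(t.retrieved for t in trials) / len(trials)


def median_loop_latency_ms(trials: list[Trial]) -> float:
    """Median sense-to-action latency across all samples in all trials."""
    samples = [ms for t in trials for ms in t.loop_latencies_ms]
    if not samples:
        raise ValueError("need at least one latency sample")
    return median(samples)


# Placeholder values for illustration only; not real experimental data.
example_trials = [
    Trial(retrieved=False, loop_latencies_ms=[120.0, 135.0, 110.0]),
    Trial(retrieved=True, loop_latencies_ms=[95.0, 100.0, 105.0]),
]

print(f"Retrieval rate: {retrieval_rate(example_trials):.0%}")
print(f"Median loop latency: {median_loop_latency_ms(example_trials):.1f} ms")
```

A median is used here because latency distributions tend to have long tails. A real evaluation would probably also report a high percentile, since a control loop fails on its slowest iterations rather than its typical ones.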

On the industrial side, the coding speedup demonstrated here, if reproducible in PLC programming and robotic arm control scripts, would significantly reduce software integration costs in manufacturing. That application does not require solving the retrieval problem. It only requires faster code writing, and on that metric, Project Fetch Phase 2 delivers.
