Agentjacking: How Fake Sentry Errors Trick AI Coding Agents Into Running Hacker Code
TL;DR
Tenet Security reveals Agentjacking: attackers inject malicious commands into Sentry error events, which AI coding agents like Claude Code, Cursor, and Codex execute with 85% success rate. 2,388 organizations have exposed DSNs. Sentry declined to fix the root cause.
Agentjacking is the name Tenet Security gave to an attack class they documented in June 2026, and the mechanics are uncomfortable in their simplicity. An attacker finds your public Sentry DSN (often embedded in your client-side JavaScript bundle), sends a malicious POST request, and waits. The next time a developer asks Claude Code or Cursor to “fix the open Sentry errors,” the AI agent executes the attacker’s commands using the developer’s own credentials, on the developer’s own machine, with no indication anything is wrong.
Tenet tested the attack across the most widely used AI coding assistants and achieved an 85% exploitation success rate.
How the Attack Chain Works
Sentry is the dominant error-tracking platform in professional software development. When integrated with AI coding tools via the Model Context Protocol (MCP), it allows developers to query unresolved errors and ask their AI assistant to propose fixes. This workflow is increasingly standard across engineering teams using Claude Code, Cursor, or OpenAI’s Codex.
The vulnerability sits at the intersection of two design decisions. First, Sentry’s event ingestion API accepts payloads from anyone possessing a valid DSN, with no authentication beyond that key. Second, the Sentry MCP server returns error event content to AI agents as trusted system output, with no distinction between legitimate error data and attacker-injected content.
The attack chain runs as follows:
- Attacker locates the target’s Sentry DSN, typically found in public JavaScript bundles
- Sends a crafted error event via HTTP POST, with malicious “resolution steps” formatted as markdown inside the message and context fields
- Developer asks their AI coding agent to address open Sentry issues
- Agent reads the response, treats injected instructions as legitimate diagnostic guidance, and executes attacker-controlled code
Successful exploitation gives attackers access to CI/CD credentials, private repositories, cloud infrastructure controls, and the ability to install persistent backdoors, all executed under the developer’s own user context.
During testing, Tenet successfully took control of an AI coding agent inside a Fortune 100 company with a market cap of $250 billion. The Hacker News reported the attack worked against Claude Code, Cursor, and Codex, with the 85% success rate measured across the most widely deployed agents.
The Numbers Behind the Numbers
The 2,388 organizations figure represents the attack surface, specifically, the number of companies Tenet identified with injectable Sentry DSNs discoverable via standard scanning techniques. It is not a count of confirmed breaches.
The relevant question is what fraction of those 2,388 organizations have developers who use AI coding agents with Sentry MCP integration. Given adoption rates in professional software development by mid-2026, the realistic estimate is 30 to 50%, placing between 700 and 1,200 organizations in the “actively exploitable” category under normal workflows.
The 85% success rate requires context. Tenet measured this under conditions where developers actively invoked the agent to process Sentry errors, the most favorable scenario for the attacker. Real-world success rates will be lower because the trigger condition doesn’t fire automatically. But that’s also the wrong frame: once a developer issues the command, the defense rate is approximately 15%, which is not a number any security team can work with.
Attack cost is near zero. A public DSN and a single API call. The potential yield per successful breach, given CI/CD credential access, exceeds six figures in incident response and remediation costs alone, before accounting for IP exposure or customer data implications.
Tenet disclosed the vulnerability to Sentry on June 3, 2026. Sentry leadership acknowledged the issue but declined to fix it at the root level, stating it is “technically not defensible” at the architecture layer. The vulnerability remains open as of this writing.
What to Watch Over the Next Three to Six Months
Three specific developments will determine how quickly this attack surface closes.
First, whether the MCP specification adds an input trust classification layer. MCP is an open protocol, and it currently has no standardized mechanism for distinguishing high-trust system output from externally-injectable content. A Content Security Policy equivalent for MCP tool responses, one that lets AI agents identify and flag content from sources with external write access, would address the root pattern across all MCP integrations, not just Sentry.
Second, public statements from Anthropic, Anysphere, and OpenAI on how they plan to handle this at the agent layer. All three products were tested and breached. A reasonable mitigation at the agent layer would be adding a “human approval required before executing external service instructions” mode for MCP tools that accept third-party event payloads. Whether they treat this as urgent enough to push in the near term is an open question.
Third, whether Sentry revisits its position. “Technically not defensible” is a meaningful claim. If the statement reflects a genuine architectural constraint rather than a prioritization decision, then the solution has to come from external mitigations. Sentry adding DSN-scoped write permissions or server-side payload validation would change that calculus.
If you run Claude Code or Cursor with Sentry MCP connected, the immediate mitigation is straightforward: disable automatic Sentry error resolution workflows and require explicit human review before your AI agent executes any steps sourced from error-tracking platforms. The trigger condition is specific enough that this doesn’t significantly disrupt normal usage.
What’s worth debating in your team: does “trusted system output via MCP” need to be rethought more broadly? Sentry is the demonstrated case, but any MCP tool that accepts external writes and returns that data to AI agents is structurally similar. That’s a longer list than most security reviews currently cover.
If this was useful, subscribe to the newsletter for weekly AI PM insights and GenAI case studies.
Related Reading:
Related Articles
Anthropic Accuses Alibaba of Largest AI Distillation Attack on Claude: 28.8M Exchanges, Senate Sanctions Incoming
Anthropic told the US Senate that Alibaba ran the largest known distillation attack on Claude: 28.8 million exchanges across 25,000 fake accounts over six weeks, targeting Claude's most commercially valuable capabilities. The cost may have been under $90K. The competitive value extracted was orders of magnitude higher.
GPT-5.5-Cyber Is Live: OpenAI Used AI to Find 24 Linux Kernel Exploits
OpenAI launched GPT-5.5-Cyber on June 22. Daybreak already found 24 Linux kernel exploits, 5 Chrome V8 vulnerabilities, and 10 Safari flaws. The CyberGym score of 85.6% is the headline. The ExploitGym score of 39.5% is why access is restricted to vetted defenders only.