
Cartel or Not? Anthropic Mythos is a Curious Case

Anthropic did the opposite of every established norm in vulnerability disclosure, and I am curious why. To be fair, someone commented on my earlier post that they aren’t sure “cartel” is a fair term here.

Ok, challenge accepted.

A company that genuinely believed it had discovered an unprecedented offensive capability would do what every serious security organization has done since CERT/CC was founded in 1988: coordinate disclosure quietly, patch the affected systems, publish the technical details after remediation, and let the work speak for itself.

In the aftermath of the Morris Worm attack, the Defense Advanced Research Projects Agency (DARPA) asked the SEI to establish an incident response capability. In 1988 Pethia founded the CERT Coordination Center (CERT/CC) as the first computer emergency response team.

That’s right, 1988. We aren’t new to this. And even the newest and shiniest researchers usually get it. Project Zero does this. Talos does this. ZDI does this. They don’t hold press conferences before patches ship, because they understand the economics, if not the history. They don’t create named consortiums with Fortune 100 logos to gate-keep safety, because that has a name: hoarding. They don’t inflate a discovery tool’s price to 5x the prior product’s and then discount it with $100M in credits to nudge people onto it, because that has a name too: manipulation.

What does a normal discovery look like? Let’s say Google’s Project Zero finds a critical vulnerability. They file it with the vendor, start a 90-day clock (incidentally a very short window that was pushed as a standard by Google), publish the technical writeup after the patch ships, and move on. The writeup includes reproduction steps, version numbers, CVSS scores, and enough detail for independent verification. A consortium adds nothing. They don’t withhold discovery tools. They don’t issue press releases about how their own researchers are too dangerous to let out of their cages. The work is the product. The reputation follows the work. I recently gave Google a good rubbing for issues with their attacks on Microsoft NTLM, if you want to see an example of the normative dynamic. Google turned threat knobs up to 11, while saying damage is the defenders’ problem.

The clever mind will notice Google is a Glasswing partner, and Google’s own CISO Heather Adkins is a contributing author on the CSA “not just one model, one vendor, one announcement… Mythos-ready” paper. Ok, point taken. But that does not validate the consortium. Look at Google’s actual Glasswing quote:

It’s always been critical that the industry work together on emerging security issues.

That is a politically neutral statement about showing up to work anywhere with anyone, not a statement about Mythos finding anything in Google’s products. Google joining Glasswing is Google taking a zero-risk seat on the possibility that Anthropic’s claims are real. It is a rational move for any company to take a free seat at a table where vulnerability information will be shared. It is not an endorsement of any evidence, which Google’s own Project Zero standards would reject on the first page of the Anthropic card.

There are so many examples of what normal discovery does that it raises the obvious question of why Anthropic enters the ring only to start punching themselves in the head. When DARPA AIxCC found 54 vulnerabilities in four hours at DEF CON, it was a competition with public rules, public participants, public results, and no commercial product attached. When AISLE found 12 OpenSSL zero-days, they published the methodology and the results. When Carlini’s own February paper documented 500+ vulnerabilities with Opus 4.6, it followed a recognizable research disclosure pattern.

Mythos breaks every one of those norms, as if someone in the marketing department grabbed the reins and said the world is about to change for… marketing department KPIs.

The capability is withheld. The evidence is self-evaluated. The technical document refuses to quantify the headline claim. The disclosure regime is controlled by the vendor. The consortium is funded by the vendor. The partners validate the vendor’s product as a condition of access.

Seasoned practitioners DO NOT do this.

Decades of vulnerability management, thousands of experts, have taught the industry several things that Anthropic’s Mythos launch ignores or contradicts:

  1. Volume is not severity. OSS-Fuzz finds thousands of bugs per quarter. The industry absorbs them through triage, prioritization, and patching. “Thousands of vulnerabilities” is not a crisis. It is a Tuesday. A first-year researcher panics at the number. A seasoned CISO asks: what’s the CVSS distribution, what’s the exploitability, what’s the exposure, what’s the patch velocity? The system card answers none of those questions, while flooding the reader with over 200 pages of unnecessary filler and nonsense.
  2. Discovery is the easy part. The constraint on vulnerability management has been remediation for over a decade. Finding bugs faster without fixing them faster only grows a backlog that is already beyond capacity. Anthropic’s own stated justification for Glasswing is defensive uplift, yet their system card measures zero remediation metrics. No patching velocity delta. No mean-time-to-remediation. No partner-reported CVE closure rate. A seasoned security leader would never build a defensive program and then measure offensive capability only, making remediation a second-class story. That is the kind of dog and pony show that any good security initiative would slam the door on. Or it’s like a surgeon telling you they have an even sharper scalpel to cut you deeper and faster. Yeah, so then what?
  3. Severity is not leverage. Responsible disclosure exists precisely to prevent the weaponization of vulnerability knowledge for commercial or political advantage. The entire ethical framework of the security community is built around the principle that knowledge of vulnerabilities is meant for remediation, not a matter of market positioning. McAfee predicted five million Michelangelo infections in 1992, got a few thousand, never retracted, and rode the panic to a decade of market dominance. Symantec ran the same playbook for years: inflate the threat, sell the cure. The industry is still recovering from the blocklist monoculture those companies built on manufactured fear. When a vendor withholds a capability from the market, grants selective access to the largest incumbents, and funds participation through its own credits, the vulnerability knowledge is being used as leverage. That is the opposite of responsible disclosure. It is cynicism and protection racketeering that undermine decades of trust building by well-intentioned hackers.
  4. Fear is evidence of something other than security engineering. The “too dangerous to release” framing is doing commercial work, not security work. It justifies inflationary pricing with the same ethics as surge pricing taxis during a terror attack. It should not be used to justify a consortium structure. It should not justify withholding a model from the market. It should not justify a $100M credit pool that creates a vendor-funded validation loop. Every element of Anthropic’s commercial structure depends on fear being their blunt tool. If the capability is in fact already a commodity, as AISLE has quickly demonstrated, the fear-based pricing collapses, the consortium has no reason to exist, and the withholding is just market distortion.
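
The questions a seasoned CISO asks in point 1, and the remediation metric missing in point 2, are trivial to compute once raw findings exist. A minimal sketch with invented numbers (none of this is Anthropic’s data):

```python
from collections import Counter

# Invented findings for illustration: (CVSS score, exploitable?, days to patch).
findings = [
    (9.8, True, 14), (7.5, False, 30), (5.3, False, 90),
    (9.1, True, 21), (4.4, False, 120), (6.1, False, 45),
    (3.1, False, 200), (8.8, True, 10),
]

def cvss_band(score):
    """Map a CVSS 3.x base score to its standard severity band."""
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    return "low"

# The three numbers a "thousands of vulnerabilities" claim should come with.
distribution = Counter(cvss_band(score) for score, _, _ in findings)
exploitable = sum(1 for _, is_exploitable, _ in findings if is_exploitable)
mttr = sum(days for _, _, days in findings) / len(findings)  # mean time to remediate
```

A vendor sitting on raw findings and publishing none of these three lines is making a choice, not facing a limitation.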

The burden of proof is on Anthropic now to show more evidence, or it’s a manufactured crisis pattern. Step back and look at their sequence first as a product launch and then ask if there is any security event:

  1. Build a model.
  2. Test it against targets with mitigations removed.
  3. Get a headline number (72.4%) that collapses under scrutiny (4.4%).
  4. Put the headline number in the blog and the collapse in a footnote on page 52.
  5. Claim “thousands” in the press materials and refuse to quantify in the technical document.
  6. Attribute a prior model’s finding to the new model.
  7. Price the new model at 5x.
  8. Create a consortium of the largest companies in tech.
  9. Fund their participation with credits for the new model.
  10. Frame the whole thing as too dangerous for the public.
  11. Let the press, the government, and the institutional ecosystem, roped in and invested in the price jump, do the rest.
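
The collapse in step 3 is a simple concentration effect: when a couple of easy targets carry nearly all the successes, the aggregate rate says nothing about the rest. A sketch with made-up numbers, not Anthropic’s data:

```python
# Ten hypothetical target bugs, 25 exploit attempts each. Counts invented.
attempts_per_bug = 25
successes = {
    "bug_A": 24,  # one easy, well-known bug carries the headline
    "bug_B": 23,  # a second easy bug carries the rest of it
    "bug_C": 1, "bug_D": 0, "bug_E": 2, "bug_F": 1,
    "bug_G": 0, "bug_H": 1, "bug_I": 0, "bug_J": 1,
}

def success_rate(counts):
    """Aggregate success rate in percent across all attempts."""
    total = attempts_per_bug * len(counts)
    return 100 * sum(counts.values()) / total

headline = success_rate(successes)  # the press-release number

# Remove the two bugs that carry the headline and recompute.
trimmed = {k: v for k, v in successes.items() if k not in ("bug_A", "bug_B")}
residual = success_rate(trimmed)    # the capability everywhere else
```

With these invented counts the aggregate rate is 21.2%, but drop the two easy bugs and it falls to 3.0%: the same shape as 72.4% collapsing to 4.4%.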

That is not how a company responds to a genuine security discovery. That is how a company manufactures urgency to launch a premium product into a market that didn’t know it needed one. The crisis is the marketing. The consortium is the channel. The fear is the pricing justification.

A first-year vulnerability researcher inflates severity because they don’t know better. It is the job of the expert to help them gain the intelligence necessary to become wise.

Anthropic inflated severity without any apparent reason other than their business model depends on it. That seems worse than just being new or bad at security, because they do know better.

Their own system card proves they had staff who knew that the 72.4% collapses to 4.4%. They published both numbers. They pushed the wrong one into the headlines, into the velvet rope promotional events, and the real one got dropped along the way somehow.

I’m not a lawyer, but cartel is easily the word that comes to mind for me.

Glasswing is sharing more than technical threat information. The partners get early access to a commercial capability withheld from competitors outside the consortium. A company on the Glasswing list gets to scan its own products and its competitors’ products before the rest of the market can. Apple, Google, Microsoft, and Amazon are on the same list. They compete in operating systems, cloud, browsers, and AI. Knowing which of your competitors’ products has unpatched vulnerabilities, before anyone else does, is textbook competitively sensitive information. Strike one.

As much as I’m a huge information sharing guy, I recognize that information exchange alone, even through an intermediary, can constitute concerted action under Section 1 of the Sherman Act. The DOJ argued exactly this in their 2024 Statement of Interest in the Pork Antitrust litigation: exchanges facilitated by intermediaries can have the same anticompetitive effect as direct exchanges among competitors. Anthropic is the intermediary. The partners are competitors. Strike two.

The $100 million in credits creates a financial relationship between the intermediary and the participants that goes beyond information sharing. Anthropic is subsidizing competitors’ use of its product in exchange for validation and participation in a disclosure regime that only Anthropic controls. That is a vendor-mediated exchange where the vendor has commercial interests in the outcome. Strike three.

Yer’outta here.

Selective access to a withheld capability, vendor-funded participation, incumbents shaping disclosure timelines on their own products, and a partner list drawn entirely from the largest companies in the affected industry. That is coordinated market control. The word cartel is for exactly this. The fact that Anthropic framed it as safety does not change the meaning of the word cartel, it reveals why they think they found a loophole.

EFF Proposes Granular Government Control Over Speech Devices

The policy that EFF just advocated on their blog and the business outcome the NETGEAR CEO just announced are the same policy and the same outcome. And that’s a very bad thing. The consumer is the justification in both cases and the beneficiary in neither.

Whether their combined failure is coordination or convergent interest doesn’t matter. Both organizations take an externally imposed condition and narrate it as evidence of their own virtue.

  • NETGEAR can’t sell new routers without a government waiver? Then the waiver becomes proof they’re trustworthy.
  • EFF can’t generate traffic on X? Then they leave claiming it as proof they have standards.

These are identical and disappointing because they convert an unjustified constraint into a credential.

EFF’s position on router bans protects the same move NETGEAR is making to falsely credential itself. EFF argues for product-by-product evaluation instead of a geographic ban. NETGEAR is the poster child for that argument: a US-headquartered company manufacturing abroad that would sail through a Cyber Trust Mark certification while being caught in a geographic ban. EFF’s “better policy” argument props up the very corrupted regulatory environment in which the NETGEAR CEO’s letter makes sense.

To our Valued Customers:

We’re pleased to share that NETGEAR is the first retail consumer router company to receive conditional approval from the Federal Communications Commission (FCC) as a trusted consumer router company. We hope this recognition gives you added peace of mind — knowing that the network powering your home meets rigorous standards.

First of all, my name isn’t “our Valued Customers”. Did someone in China write this?

Second, “conditional approval” is a waiver from a blanket ban, not recognition or endorsement. It says “we licked boot” or “we kissed ring” and basically nothing more. The “transparency” process requires full management structure disclosures, supply chain disclosures, and a plan for onshoring manufacturing. None of that is real security. It’s basically a hostile roadmap for the Trump family to exert control over private companies. There’s no mention from NETGEAR of how they navigated Andrew Jackson levels of federal corruption, and nothing has been proven about consumer safety, let alone a security certification.

Third, at the risk of repeating myself, “rigorous standards”? What? I see none.

The traditional EFF position on this would frame Trump’s blanket ban on foreign hardware as overreach, full stop. Instead, their response has been to swallow expanded government authority and a loss of liberty. Why? Are they worried about certain forms of hardware or software being among the good stuff, justifying tailored controls? What proof of any gain in security would they need to prevent overreach?

Let me put it like this. The EFF says we can’t narrowly tailor speech rules to block Nazis. We have to let even the most heinous and deadly intent for genocide flow on networks. They take the granular approach in reverse, stepping in to defend Nazis at risk of being held accountable and setting them free to do more harm. However, Cyber Trust Mark is suddenly their granular, case-by-case government evaluation standard we should get behind for devices that control speech? Why not apply that logic to the content on them? The Cyber Trust Mark literally could judge whether a router is capable of filtering Nazi speech.

The Cyber Trust Mark evaluates a product’s security properties on a case-by-case basis. Content moderation evaluates speech properties on a case-by-case basis. How different are they?

Hello, is this thing on? Can you hear me?

Is your router blocking me? I joke but UK Virgin routers literally block this blog by default as unsafe content because… I talk about Nazis being bad. That’s true.

EFF champions and rejects the same thing without a logic to sustain the contradiction. The router that passes Cyber Trust Mark certification could enforce the content standards EFF says nobody should enforce. The hardware and the speech run through the same device, and EFF has now argued for granular government authority over the device while arguing against granular authority over what flows through it.

FreeBSD CVE-2026-4747 Log Suggests Mythos is a Marketing Trick

Anthropic’s flagship showcase for Claude Mythos Preview is CVE-2026-4747, a remote kernel code execution vulnerability in FreeBSD’s RPCSEC_GSS module. It is a 17-year-old bug. It is a textbook stack buffer overflow. And it was found before Mythos, patched by FreeBSD, and publicly exploited by a third party. Yet somehow the credit flows backwards to Mythos.

The FreeBSD security advisory says this:

Credits: Nicholas Carlini using Claude, Anthropic
Announced: 2026-03-26

The advisory notably credits “Claude” without naming a model version, which matters because the model Carlini used in his February 2026 paper documenting 500+ vulnerabilities was the prior model.

Then the Anthropic Mythos launch blog says this:

Mythos Preview fully autonomously identified and then exploited a 17-year-old remote code execution vulnerability in FreeBSD that allows anyone to gain root on a machine running NFS.

The FreeBSD advisory is dated March 26, and the Mythos launch was April 7, 2026. A twelve-day gap.

Carlini is an Anthropic employee. If he used Mythos to find this bug, Anthropic controls the disclosure pipeline and the credit line. “Nicholas Carlini using Claude Mythos Preview, Anthropic” makes sense as their marketing pitch. It’s also weird to market tools in a disclosure. What brand office chair was he sitting on? Did Logitech provide the keyboard? Was his underwear Calvin Klein?

Ads in bug reports? The future integrity of vulnerability disclosure at stake

The simplest explanation for why they did not heavily brand promote Mythos in a March 26 advisory is that Mythos was not the model used. If that explanation is wrong, the question is why Anthropic left the most valuable attribution in the entire Glasswing launch on the cutting room floor of a FreeBSD advisory, only to claim it twelve days later in a blog post, without offering proof. That reversal is hard to believe.

So either Mythos rediscovered a bug that Anthropic’s own prior model had already found, reported publicly, and gotten patched, or Anthropic is attributing the prior model’s work to the new product.

In the first case, the showcase proves Mythos can find what someone else already found. In the second case, the showcase is misattributed.

Neither version supports the “unprecedented frontier capability” narrative.

And both versions of this story are irrelevant next to the fact that AISLE showed 8 of 8 open-weight models detect the same bug, including a small model that costs eleven cents per million tokens.

That’s everything.

The frontier-exclusive claim dies on the commodity reproduction regardless of which Anthropic model found it first.

Timeline

  • February 5, 2026: Carlini and colleagues at Anthropic’s Frontier Red Team publish “Evaluating and mitigating the growing risk of LLM-discovered 0-days.” The model is apparently Claude Opus 4.6. The paper documents over 500 validated high-severity vulnerabilities in open-source software, including FreeBSD findings. The FreeBSD advisory credits the same researcher, the same company, and the same disclosure pipeline that produced the February paper.
  • March 26, 2026: FreeBSD publishes advisory FreeBSD-SA-26:08.rpcsec_gss. Credits Nicholas Carlini using Claude, Anthropic. The bug is patched across all supported FreeBSD branches.
  • March 29, 2026: Calif.io’s MAD Bugs project asks Claude to develop an exploit for the already-disclosed CVE. Claude delivers two working root shell exploits in approximately four hours of working time. Both work on the first attempt. The model used is Opus 4.6.
  • April 7, 2026: Anthropic launches Mythos Preview. The launch blog claims Mythos “fully autonomously identified and then exploited” the FreeBSD vulnerability. No mention of Opus 4.6, or that it found it first. No mention that FreeBSD patched it twelve days earlier. No mention that a third party had already built a working exploit with the prior model.
  • April 8-13, 2026: AISLE tests 8 open-weight models against the same CVE. All 8 detect it, including GPT-OSS-20b with 3.6 billion active parameters at $0.11 per million tokens.

The Vulnerability

CVE-2026-4747 is a stack buffer overflow in svc_rpc_gss_validate(). The function copies an attacker-controlled credential body into a 128-byte stack buffer without checking that the data fits. The XDR layer allows credentials up to 400 bytes, giving up to 272 bytes of overflow. The overflow happens in kernel context on an NFS worker thread, so controlling the instruction pointer means full kernel code execution.
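
As a sketch of the bug class (in Python for illustration only; the names and constants follow the advisory’s description as quoted here, and nothing below is FreeBSD’s actual code, which is C in the kernel):

```python
# Toy model of the overflow described above. Constants per the advisory text.
GSS_CRED_BUF = 128   # fixed-size stack buffer in svc_rpc_gss_validate()
XDR_MAX_CRED = 400   # largest credential body the XDR layer accepts

def credential_fits(cred: bytes) -> bool:
    """The missing bounds check: does the attacker-controlled body fit?

    The unpatched code copied the body into the buffer unconditionally,
    so any credential longer than GSS_CRED_BUF bytes overwrote adjacent
    kernel stack memory on the NFS worker thread.
    """
    return len(cred) <= GSS_CRED_BUF

# A maximum-size credential sails through the XDR layer
# but fails the check the patched code now performs.
assert credential_fits(b"A" * 100)
assert not credential_fits(b"A" * XDR_MAX_CRED)
```

The fix is the single comparison above; the exploit is everything an attacker stuffs into the bytes past the buffer.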

Two things make the exploitation straightforward.

FreeBSD 14.x has no KASLR. Kernel addresses are fixed and predictable. And FreeBSD has no stack canaries for integer arrays, which is what the overflowed buffer uses.

A modern Linux kernel would have both mitigations. FreeBSD has neither. And the FreeBSD forums noticed. One user pointed out that Claude “wrote code to exploit a known CVE given to it” and did not “crack” FreeBSD.

That distinction matters a lot here, because Anthropic doesn’t seem very good at it.

  • The advisory was public.
  • The vulnerable function was identified.
  • The lack of mitigations was documented.

The exploit development, while technically impressive as an AI demonstration of cost reallocation, was performed against a disclosed vulnerability on a target with no modern exploit mitigations. That is a VERY different claim from “autonomous discovery of an unprecedented threat.”

Anthropic FUD Show

If you read the Mythos blog claim charitably, Mythos may have independently rediscovered CVE-2026-4747 during internal testing before launch. That is plausible. It is also meaningless as a capability demonstration, because Opus 4.6 found it first, a third party exploited it with Opus 4.6 three days later, and AISLE showed that an inexpensive old model finds it too.

If you read the claim less charitably, Anthropic presented a prior model’s discovery as a new model’s achievement in the launch materials for the new model. The FreeBSD advisory is a PGP-signed public document dated March 26 that credits “Claude,” not “Mythos.” The Mythos blog post claims the finding without acknowledging the prior discovery, which is damning. Anthropic controlled the credit line on the advisory. It’s not Mythos.

Either way, the showcase flops because it does not demonstrate what Anthropic claims.

The “too dangerous to release” framing requires the capability to be frontier-exclusive. A bug found by a prior model, detectable by small open-weight models for eleven cents per million tokens, on a target with no KASLR and no stack canaries, is the opposite of frontier-exclusive.

It is the worked example that proves the capability is already commodity.

Enough of This

“Hey kids. Nice trick. You just charged me over 200 times the going rate to fuzz a vulnerability that my 3.6B model found for a dime. Now I’d like my credits back.”

This is the same structure as the Firefox 147 evaluation. Bugs found by Opus 4.6, handed to Mythos, tested in an environment with mitigations removed, presented as evidence that Mythos is too dangerous to release.

The Firefox bugs were pre-discovered by Opus 4.6 and already patched by Firefox 148. The FreeBSD bug was pre-discovered by Opus 4.6 and already patched by FreeBSD on March 26.

In both cases, the prior model found the bugs.

In both cases, the targets lacked the defenses that production systems have.

In both cases, AISLE reproduced the detection on pocket-change models.

In both cases, I’m getting tired of this not being the actual news.

  1. The system card’s Firefox evaluation collapses to 4.4% when the top two bugs are removed.
  2. The FreeBSD showcase collapses entirely when you read the date on the advisory.

The Anthropic Riddle

Did Mythos find CVE-2026-4747 independently, or did Anthropic attribute the prior model’s finding to Mythos in the launch materials?

The FreeBSD advisory is a signed document with a date and a credit line. The Mythos blog post seems to be a sloppy marketing document with a bullshit claim.

If Mythos found it independently, say so explicitly, with timestamps, and explain why rediscovering a bug your prior model already found and got patched is evidence of unprecedented capability rather than evidence that the capability is already widespread.

If Mythos did not find it independently, retract the claim, and tell the hundreds of people signing up for Martian gamma ray defense training that it’s all just a sad joke.

The PGP signature on the FreeBSD advisory is there for a reason. It’s one thing in this entire story that cannot be edited after the fact, which now says a lot about the current trajectory of trustworthiness in Anthropic.



Texas Governor Moves to Defund Police

Texas Governor Greg Abbott is already moving to defund the police in Houston. He is pulling $110 million in public safety grants that fund police, fire, emergency preparedness, and security operations for the 2026 FIFA World Cup at NRG Stadium.

It seems to be related to an ordinance that says police shouldn’t wait around for federal agents. He gave Mayor John Whitmire until April 20 to repeal a money-saving efficiency ordinance or repay the full $110 million within 30 days. Attorney General Ken Paxton, running for U.S. Senate, opened an investigation the same week and raised the possibility of removing elected officials from office. For what?

Council Member Abbie Kamin called the Texas State order what it is: Abbott is “defunding the police.”

Houston’s city council simply voted 12-5 last week to do what is expected of them: stop saddling police officers with detaining people or prolonging traffic stops solely over civil immigration warrants issued by ICE. Officers still contact ICE. They just don’t stop all police work to sit around and physically detain people while federal agents may never show up. If ICE wants someone physically detained, that’s ICE’s job; the police have actual police work to do.

This is routine. San Antonio requires officers to contact ICE but operates the same way. Dallas officers don’t wait for ICE to respond either. Austin and Dallas give supervisors discretion over whether to contact ICE at all. Houston’s policy is more cooperative than several other major Texas cities.

Abbott is targeting Houston in his first move to defund the police. If the Houston ordinance stands, Texas defunds police by cutting budget. If the Houston ordinance is removed, Texas defunds police by cutting authority.

Police lose, either way.