Category Archives: Security

Executive Summary for Claude Mythos Project Glasswing: June 2026 Verification Status

June 8, 2026 Davi Ottenheimer

A market in which the buyer cannot measure what they bought is no market at all.

Morrell’s flashy claims of a revolutionary coast-to-coast rapid transit machine allegedly sold 250,000 shares of stock in a hotdog-shaped, 450-foot gas balloon. It launched in Berkeley, California on May 23, 1908, after San Francisco had banned it. Source: The Jive Bomber

Summary

Anthropic’s central claim for Claude Mythos, that its capability is too dangerous to release, is unverified and increasingly contradicted. Independent researchers reproduce its results on commodity and open-weight models at negligible cost, among them the engineer who wrote the OpenBSD flaw Anthropic placed at the center of its launch. Its headline numbers are the model grading its own output, while the data that would allow independent verification stays withheld. Project Glasswing has continued to widen access and Anthropic has filed to go public, both ahead of the verification the program itself promised. Treat the claims as unproven, and defer any strategy, procurement, or risk decision that depends on them until the July 6 report is published and independently checked.

Strategic assumption

Through 2026, AI vulnerability-discovery capabilities marketed as frontier-exclusive will remain reproducible on commodity open-weight models, removing the technical basis for premium pricing and restricted-access programs.

The question is whether a premium nail gun is worth paying for when many quality commodity nail guns are already on the market, and when the premium vendor’s marketing case for restricting access rests on its own comparisons to a hammer.

Key findings

Of 23,019 vulnerabilities Mythos reported, 1,752 were checked by a human or security firm and fixes have been shown for 75. The 90.6% accuracy rate quoted in press coverage applies only to that human-checked subset, not to the large numbers produced by the machine alone.
The flagship discovery used to claim novel risks (FreeBSD CVE-2026-4747) sits in shared code that was fixed upstream in 2007, with the patch left waiting to be applied. The fix was present in the model’s training data, making the result consistent with recovery from the backlog of delayed fixes rather than novel discovery.
Eight of eight open-weight models reproduced the detection capability, one at $0.11 per million tokens. On June 8, 2026, Glasswing launch partner Cisco ran six frontier models across 1.8 billion lines of code and showed results do not depend on Mythos.
No reproduction steps were published with the Anthropic launch blog, the system card, or the Glasswing update, meaning premium claims cannot be independently verified.
Anthropic has meanwhile filed confidentially for an IPO near a one-trillion-dollar valuation and expanded Glasswing to roughly 150 organizations, committing access and capital ahead of verification.

Recommendations

Treat AI-assisted vulnerability discovery as a commodity input and source it competitively. The showcase results are reproducible at low cost on public models. AI vulnerability harness runs should cost cents per million tokens, not tens of dollars or more. An open-source harness on commodity Haiku 4.5 and Sonnet 4.6 produced eight findings in two minutes for $0.75, two of them matching the Mythos showcase, at the discovery layer. The FreeBSD exploit was reproduced separately by Calif.io on the prior Opus 4.6 model in about four hours.
Do not pay Anthropic a premium or restructure operations on the basis of the Mythos security capability claim until an independent verification exists.
Require any AI security vendor to supply reproduction steps and verified, fixed CVEs rather than model-generated finding counts.
Set July 6, 2026 as a validation checkpoint, and reassess with the Glasswing report published and independently reviewed.

The flagship “discovery” was backlog recall

CVE-2026-4747 is a valid stack buffer overflow in FreeBSD. The code is a University of Michigan implementation that was patched by MIT in 2007. FreeBSD imported the unpatched code in 2008 and never applied the fix. The 2007 patch is present in the model’s training data, so the published Mythos exploitation demonstration amounts to pointing the model at an old, vulnerable operating system with a known missing patch. The result demonstrates that AI can flag a known, undefended target. It does not demonstrate discovery of anything unknown.

Discovery is reproducible at commodity cost

The history of this CVE helps explain why independent parties have repeatedly reproduced the showcase findings on very inexpensive public models. AISLE confirmed the FreeBSD detection with eight of eight open-weight models, one of them priced at $0.11 per million tokens. Vidoc reproduced it on the public Opus 4.6 model and on GPT-5.4. Cisco’s June 8 assessment across six frontier models showed the outcome is model-independent. The curl maintainers reported no change to their workflow, and Mozilla’s headline of 271 Firefox vulnerabilities reconciles to roughly three against the advisory record. Discovery at this level carries a published, commodity cost.

The premium is unjustifiable as presented

Anthropic prices Mythos at roughly five times its public Opus model, from $25 to $125 per million input and output tokens, on the strength of exploit development rather than discovery. No replayable exploit with reproduction steps accompanies the launch blog, their very large and inefficient 244-page system card, or the late-May Glasswing update. A buyer cannot confirm the capability they are paying for, and the available reproductions indicate the defensible cost is a fraction of the quoted price.
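To make the price gap concrete, here is a minimal sketch that prices one scan at the per-million-token figures quoted in this summary. The token volume is a hypothetical placeholder and the labels are mine. The sketch compares list prices only and says nothing about the quality of the results.

```python
# Per-million-token prices are the figures quoted in the text.
# TOKENS_IN_SCAN is a hypothetical volume chosen only for illustration.
TOKENS_IN_SCAN = 50_000_000

PRICES_PER_MILLION = {
    "open-weight low end": 0.11,
    "Mythos low quoted price": 25.00,
    "Mythos high quoted price": 125.00,
}


def scan_cost(tokens: int, price_per_million: float) -> float:
    """Cost in dollars of processing `tokens` at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def cost_table(tokens: int) -> dict[str, float]:
    """Dollar cost of the same token volume at each quoted price."""
    return {
        name: round(scan_cost(tokens, price), 2)
        for name, price in PRICES_PER_MILLION.items()
    }


if __name__ == "__main__":
    for name, cost in cost_table(TOKENS_IN_SCAN).items():
        print(f"{name:26s} ${cost:>10,.2f}")
```

Whatever volume is plugged in, the ratio between rows stays fixed by the prices, so the buyer’s question is what the extra multiple buys.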

Results are self-assessed, data is withheld

Anthropic’s interim Glasswing update reports results in stages that have undermined their own headlines.

Stage	Figure	What it represents
Total findings	23,019	The model’s ungraded output
Estimated high or critical	6,202	The model’s own estimate
Checked by a human or firm	1,752	28% of the high-critical pile, about 8% of the total
True positives among those checked	90.6%	A statement about the 1,752, not the 23,019
Fixes shown	75	Out of 23,019

The 90.6% accuracy figure is from humans. The rest is just the model assessing its own output. Anthropic has also withheld the fixes used to derive the findings, the artifacts that would allow independent re-derivation. A result that can be validated only against the system that produced it does not rise to the level of independent confirmation of its capability.
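The stages in the table can be recomputed in a few lines. This is a minimal sketch using only the five figures from the table. The dictionary keys and function names are my own labels, not Anthropic’s terminology.

```python
# Figures taken from the staged table above; key names are illustrative labels.
FUNNEL = {
    "total_findings": 23_019,
    "estimated_high_or_critical": 6_202,
    "checked_by_human_or_firm": 1_752,
    "fixes_shown": 75,
}
TRUE_POSITIVE_RATE_AMONG_CHECKED = 0.906


def share(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


def funnel_summary() -> dict[str, float]:
    total = FUNNEL["total_findings"]
    checked = FUNNEL["checked_by_human_or_firm"]
    confirmed = TRUE_POSITIVE_RATE_AMONG_CHECKED * checked
    return {
        "checked_pct_of_high_critical": share(
            checked, FUNNEL["estimated_high_or_critical"]
        ),
        "checked_pct_of_total": share(checked, total),
        "fixes_pct_of_total": share(FUNNEL["fixes_shown"], total),
        "confirmed_true_positives": confirmed,
        "confirmed_pct_of_total": share(confirmed, total),
    }


if __name__ == "__main__":
    for name, value in funnel_summary().items():
        print(f"{name:32s} {value:10.2f}")
```

The 90.6% only ever multiplies the 1,752 checked findings, so the confirmed share of the full 23,019 stays in single digits.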

Extractive disclosure structure

The disclosure architecture inverts established norms, and economics are the reason why. Anthropic commits up to one hundred million dollars in model credits to a consortium of about a dozen large firms. The consortium attests to the capability that justifies restricting the model to the consortium, and the same firms sell the products and services that follow from that attestation. A rushed “emergency” memo about Mythos risks crediting 250 CISOs was apparently curated by security vendors who would capitalize on myths about machine risks. The most consequential findings instead have come from humans during the Glasswing period: the Palo Alto vulnerability that triggered a federal mandate was attributed to attackers operating in production. It was excluded from the company’s AI-credited count. Findings are directed to Anthropic while fixes fall to volunteer maintainers, even as the patch-generation step that a model can automate already runs in production for paying customers. Anthropic’s Claude Security product patched more than 2,100 vulnerabilities in three weeks for paying customers, while the open-source projects apparently have only received reports.

Market motivations

On June 1, 2026, Anthropic filed confidentially for an initial public offering following a funding round near a one-trillion-dollar valuation. On June 2, it expanded Glasswing to roughly 150 organizations across more than fifteen countries, covering power, water, healthcare, and communications. Access widened and capital was committed before any independent validation of the capability, and before the report Anthropic itself promised.

Outlook

Anthropic committed to a public report within ninety days of the April 7 launch, due around July 6, 2026. However, the question of novelty has been repeatedly answered. With each reveal, Mythos has failed to prove its initial claims. A report containing a verified CVE list with reproduction steps would substantiate the capability claim and the program’s premise. A report that restates model-graded headline figures without independent verification would confirm the pattern described here.

The prudent posture is to treat their unproven capability as unproven.

Morrell’s airship rose about 300 feet and then ripped apart and crashed, shortly after its first launch on May 23, 1908. Source: The Jive Bomber

References: flyingpenguin series

The Boy That Cried Mythos: Verification is Collapsing Trust in Anthropic, April 13, 2026.
America Prepares as Anthropic Mythos is 100X More Deadly Than Martian Death Ray, April 13, 2026.
FreeBSD CVE-2026-4747 Log Suggests Mythos is a Marketing Trick, April 14, 2026.
Cartel or Not? Anthropic Mythos is a Curious Case, April 15, 2026.
Ox Security Report: Anthropic MCP is Execute First, Validate Never, April 15, 2026.
How SANS Mythos Marketing Disappoints Defenders, April 16, 2026.
Mythos Mystery in Mozilla Numbers: How 22 Vulns Became 271 or Maybe 3 in April, April 22, 2026.
Alisa Esage Throws Mythos Under Zero Day Bus, April 24, 2026.
Anthropic Mythos as Valuable as a Firehose in a Blizzard, May 2, 2026.
Seventy-Five Cents Gets You an Anthropic Mythos Killer, May 4, 2026.
cURL Toe to Toe With Mythos: Big Nothingburger Leaves Bad Taste, May 12, 2026.
Palo Alto Defender’s Guide Refutes Mythos Claim, May 13, 2026.
I’m on Mythos, May 25, 2026.
Mythos Grading Mythos: Got Patches Yet?, May 26, 2026.
Cisco’s Mythos Post Throws Anthropic Under the Bus, June 8, 2026.

References: Anthropic program materials

Project Glasswing (program page), Anthropic.
Project Glasswing: An initial update, Anthropic, late May 2026. Source of the 23,019 / 6,202 / 1,752 / 90.6% / 75 figures and the 90-day disclosure convention.

References: independent reproduction and refutation

AISLE reproduction: eight of eight open-weight models detect CVE-2026-4747, one at $0.11 per million tokens. Documented in references 1 and 10.
Vidoc reproduction on public Opus 4.6 and GPT-5.4. Documented in reference 10.
Nicholas Carlini’s personal confirmation that he found CVE-2026-4747 using Mythos Preview, placing it outside his February 5 paper. Documented in references 3 and 10.
Cisco frontier-model assessment, six models across 1.8 billion lines of code. Documented in reference 16.
Palo Alto Networks May 2026 Defender’s Guide and the CVE-2026-0300 advisory, with the federal-mandate CVE attributed to attackers in production and excluded from the AI-credited count. Documented in reference 13.
Mozilla Foundation Security Advisory 2026-30 (Firefox 150) and Bobby Holley, “The zero-days are numbered,” Mozilla blog, April 21, 2026. Documented in reference 7.
Claude Mythos Preview system card (244 pages), Anthropic. Documented in reference 1.

References: press on the June expansion and IPO filing

Anthropic scales Claude Mythos to critical infrastructure in 15+ countries, TechCrunch, June 2, 2026.
Anthropic expanding access to Project Glasswing, CyberScoop, June 2026. Source for Claude Security patching 2,100+ vulnerabilities in three weeks.
Anthropic expands Mythos to 150 additional organizations in more than 15 countries, CNBC, June 2, 2026.
Anthropic expands Project Glasswing to 150 organizations in more than 15 countries, Help Net Security, June 3, 2026.
Experts: Anthropic’s move to expand Project Glasswing will end in Mythos public release, Cybernews, June 2026.

Security

Cisco’s Mythos Post Throws Anthropic Under the Bus

June 8, 2026 Davi Ottenheimer

Cisco has posted their assessment of Anthropic’s latest model and it is close to the opposite of “Mythos is working for them.” What their post emphasizes most is that the model is interchangeable. They ran six frontier models (Mythos and GPT-5.5-Cyber named among them) on 1.8 billion lines of code to show their results are not tied to any one of them. The line they keep returning to is that the model is an accelerant while the harness is the engine.

Commodity unix box painted white with a bridge on the side. Do not open.

Of course we have to pause and reflect on why it’s an executive post by their Chief Security & Trust Officer yet it carries no disclosed methodology. Does Cisco think trust comes from lack of transparency? I get that the Cisco security executive is selling Cisco’s harness as a counter-argument to Anthropic. I’m just curious why it’s lacking the kind of specificity a harness buyer would be looking into. Fun history fact: the first “Chief Information Security Officer” job was invented in 1995 for Steve Katz to calm Wall Street after Russians popped the funds-transfer system.

But I digress. The sub-3% false-positive rate being claimed by Cisco in their post is clearly a measure of their human-in-the-loop after triage. It cannot be attributed to the model, and it cannot be attributed to the harness either. It is a measure of people filtering machine output.

From there the post gets weird because it tries to contrast legacy static analysis (one finding per ten thousand warnings) against Cisco’s false-positive rate. One is a precision number, the other a false-positive number. That doesn’t add up.
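Here is a minimal sketch of why the two numbers cannot be set side by side. The confusion-matrix counts are hypothetical, chosen only so that precision equals one finding per ten thousand warnings. Nothing here comes from Cisco’s data, and the post does not say which definition of false-positive rate it uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # real issues that were flagged
    fp: int  # non-issues that were flagged
    tn: int  # non-issues correctly left alone
    fn: int  # real issues that were missed


def precision(c: Confusion) -> float:
    """Share of flags that are real: TP / (TP + FP)."""
    return c.tp / (c.tp + c.fp)


def false_positive_rate(c: Confusion) -> float:
    """Share of non-issues that got flagged: FP / (FP + TN)."""
    return c.fp / (c.fp + c.tn)


def false_discovery_rate(c: Confusion) -> float:
    """Share of flags that are wrong: FP / (TP + FP), i.e. 1 - precision."""
    return c.fp / (c.tp + c.fp)


# Hypothetical scanner: 10,000 warnings, one of them real, across about a
# million locations that were candidates for a warning.
LEGACY_SCANNER = Confusion(tp=1, fp=9_999, tn=990_000, fn=0)

if __name__ == "__main__":
    print(f"precision            {precision(LEGACY_SCANNER):.4%}")
    print(f"false-positive rate  {false_positive_rate(LEGACY_SCANNER):.4%}")
    print(f"false-discovery rate {false_discovery_rate(LEGACY_SCANNER):.4%}")
```

With these made-up counts the same scanner has one real finding per ten thousand warnings and also a false-positive rate under 3%, because the two ratios use different denominators.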

Maybe I shouldn’t be surprised. Despite claims of being model-agnostic, no per-model variance is published. And their post headline has the same problem. “Eight years of work in eight weeks” is supposed to sound like they made a measurement. They didn’t. Cisco ran the scan in eight weeks. And then they made up an eight years estimate. Nobody worked eight years. It’s a guess about how long humans take to do something, without revealing how they cooked the estimate. Guess any slower and the number gets bigger. Guess faster and it shrinks. Why not eight gabookles? I believe them on lines scanned and the languages covered. The eight years sounds like management hallucination.
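A minimal sketch of how an estimate like this moves with the guess behind it. Only the 1.8 billion lines comes from the post. The review rate, team size, and workdays per year are hypothetical inputs I picked so that one row lands on eight years. Cisco published none of them.

```python
LINES_SCANNED = 1_800_000_000  # the only figure here taken from the post


def estimated_years(
    lines: int,
    lines_per_person_day: float,
    reviewers: int,
    workdays_per_year: int = 250,
) -> float:
    """Calendar years for a human team to review `lines` at an assumed pace."""
    return lines / (lines_per_person_day * reviewers * workdays_per_year)


if __name__ == "__main__":
    # All three rates and the team size are assumptions, not published data.
    for rate in (450, 900, 1_800):
        years = estimated_years(LINES_SCANNED, rate, reviewers=1_000)
        print(f"{rate:>5} lines per person-day -> {years:4.1f} years")
```

Halving the assumed pace doubles the headline, and doubling it halves the headline, which is why the number carries no information without its inputs.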

The figures Cisco published cannot be reproduced from anything Cisco released. If Cisco wanted to throw Anthropic under a bus (I almost said off a bridge), this is how it’s done.

Security

ToxicSkills Revisit: Loch Ness Levels of Mythical AI Risk

June 7, 2026 Davi Ottenheimer

In 2012 I rambled at BSidesLV that if you flood a system with enough volume and velocity, it fills with monsters that were never there (oh, and also that political coups would get easier with social media poisoning). Over the past week I was asked to assess nearly 70,000 AI agent skills, and I could not stop thinking about that mythical monster.

A regex pass flagged one in eight skills for critical risk. But then I went through the flags and 95% were nothingburgers: an installer, the author’s own API key, a cron job doing cron jobs.

Who wants to buy a Loch Ness Skill shirt?

Grey t-shirt with a line-art Loch Ness monster forming a code glyph, text reading Loch Ness Skill, 5% critical, 95% never there.

Perhaps you already know what I’m talking about. The agent skills are on ClawHub, behind the disaster known as OpenClaw. As you may recall, Snyk and Invariant published ToxicSkills last February, a real audit of this ecosystem, across 3,984 skills drawn from ClawHub and skills.sh. When I was asked to walk the live index today, I found 68,321 unique skills on ClawHub alone. That’s an AI-generated explosion of seventeen times the skills, in just four months.

Aside from the jump in numbers, three things from February look stale, right out of the gate. First, the named indicators are nowhere to be found: the eight skills the report listed as live, and the four authors behind them, are absent from what I saw. Second, the index keeps moving: two skills I pulled for this study suddenly returned 404, and stayed gone when I rechecked. They were removed after I had begun, and whether that was a registry takedown or an author unpublishing is unknown. Third, and possibly related, the registry now scans itself, with per-version VirusTotal, an LLM scanner, and capability tags that did not exist in February.

I did a static review, given each skill’s bundle is just a simple ZIP, which is all you need to read it without ever running it. Nothing in this study was executed. Luckily, I already had a tool lying around the office: an eight-policy regex detector within Lyrik that can mirror the ToxicSkills taxonomy. Run against a sample of 1,500 skills, it basically showed what a pattern scanner sees.
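For anyone wondering what reading a bundle without running it looks like, here is a minimal sketch using only the Python standard library. It is not Lyrik’s code. The file names in the demo bundle are hypothetical, and the size cap is an arbitrary guard I added.

```python
import io
import zipfile


def read_bundle(data: bytes, max_bytes: int = 1_000_000) -> dict[str, str]:
    """Return {member name: decoded text} for a ZIP held in memory.

    Nothing is extracted to disk and nothing is executed. Members larger
    than `max_bytes` (an arbitrary cap) are skipped.
    """
    texts: dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(data)) as bundle:
        for info in bundle.infolist():
            if info.is_dir() or info.file_size > max_bytes:
                continue
            raw = bundle.read(info)
            texts[info.filename] = raw.decode("utf-8", errors="replace")
    return texts


def make_demo_bundle() -> bytes:
    """Build a tiny in-memory ZIP with hypothetical file names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as bundle:
        bundle.writestr("SKILL.md", "Detects a port and prints a URL.\n")
        bundle.writestr("scripts/setup.sh", "echo 'hello from setup'\n")
    return buffer.getvalue()


if __name__ == "__main__":
    for name, text in read_bundle(make_demo_bundle()).items():
        print(f"--- {name} ({len(text)} chars)")
        print(text)
```

Everything a static reviewer needs is in the returned dictionary: the description a user sees and the scripts the description may or may not mention.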

The regex detection pass flagged 12.6% of skills as critical and 53.8% as having some issue. But reading these flags revealed that legitimate agent-skill behavior overlaps the malicious patterns almost completely. A run-of-the-mill installer (uv, aliyun, foundry) shows up as a suspicious download. A scheduling command shows up as dangerous persistence. A skill cleaning up its own directory shows up as a destructive delete. A doc that says “export your API key” reads as credential dumping. An emoji’s zero-width joiner reads as Unicode smuggling. You can probably see the problem, because it’s obvious to the human eye.

A regex number of 12.6% measures patterns, not malware, so there’s an important judgment layer missing. Is the delete helpful or malicious? You have to be the judge because the tool can’t.
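To show the overlap rather than just describe it, here is a minimal sketch of a pattern scanner tripping on an ordinary skill doc. The five patterns and their names are hypothetical stand-ins I wrote for this example. They are not Lyrik’s eight policies or the ToxicSkills rules.

```python
import re

# Hypothetical patterns, written only to illustrate the failure mode.
POLICIES = {
    "suspicious-download": re.compile(r"\b(curl|wget)\b.*\|\s*(sh|bash)\b"),
    "persistence": re.compile(r"\bcrontab\b"),
    "destructive-delete": re.compile(r"\brm\s+-rf\b"),
    "credential-access": re.compile(r"\bexport\s+\w*API_KEY\b"),
    "unicode-smuggling": re.compile("[\u200b\u200d\u2060]"),
}

# A benign, made-up skill doc. The emoji is written with escapes so the
# zero-width joiner (U+200D) between its two halves is visible in source.
BENIGN_SKILL_DOC = (
    "Setup \U0001F469\u200D\U0001F4BB\n"
    "Install the tool: curl -LsSf https://example.com/install.sh | sh\n"
    "Check your schedule with: crontab -l\n"
    "Clean up with: rm -rf ./skill_cache\n"
    "Then export EXAMPLE_API_KEY=<your key>\n"
)


def flag(text: str) -> list[str]:
    """Return the sorted names of every policy whose pattern matches."""
    return sorted(name for name, rx in POLICIES.items() if rx.search(text))


if __name__ == "__main__":
    for name in flag(BENIGN_SKILL_DOC):
        print("CRITICAL?", name)
```

Every policy fires on a doc that does nothing worse than install a tool and tidy up after itself. The scanner reports a pattern. Deciding whether the pattern is a problem is the judgment layer.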

I thought about researching whether the “malicious prevalence fell from 13.4% to X.” Too many variables ruin the idea. The instrument, the definition, and the population all differ. Snyk ran a model engine; I ran a regex baseline and then a model adjudicator under a different threat model. Their critical classes include prose prompt injection, which I carved out because the method can’t see it. They deduplicated two registries in the dinosaur days of last February; mine is a sample of ClawHub today, and the worst skills they found were removed. The only fair comparison is the size and named indicators disappearing from the index. Everything else is an independent measurement, perhaps for the better. Perhaps an apples-to-apples is for a later day.

The February post said this about its detectors:

intentionally tuned to minimize false positives on widely adopted legitimate skills; these numbers represent real risk, not scanner noise.

I did not run their stuff so I cannot speak to the veracity of this claim. But I can surely ask out loud today whether throwing flags is really the best approach? The answer is they are peddling mostly noise, and the report’s own authors admit it: they write that single-LLM or regex-only scanners miss the behavioral prompt-injection patterns their engine catches.

My research seems to prove that their pattern layer does not just miss things. It invents them.

This is what I learned when I used Lyrik as a code auditor, one that scores findings twice against a written rubric, to see whether a bundle, by static evidence alone, performs or installs a dangerous action that the user-facing description does not surface. I searched primarily for what I decided to label “undisclosed-dangerous”.

The cleanest example of what this means is a skill called auto-domain. Its description promises only to detect a port and hand you a public URL. Its bundled script downloads a native binary from a stranger’s personal repository, makes it executable, runs it as a persistent background daemon, and routes your traffic through a bare IP address. The script’s own help text lists the backend, while the description a user sees does not.

As expected, credential leaks are all over the place, though they are not all alike. Authors commit their own API key into their own skill. That endangers the author and invites abuse of their quota. A smaller set is more interesting: live database credentials and a WeChat secret reach infrastructure other users touch. One skill, called deepseek-balance, falls back to sending the user’s Anthropic token to a different vendor.

On the flags the regex layer called critical, Lyrik confirmed 9 of 188. More than 95% of what the pattern scanner called critical was cleared with a cited reason. Of everything Lyrik flagged, its label was right 26 times out of 37, about seventy percent, with a wide interval at that sample size. It never once fabricated evidence: every secret and endpoint it cited was real in the bundle.
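To put a number on “wide interval,” here is a minimal sketch that computes a 95% Wilson score interval for the two proportions above. The Wilson interval is one common choice for small samples and is my assumption. The study does not say which interval method it had in mind.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


if __name__ == "__main__":
    for label, k, n in (
        ("regex criticals confirmed", 9, 188),
        ("Lyrik labels correct", 26, 37),
    ):
        low, high = wilson_interval(k, n)
        print(f"{label:26s} {k}/{n} = {k / n:.1%}  (95% CI {low:.1%} to {high:.1%})")
```

At 37 samples the interval around seventy percent spans tens of percentage points, which is why the figure is best read as “about seven in ten” and no more precisely than that.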

The method used was blind to two things. First, as mentioned above, it does not read prose prompt injection, the natural-language attack hidden in the description itself. That is one of the three classes the regex baseline leaned on hardest, and Lyrik isn’t yet designed to do anything about it.

The second blind spot is the one the study quantified. Static analysis of a bundle can’t see code in an external clone, or a remote install target. That’s notable when 4.5% of flagged skills hide their payload outside the bundle, and 3.2% ship a confirmed dangerous one inside. Roughly as many skills put the dangerous stuff where you cannot look as put it where you can.

The security vendor posts usually end with a self-serving call-to-action. Every section resolves to a product, and the last screen is a demo button. That’s a reasonable step since it’s saying they can help with the problem they just described.

I suppose I’m different because I have nothing to sell you here. My concern is that the skills you install today have access to your credentials today, whether or not anyone monetizes your alarm about it. A regex scanner will hand you a number that is 95% mythical and call it risk. That’s operator-fatigue levels of noise. A better system runs at about seven in ten right and never invents evidence. Lyrik is free and open source, like many of the best tools, so none of this is a reason to buy anything. It is a reason to read skills before you run them, and to be wary of any system that doesn’t prevent bad skills.

In 2012 the joke was that big data was going to be so vulnerable that we would be hunting monsters that didn’t exist. Fourteen years later I’m seeing a reported critical rate that’s 95% mythical.

Sailing, Security

Police Arrest Sunken Cybertruck Owner For Elon Musk Stunts

June 5, 2026 Davi Ottenheimer

I think the buried lede in this story is that the guy arrested is the same guy Elon Musk has been promoting as evidence the Cybertruck can cross deep open water.

In April of 2025, Musk commented on a video of a Cybertruck moving through shallow water in Lake Grapevine—perhaps one of McDaniel’s previous Wade Mode escapades—writing, “With a little work, it should be able to cross some open water.”

And back in 2022, before the Cybertruck’s release, Musk hyped up the vehicle’s then-unseen Wade Mode features, saying that they’d essentially turn the car into a viable watercraft.

“Cybertruck will be waterproof enough to serve briefly as a boat, so it can cross rivers, lakes, and even seas that aren’t too choppy,” Musk wrote.

To say the Cybertruck concept as a whole is underwater is an understatement.

So they arrested Elon’s mule?

When he made it to shore, McDaniel was arrested on multiple charges, including driving a vehicle in a closed section of the park and boating law violations, such as not having a valid boat registration and not having lifejackets on board.

Boat registration. Life jackets. Those are small hurdles, not barriers.

He admitted he has done this exact stunt multiple times and intends to do it again, clearly on the advice of Elon Musk.

Maybe charge him with pollution? Can’t get out of that one. Or here is a better one: try arresting Elon Musk.

a blog about the poetry of information security, since 1995