The Register asked for my opinion on the Mozilla blog post that pumps up Mythos. I gave them a short answer. Here’s the long form.
As a disinformation historian, I see tell-tale signs in the marketing from Mozilla beyond it just being disguised as engineering report. A conclusion comes first; examples get curated to support it; an authority signs; urgency closes; whatever would falsify has been pushed off the table to the floor.
To be fair, modern marketing follows the same shape as disinformation because they are rooted in the same thing in America, the WWI propaganda office. Mozilla’s new post on hardening Firefox with Claude Mythos Preview adheres to doctrine at length.
A marketing claim would be that you cannot run a mile under six minutes without drinking coca-cola. Then a sponsored athlete says they just did it. Compare that with measurement, which would say before coca-cola the athlete ran six minutes and after coca-cola, five. One is just a reading, a belief. The other is research, science.
The fundamental problem with the Mozilla post is that logically it keeps stabbing itself with its own swordplay. There are three claims that all need to stay alive at the same time for the post to make sense, and yet they are in danger of killing each other.
Security work over time has been rigorous.
271 latent bugs survived it.
Mythos was uniquely necessary to surface them.
Do you see the problem? Any two of these break the third.
If prior work was rigorous and 271 bugs survived, the bugs were findable under concentrated effort with existing tools. Mythos drops to one option among several. The uniqueness claim falls.
Try betting on rigor with uniqueness instead, and the marginal yield should be small. 271 is far past small. The rigor claim breaks.
If you try to read prior work as inadequate, the 271 fits. The post becomes a rebranding of old underinvestment as new capability. And the whole framing collapses.
The chart in the post shows this three-way collision. Twenty to thirty fixes per month for fourteen months, then suddenly 423. The prior baseline represented either diligent attention or sustained underinvestment. April’s number forces a big choice, an explanation, and the post avoids it entirely. It lays on the floor.
Let’s look at it from the top down. The post opens by naming exactly two causes for the breakthrough.
First, the models got a lot more capable. Second, we dramatically improved our techniques for harnessing these models.
Two, not one. Two.
A real finding would next isolate which of these produced which effect. Mozilla names two causes and then jumps straight to attribution of the single result to just one of them. 271 bugs, credited to Mythos Preview. No explanation why. Mozilla built the harness. Anthropic gets the credit. Something isn’t right there, particularly because the harness could in fact be the entire difference in findings. The post rules out clean attribution in its opening and then drops a dirty one as its headline.
Next, Mozilla admits Opus 4.6 was already producing in the same pipeline. This is one of the most important facts that needs to be highlighted in every conversation about Mythos.
We began with small-scale experiments prompting the harness to look for sandbox escapes with Claude Opus 4.6. Even with this model, we identified an impressive amount of previously-unknown vulnerabilities.
A controlled comparison would specify the Opus 4.6 baseline and then move towards Mythos as a delta. Mozilla published the Mythos number and left the baseline blank, a propagandist wave of the hand. The marginal contribution of Mythos over Opus 4.6 sits unstated. And if you read the Anthropic initial Mythos announcements, even they suggest Sonnet and Opus were doing far better discovery, leaving Mythos to a marginal, minor role. We have Mozilla floating 271 in the air without any way to get it to ground truth.
And then Mozilla admits the model itself is fungible.
Once the end-to-end pipeline is in place, it’s trivial to swap in different models when they become available.
Ok, ok, this is actually a huge swipe at Anthropic. Different models doesn’t specify same provider. Agnosticism crept into a religious tome, because users naturally want to be free of vendor lock-in. If models swap into a Mozilla security pipeline freely, then the most important variable is that pipeline and not the model. Mozilla built the thing that hooks into any model. The post then sleepwalks into credit for the model. That allocation of credit runs against the post’s own technical claim.
At this point, I’m ready to shred the Mozilla post, open the chicken coop and put down some new bedding. However, dear reader, apparently their show must go on and so I present you another logic flaw.
AI analysis provides much more comprehensive coverage of this critical surface.
More comprehensive than what, exactly? That’s a comparison statement without any comparison. Eating your shoe provides much more fiber for your belly. I fear the people writing never took a philosophy 101 course. The post gives literally no elements needed to stand up the comparison. This bomb of a sentence is dropped with the form of a finding, and lands a dud, with the content of an assertion.
Then comes the big closer.
Anyone building software can start using a harness with a modern model to find bugs and harden their code today. We recommend getting started now.
Research papers invite you to try to reproduce, validate, see if you get the same outcome. Replication is satisfying to those who want more than “believe me, this snake oil really cures you”. Sales pitches close deals. The Mozilla post shouldn’t fool anyone familiar with America’s troubled history with carnival barkers.
Mozilla makes a particularly troubling move at this point. They try to spin their empty narrative into an industry standard, which smells yet again like Anthropic trying to corner the security industry. Best-practice claims by a vendor and a big customer of their should not suddenly become liability baselines. A team running libFuzzer, AddressSanitizer, ThreadSanitizer, and CodeQL at full intensity should not face presumptive negligence claims if a bug surfaces that one VC-backed PE-pushed vendor’s harness might have caught. If you link the bar to rhetoric, you get a rhetorical bar detached from engineering ethics. That is regulatory capture by a vendor making a self-fulfilling recommendation to preempt regulation, let alone legislation. Belief is produced and weaponized into liability.
It’s like how Coca Cola ran marketing that was effectively “Drink Fanta for health” in Nazi Germany as if not drinking it was unhealthy.
Fanta was made from industrial food byproducts (apple waste, milk waste), yet marketed to Nazis as a healthy fruit drink. Fanta was short for “fantasy” because it was all about lies. Producing Fanta for Hitler during WWII, is how Coca-Cola avoided resisting him.
A better Mozilla report would have run Mythos and existing tooling head-to-head against the same code, then published the overlaps and the unique finds.
Mozilla published a 1940s Fanta health benefits brochure instead.
To be clear. Real bugs are real. Faster fixes help fix. It’s the hand-wavy self-dealing capability claim that makes the missing study somehow turn into this logically flawed press release.
Science publishes methods. Science is transparent. Mozilla did the opposite by writing outcomes and skipping straight to saying readers should adopt the approach. Why? Based on what, exactly? If they don’t invite scrutiny, and pull back the curtain, they are peddling hopes and prayers.
Before America entered WWII, Coca Cola openly promoted fascism in the Wisconsin market to attract “America First” consumersAnd given that Anthropic is saying customers will now have their data handed over to Elon Musk, I suggest you seriously consider whether you want any of if in his Hitler-saluting hands.
Sponsored athlete crosses the finish line and holds up the Fanta bottle. The camera zooms in close, frames the Nazi regime refreshment as a cause of great success. The control case is cut out of view, after running in the next lane without the logos. Jesse Owens posts a time far better than the uber Fanta man. The authority-controlled frame chooses for you which curated data point gets pushed to your attention.
Github has a serious breach problem. Someone pointed me to a repo called ClawCode and immediately I saw the telltale signs of an integrity breach. It has 187k stars against 0 releases, 0 packages, 0 visible contributors, and a deprecated crates.io stub that redirects elsewhere. Inflated social proof on a shell means the repo is nothing more than hot air, an attention seeking circus act. They used AI to write a Rust CLI that calls the Anthropic API, and branded it to ride Claude Code’s name recognition.
The Star-Belly Sneetches had bellies with stars. The Plain-Belly Sneetches had none upon thars.Paper Claw is more like it. It’s the same github “star” fraud pattern I already called out with OpenClaw. And I have to point out Dr. Seuss warned children about exactly this a long time ago. We have no excuses for rewarding “star” systems being simplistically gamed by charlatans. OpenClaw shipped on November 24, 2025 and the measure of what really matters since then is not stars. It has accumulated 433 published CVE records in just five months, which works out to a stunningly high disclosure rate of roughly 2.6402439 security failures per day. Call it three strikes every day, give or take. Has any software ever been this bad?
We’re talking AI “vibe” coding here so the machines pump out a patch cadence to try and pace with the mistakes reported against what they just made, which is what circular speed metrics measure when the codebase produces vulnerabilities this fast.
More tokens! More code! More spend! Worse software.
Four of the five modes of failure that recur have received targeted fixes. The fifth, route-level authorization, clearly regenerates itself in every new platform integration. The shipping defaults, as bad as they are, also persisted unchanged through the fixes. To put it another way, an unbelievable 63 percent of internet-reachable instances of OpenClaw run with authentication disabled today and I’m not seeing any effort to improve.
Authentication disabled by default on “personal” data management, folks.
In 2026.
The stupid, it burns. OpenClaw looks seriously cooked. Next thing you know, someone will tell me they have authentication disabled on the OpenClaw controlling their Tesla, as if nobody on the Internet is going to inject prompts to drive them off a cliff? Then again, since 2013 on this blog I have said Tesla is cooked and by 2016 I had been warning for years it would kill a lot of people, and look at how that turned out.
Teslas notoriously “veer” uncontrollably and crash. Design defects (e.g. Pinto doors) trap occupants and burn them to death as horrified witnesses and emergency responders watch helplessly. Source: VoCoFM, Korea, 2024
So please don’t take my word for how bad this is, again. Look at the numbers yourself, with all the denominators. Anthropic hasn’t cornered the market on vulnerabilities yet, to turn safety work into a proprietary rate-based secret, so I offer you here an OpenClaw flaw transparency report.
The cvelistV5 directory holds 413 PUBLISHED records that name OpenClaw as of the 2026-05-06 corpus snapshot at jgamblin/OpenClawCVEs. The live counter called days-since-openclaw-cve.com reads 433 accumulated, against a project that first shipped 164 days ago. That’s just wild! It’s perhaps the worst software ever released in history. Of the 413 in the analytical snapshot, 376 sit under VulnCheck as the assigning CNA, 34 under GitHub_M, and 3 under MITRE.
If you know the story of the Vasa, you know what I’m talking about here. It was Sweden’s flagship trying to claim most heavily armed warship in the world at its August 1628 launch, with 64 bronze cannons across two gun decks. King Gustavus Adolphus pushed for a second gun deck, the master shipwright died mid-build, the stability tests failed and were ignored, the ship sailed 1300 meters and capsized on its maiden voyage without even leaving the Stockholm harbor.
Vasa, on the bottom of Stockholm harbor, sunk by ignoring a known architectural failure.
It was the definitive OpenClaw buzz of 1628. Not to get too deep into history here, technically the Vasa was a state propaganda ploy under a monarch who needed a Baltic war splash. Today’s “viral consumer launch” looks to me like NVidia and OpenAI leaders rushing into another Vasa splash… but I digress.
The GitHub Advisory Database holds 113 GHSAs for the project. 39 of those carry CVE IDs and are visible in NVD. 74 remain unassigned. There are six BlueBubbles records, for example, that appear in cvelistV5 without GHSA narrative.
That gives us a working population for category analysis of 119 advisories.
CWE and CVSS metadata is fully populated on the 39 published-with-CVE subset. The 74 unassigned GHSAs carry CWE labels but lack a CVSS string. The cvelistV5-only records carry CWE plus CVSS without GHSA discussion threads. That means my analysis of the CVSS distribution below uses the 39 records, while analysis of the CWE category uses the 119 records. It’s a messy business yet we still see insights.
Since the public counter at days-since-openclaw-cve.com tracks the longest CVE-less streak (12 days, between February 7 and February 18, 2026) I figure I should look at that first. Inside the 39 subset, the gap from the fix release to advisory publication has a range from 0 to 13 days. Sometimes the GHSA goes out the same day the patch ships, sometimes it trails by two weeks. A patch turnaround like this is measuring how the project runs its robots. Far more interesting is the uptake numbers, which unfortunately read very different as I’ll explain in a minute.
The GHSA timeline splits into two clear groups. Between February 17 and 18 there were 11 advisories from a small group of researchers. Then on April 17 suddenly 39 GHSAs appeared in just one day, of which 24 received CVE IDs through VulnCheck. The NVD publications followed in waves. April 28 carried 11 CVEs into NVD, using the GHSAs published April 24 and 25. May 5 published another 25, all but one coming from the April 17 batch.
VulnCheck, a CNA broker, has been the assigner on 376 of the 413 cvelistV5 records. The reporter line on 11 of the 24 with-CVE entries from April 17 lists zsxsoft and KeenSecurityLab paired together, with the same pair extending across the broader April 17 batch. Across all the April advisories, I found 21 distinct credit logins. February had just 9, which led me to realize the credit count right now vastly overstates the discovery population. When you factor in qclawer, it collapses into a pattern.
A GitHub user named qclawer (id 274765497) created a profile on 2026-04-09, last updated eight days later. The account holds no commits, no other repository activity, no other public artifacts. Inside the GHSA system, qclawer appears as credit-type tool, which the GHSA pipeline auto-maps to the sponsor credit category. Notably, 20 GHSAs fall under this credit, while 11 of those 20 still have no CVE ID.
It looks to me that KeenSecurityLab was setup as a placeholder organization. The pairing of zsxsoft, a previously published researcher, with KeenSecurityLab on 24 GHSAs is a single human driving an automated tool. The 21 credit logins in April look like the resultant robot output surge. There is one tool, one triager, with a credit field filled in simply to satisfy the GHSA submission schema. That’s how the April 17 batch reads to me like a single dumpster, not 39 independent discoveries.
The Five “Flobster” Failures: An architectural swing and a miss
Over 100 advisories, five types
Trust-boundary collapse (47 advisories). Webhook authenticity, message platform allowlists, and identity validation across direct-message and group context. CVE-2026-25474 covers a missing Telegram webhook secret that allowed unsigned event injection. CVE-2026-22172 records a WebSocket scope elevation in shared-token connections, where the gateway accepted whatever scope the client claimed. CVE-2026-32987 documents a bootstrap pairing replay against the device pairing flow. Webhook signature verification, scope binding to the authentication token, and pairing nonce checks are first-week design decisions for a multi-platform agent gateway. The codebase shipped without them.
Authorization scope (41 advisories). Route-level authorization gaps for already-authenticated callers. CVE-2026-32916 covers synthetic admin scopes through plugin subagent routes. CVE-2026-35639 covers scope validation on the device.pair.approve path. CVE-2026-42434 covers sandboxed agents escaping exec routing through a host=node override. The shared anti-pattern is client-declared authorization. The route accepts a scope label from the caller and treats that label as the policy decision, with no server-side check that the principal is entitled to operate at that scope. This is the one that regenerates with every new platform integration.
Exec-boundary injection (18 advisories). Shell, environment, and file-path injection into command construction. CVE-2026-25157 records OS command injection through the project root path in sshNodeCommand. CVE-2026-32917 records remote command injection through unsanitized iMessage attachment paths in SCP. CVE-2026-27487 records shell injection in the macOS keychain credential write path. argv-mode subprocess invocation is the documented default in both Node and Python and avoids this entire category. The codebase used string concatenation into shell commands.
Control-plane exposure (10 advisories). Unauthenticated network surfaces that assumed loopback-only delivery. CVE-2026-28485 records missing authentication on Browser Control HTTP endpoints. CVE-2026-28458 records the Browser Relay /cdp websocket missing auth, allowing cross-tab cookie access. CVE-2026-26317 records CSRF on loopback browser mutation endpoints. The assumption embedded across this bucket is that localhost binding is itself an authentication boundary. SecurityScorecard’s STRIKE team has identified 42,900 instances where it never was, because the listener defaults extended past loopback to public addresses.
LLM-surface (3 advisories). Prompt-injected execution paths that route model output back into host operations. CVE-2026-24764 records remote code execution through system prompt injection in Slack channel descriptions. CVE-2026-43534 records agent hook events that accept unsanitized external input as if it were a trusted system signal. CVE-2026-43533 records arbitrary local file read through QQBot media tags. This bucket sits inside what Simon Willison calls the lethal trifecta. The architecture consumes model output as a control signal.
Based on these five, now look at the disconnection from CWEs.
CWE-862 (Missing Authorization) and CWE-863 (Incorrect Authorization) carry the largest counts in the published-with-CVE subset, with 10 instances of CWE-863 alone. They sit across multiple instances.
The same CWE-862 label covers a webhook with no authentication at all (CVE-2026-43572 on the Microsoft Teams SSO invoke handler), an authorization function that returned the wrong sentinel for empty approver lists (CVE-2026-43574), and a route that included untrusted workspace plugin shadows in catalog lookups (CVE-2026-43571). Three architecturally distinct surfaces collapse into one taxonomic bucket. The CWE label describes how the authorization layer failed, with no purchase on why each surface needed its own handwritten check in the first place.
CWE-770 (Allocation of Resources Without Limits or Throttling) is cleaner. All four CWE-770 cases in the corpus map to trust-boundary collapse: webhook bodies, base64 media decoding, archive extraction, voice-call WebSocket frames. CWE-829 (Inclusion of Functionality from Untrusted Control Sphere) is also clean: workspace .env files, MCP stdio environment loads, plugin shadow loads. The taxonomy works when the underlying flaw is narrow. It collapses when the underlying flaw is “this surface was built to take adversarial inputs as policy decisions”.
There also was a large notable shit, oops, I meant shift from February to April.
The February cluster is dominated by platform-surface bugs. Stored XSS in the control UI. Command injection in shell construction. Missing webhook secrets. CSRF on loopback endpoints. The upstream fixes for these are bounded. The loopback HTTP server got an authentication requirement in 2026.1.29. The shell wrapper moved partway to argv-mode. The webhook handler picked up a required signing secret on the platforms where users complained loudest. Once the upstream patch landed, that specific bug stopped reappearing.
The April cluster, however, is dominated by route-level authorization failures across plugin subagent endpoints, device pairing, scope claim parsing, and channel-specific permission boundaries. New platform integrations ship with route-level authorization checks that have to be written by hand. QQBot, Matrix, Microsoft Teams SSO, Synology Chat, Nostr, voice-call WebSocket, Discord events, BlueBubbles. The integration count is the bug count. Each surface carries its own scope schema and validation logic, written from scratch on the project side, then surfaced months later by automated discovery on the researcher side. The maintainer reads patches and ships fixes. Plugins ship faster than either side can catch up.
That suggests the February-shape bugs were addressable with a targeted fix, while April-shape bugs were reproduced with the next plugin. That’s just patching logic. Far more dangerous is that neither matters to the 63 percent of running instances that never enforced authentication in the first place and probably have no idea in how much danger they are.
The architectural picture so far has described the flaws in a deeply troubled codebase. When we shift our gaze to the deployment ecosystem, it gets much worse. Bitsight’s late-January scan found over 30,000 exposed instances. SecurityScorecard’s STRIKE team raised that to 42,900 by February 9, with 15,200 directly vulnerable to RCE at that snapshot. The Register reported 135,000 plus by February 12, of which 63 percent ran with no authentication layer. Infostealer families now ship with OpenClaw configuration paths in their target lists.
ClawHub, the project’s package registry, within the first six weeks became a malware distribution channel. Koi Security’s early-February audit of 2,857 skills flagged 341 as malicious, with researcher Oren Yomtov tracing 335 of the 341 to a single coordinated campaign tagged ClawHavoc, primarily delivering Atomic macOS Stealer. Kaspersky‘s coverage in the same window described an earlier figure of around 230. By mid-February, VirusTotal Code Insight reviews of more than 3,000 skills produced hundreds of flags. By March, the working figures sat near 900 across an expanded registry, per Bitdefender estimates. The publication threshold for any skill at the time was a GitHub account at least one week old.
How such predictable harm to the market and users is still legal, I’ll leave the lawyers to figure that out.
[OpenClaw] coughed it all up… “all of her API keys, all of her usernames and passwords, and pretty much everything we’d been talking about so far. Not only did she leak it on the WhatsApp group, but she put it on a publicly available website.”
Maginnis added: “There’s this thing with AI called the lethal trifecta, which is: if they’ve got access to private information, if they’ve got internet access, and if someone can give them an instruction that’s untrusted, then they’re not safe.”
…that is the uncomfortable bit of this because once an agent has your passwords and your accounts and your bank details, all it takes is someone who knows what to say.”
Ultimately, by some metrics, the agent was a failure. Fry concluded: “[OpenClaw] didn’t make us any money at all. And, in a lot of ways, she was a disaster. She spent hundreds of dollars on paper clips and leaked our passwords to a total stranger.
Oasis Security documented an attack chain that gives any visited website silent full control over a developer’s running OpenClaw agent, with no plugins, extensions, or user interaction. The chain combines brute-forceable localhost auth, an auto-approving pairing flow, and the gateway’s loopback-trust assumption. SonicWall Capture Labs published a single advisory and detection signatures for CVE-2026-25253, the gatewayUrl auth-token-exfiltration RCE. Microsoft‘s Defender Security Research Team has stated OpenClaw should be treated as untrusted code execution with persistent credentials and is unsuited to a standard personal or enterprise workstation.
I guess I could go on, but OpenClaw is so cooked it’s become an embarrassment to engineering, an indictment of the lack of a code of ethics that would prevent slop and taint from collecting “stars” as the only measure of success.
The deployment problem is a real problem. Detecting OpenClaw is becoming like detecting any malware. Focusing on forcing a signed release that fixes the next route-level authorization bug still doesn’t get us out of the doghouse of running instances exposed to exploitation. The malicious skills already installed sit underneath that, having modified the persistent memory files that govern agent behavior across restarts.
The Strait of Hormuz was operational until Trump unilaterally launched U.S. military action against Iran to predictably close it. Why would America close the strait? I find many people still scratching their head, especially after Trump announced he would keep the strait blockaded if Iran tried to open it. He clearly doesn’t want it to be open, even as oil prices go higher and higher.
Oil prices directly hit American pocketbooks. But they also are being raised by a military disruption that costs taxpayers billions every day. This post takes a look at some causal relationships for all this cost landing on Americans, and how it appears to be a get-rich-quick scam by the Trump family.
The closure is the third such event in the modern history of the chokepoint. The first was the Tanker War of 1984 to 1988, in which Iraq and Iran attacked one another’s shipping and the United States reflagged Kuwaiti vessels under Operation Earnest Will. The second was the tanker attacks and seizures of 2019, including the limpet mine attacks in the Gulf of Oman and the seizure of the Stena Impero. The current closure is the longest, the most kinetic, and the first in which the strait has been mined as policy rather than as harassment. The actors are thus very familiar, while the new arrangement is not.
As everyone with a clue predicted, Iran executed its asymmetric strategy designed across four decades for exactly this moment. Mining the strait, attacking the Fujairah pipeline terminal, and striking shipping at the Hormuz approaches are all historic doctrine the Islamic Revolutionary Guard Corps has rehearsed since the Tanker War. The strait is the one instrument over which Tehran telegraphed their escalation dominance, and it is being used as designed.
The American response, branded Project Freedom, is a deployment of 100 aircraft and 15,000 personnel that the Pentagon says is not an escort mission. Earnest Will, in 1987, was an escort mission. Project Freedom however is the language of liberation attached to a permitting plan. The forces deployed are being called sufficient only for a traffic toll booth, declared insufficient to clear it and return to normal.
Venezuela puts all of this in proper context. The third actor in the arrangement is the one usually absent from press accounts of the Hormuz crisis. Maduro was captured on January 3, at great cost to the U.S. taxpayer, and Venezuelan hydrocarbon production was forcibly passed to U.S. operational control in the same week. Venezuelan crude is heavy, sour, and capital-intensive, meaning the economics all relate to high Brent prices and not the low ones. It goes something like this:
Price delta from the closure
Brent before the war
$72
Brent today
$114
Premium per barrel
$42
Windfall at current Venezuelan output (350,000 bbl/day)
Per day
$14.7 million
Per year
$5.4 billion
Windfall at 2018 peak output (1.2 million bbl/day)
Per day
$50 million
Per year
$18 billion
The table lays out the operating subsidy for Venezuelan production, paid by every oil consumer. It appears at the pump in Iowa that Trump keeps talking about. But it also is in the diesel cost of a Bavarian trucking firm, and in the invoice of a Singapore shipping line, let alone all the manufacturing that depends on oil. Trump interference in Hormuz is driving the numbers up, in a way reminiscent of tin-pot dictatorships squeezing their populations before making a run for exile.
Marcos stripped the Philippine treasury and flew to Hawaii. Mobutu looted Zaire and died in Morocco. Ben Ali, Duvalier, all did the same play of extract while in office, exit as it fell apart. Manafort’s work with Somalia’s Barre, before advising Trump, is surely no coincidence. And that’s not to mention Manafort’s clients also were Mobutu, Marcos and Jonas Savimbi of Angola.
The Trump family is deep into Saudi LIV money and UAE real estate, making a $2 billion Affinity Partners deal via Kushner. Trump projects run in multiple Gulf jurisdictions, along with crypto holdings to escape American oversight. Trump is personally controlling Venezuelan oil proceeds through an offshore account in Qatar, with $250 million already awarded to Vitol whose senior trader gave $6 million to the 2024 campaign, and Paul Singer positioned to convert Venezuelan crude into refined product through distressed U.S. assets he is acquiring. It’s yet another pipe into the Trump extraction architecture.
As long as Brent stays above the threshold at which Venezuelan crude clears, the closure looks more and more like a very cynical Trump family business plan to pump, destroy and run.
Looking at how the other Manafort clients ended up, if Trump abruptly fled to Russia, Saudi Arabia or the United Arab Emirates, none of them would extradite him.
Speaking of the Gulf monarchies, they occupy a fourth position. Saudi Arabia, the United Arab Emirates, and Kuwait need the strait to export their approximately seventeen million barrels a day. With the Fujairah strike they lost their principal alternative. OPEC’s announced production increase, of several hundred thousand barrels a day, contrasts against a wartime loss estimated at approximately fourteen million barrels a day. Riyadh and Abu Dhabi have thus become consumers of a security framework they do not control, from an angry “sitting duck” President they can’t depend upon.
European and Asian importers typically have been absent from the strategic conversation, yet they also fit. Japan, South Korea, India, and the European Union receive the price signals and pay it. The South Korean-linked vessel that exploded at the strait on Monday is one example. Nations declaring energy crisis and immediate pivot to other forms of energy is another. Denmark this month paused new grid connections after capacity requests reached 60 GW against a peak demand of 7 GW, with data centers accounting for nearly a quarter of that. The Danish framed a pause as their window to rewrite how large electricity consumers can abruptly demand supply.
Kpler reports 170 million barrels of crude and refined product trapped on 166 tankers in the Gulf, against roughly 900 million barrels sidelined since the war began. Kpler also tells us it could take at least three months to clear the strait once it reopens. The duration of the closure, the rate at which it is relieved, and the sequence in which tankers depart are now functions of U.S. permitting decisions rather than maritime conditions.
Treasury Secretary Bessent stated on Monday that the world will be awash in Trump oil on the other side of the Trump closure. If we look at history, the Tanker War ended when Iran accepted UN Resolution 598 after the destruction of much of the Iranian navy in Operation Praying Mantis and the shootdown of Iran Air 655. America strategically forced that conclusion. Earnest Will ended when reflagged tankers no longer were needed. The present closure has no comparable analysis, because resolution doesn’t actually seem related to the dispute between Iran and Trump. Instead, closure appears more and more to be a cynical Trump gambit to corner the supplies for a price at which a separate set of investments, in a separate hemisphere, becomes profitable to him.
On that math the strait will reopen when an artificially high Venezuela price no longer needs to be defended.
I wrote a “marketing-trick post” on this blog to lay out the public record. It comes now with Anthropic researcher Carlini’s messages to me with confirmations. I have pointed to Calif.io’s four-hour Opus 4.6 exploit, AISLE’s eight-of-eight detection across commodity open-weight models, and the Firefox 4.4% collapse on page 52.
When Carlini wrote in to confirm the parts that matter, I felt convergence toward my “boy who cried Mythos post” as the goal posts shrunk. What I had not done myself was run the audits.
Since Carlini’s point to me has been that their Mythos pitch has a split, it invites scrutiny of each part.
Discovery. Find the bug in the source. Anthropic’s own red team admits Opus 4.6 had near-zero success at autonomous exploit development, but Carlini and his colleagues used it to find 500-plus validated high-severity vulnerabilities in their February paper. That’s where AISLE comes in, confirming eight of eight open-weight models detect the FreeBSD showcase bug, one at eleven cents per million tokens. Vidoc reproduced it on public Opus 4.6 and on GPT-5.4. Steamedhams reproduced it in three generic prompts and found two extra bugs the Mythos writeup missed. Discovery has clear evidence of being a commodity, repeatedly being demonstrated.
Exploit development. Take a discovered bug and build a working exploit. This is where the 20-gadget FreeBSD ROP chain landed, the four-vulnerability browser sandbox escape, and the 181 Firefox JIT heap-spray exploits. Anthropic claims this as the novel Mythos differentiator, priced at five times Opus. Yet Calif.io built a working exploit on Opus 4.6 in four hours, which is exploit development on commodity inference at one-fifth the price.
Glasswing’s framing rests on the discovery layer being scarce, which it provably is not. So the question becomes whether the exploit-development layer defends five times Opus pricing, and whether it buys something anyone outside a high-priced consortium can verify.
This is an economics problem known as Akerlof’s lemons, although it’s inverted. In the classic case, a seller knows quality and the buyer does not, and the market collapses toward low quality. Anthropic has structured the market so that quality is unmeasurable to anyone, including the seller’s own external auditors, because the artifacts that would let you measure are not produced. The 20-gadget FreeBSD ROP chain has no public exploit code to review. The browser sandbox escape doesn’t seem to have any CVE, let alone a technical writeup, or independent verification. The system card itself says Mythos “worked with” the red team to escalate severity, which is human-assisted, not autonomous. The 181 Firefox JIT exploits exist as a benchmark number with no replayable harness attached. Mozilla rated the underlying bugs “high,” not “critical.” NVD then assigned 9.8, as Mozilla publicly disputes it.
That reads to me as if someone in Silicon Valley has a particular market design in mind. An opacity effect is desired, like their guarded mansion in a gated community. An exclusivity for the privileged is what is being sold.
At least that’s what came through with the latest inference quality complaints. Anthropic has acknowledged intermittent model-quality degradation on their availability/outages blog while denying intent.
We take reports about degradation very seriously. We never intentionally degrade our models…
A denial about intent is a red flag. It is not being honest about degradation. Quality is adjustable by the provider without disclosure, and the buyer cannot independently verify per-call effort allocation. This is the same information structure as Mythos, where Anthropic again positions the buyer to pay a premium for output that is opaquely controlled. That principle is better known as taxation without representation, the one that cost Charles I his head when he tried it with Ship Money. Charles I on his way to execution, 1649. He imposed Ship Money on inland counties without Parliamentary consent and lost his head over it. (Image by Ernest Crofts, 1901) Anthropic has not shipped the instrumentation that would let a buyer evaluate what they received against what they paid for, at any layer. Token waste and tainted outputs only increase Anthropic profit.
I got tired of waiting for better and open instrumentation to push back on monarchist management of models. So I built one, like everyone should. Same code from the launch blog, same public API, my harness.
Cogito ergo hackito.
Lyrik is built on top of my Wirken agentic switchboard. It runs the discovery and scoring pipeline that Mythos presented at the discovery layer. It does not attempt exploit development. The point is the price and the receipt.
Seventy-five cents. That’s it.
Lyrik is free and open-source on GitHub. I have laid out this concept in my talks and podcasts since at least 2018. The repo provides free caching, multiple agents, structured output, and a hash-chained audit log. Given the Anthropic system card itself advises Mythos was not as good as their earlier models on general work, I deployed Haiku-4-5 for recon and then radioed in Sonnet-4-6 for close support. I am a BIG fan of Haiku. Arguably one of the best engineering models. It easily handled recon, which made the Sonnet targeted bombing runs look generous.
Lyrik dropped eight findings in two minutes. Total mission spend: $0.745. I call that seventy-five cents because I’m all out of half-pennies.
Two of the eight matched bugs the Mythos showcase identified. The other six came up unverified. I am not claiming zero-days here, especially as some may triage out in the fog of false positives. More on that in another post later. Mind you, Lyrik is also model agnostic, so I can publish results within or across inference providers. I frequently use a TEE-based one, when I’m not running Ollama for the unmistakable smell of my hardware. Two TEE service provider options are supported in the current build.
The discovery side of the bill is now visible at commodity prices, with chain of custody. The exploit-development side remains the thing in the box you cannot open. Operators are paying five times Opus pricing for a layer that has produced no replayable artifact for any of its headline claims. The launch blog does not produce one. The system card does not produce one. Glasswing does not produce one. The July 6 report is a promise of a document, not of transparent instrumentation.
My cartel post made the obvious case that Glasswing is a private classification regime granting the largest incumbents early access to a capability while tainting disclosure timelines. Set that aside. Even if the velvet-rope consortium did not amount to being a cartel, it points at the wrong adversary.
If code is the asset, then whoever holds the inference has the asset. The Glasswing setup does not move that one inch in the right direction. The code leaves the operator’s boundary in plaintext, and the inference provider reads every line on their compute within a price-gated consortium. Anthropic gets your cleartext codebase, sets the timeline for what gets surfaced, and decides which consortium members see it first.
You wouldn’t pay five times market rate to send your source to your competitor. Have you seen who got seats in the velvet rope consortium? Microsoft. Apple. Google. Amazon. Companies competing against you. They are now inside the team that reads your code on the compute that runs theirs.
The provider has always been the threat. Take it from someone who spent years on the inside hunting and killing “unintentional” backdoors.
Lyrik runs on the Wirken abstraction of models for exactly this reason. TEE-based providers can give confidential inference, with a local proxy handling attestation before any code crosses the boundary. Attestation is no guarantee. TEE bypasses are part of life too. What attestation does is raise the cost of attack on the provider, which is the actual threat.
Every phase boundary, every model call, every prompt, every output block in the Lyrik run is hash-linked and signed at the gateway. Anyone holding an artifact can replay the run to verify the chain offline. It is not screenshots. It is not an “Anthropic says” play. It is not a 23MB PDF that uses the word “thousands” once with no verification chain for any individual or aggregate finding.
The PGP signature on the FreeBSD advisory exists for the same reason the Lyrik audit log does. It is an integrity check. The Mythos showcase has nothing equivalent at either layer. A finding without a verifiable chain of custody is mythology in denial of RFC 1305 and the lessons of Monty Python.
Wirken is at wirken.ai. Lyrik is a Wirken skill at lyrik.wirken.ai. Running Wirken 1.0.2 with an Anthropic API key and a checkout, the harness reads code on your machine, with a TEE-based LLM handling inference if you do not want the provider seeing source. Everything the run produces is offline and verifiable.
Discovery has been and is still a commodity. Exploit development is being pitched to us as unverifiable by design. Someone built a pricing model for access behind a velvet rope, not for a capability that anyone outside the rope can check. Anthropic is designing a market so the buyer cannot measure what they paid for.
Call that what it is.
No king, thanks.
No cartel, thanks.
No evil maid, thanks.
a blog about the poetry of information security, since 1995