Let’s recap what we know since April, when Anthropic’s marketing department started coal-rolling the industry with their nonsense about novelty. A model with 3.6 billion active parameters reproduced Anthropic’s flagship Mythos discovery, the FreeBSD RCE CVE-2026-4747, and the most consistent open-weight model in that test ran about six hundred times cheaper per token than Mythos.
The frontier is supposed to be the frontier, meaning the best model. But really, if you know history, the frontier was about immoral claims. And so today, the evidence points away from the frontier.
Set the marketing and history aside. Four documents, read together, form a single brief that further buries the Mythos. The best model available to you runs on your own inexpensive hardware. Cost and performance make the obvious case, so I’ll start there. The deeper case matters much more, and I suspect the PhDs at Anthropic don’t even know how to spell it: CIA.
Cost Considerations
The price gap was the first and easiest frontier to collapse. Niels Provos put an orchestration harness in front of older commercial and open-weight models, Opus 4.6, Sonnet 4.6, and Z.AI’s GLM 5.1, and discovered live zero-days for thirty to one hundred fifty dollars a codebase, including a reproduction of the 1998 OpenBSD SACK bug he wrote himself. Security Research Labs ran a Qwen3.6 model with roughly three billion active parameters on a Mac laptop and produced finding sets comparable to GLM-5 and Claude Opus 4.6 on two production codebases, in under ninety minutes, with zero human nudges. Vicki Boykis runs Gemma 4 on a 64GB Mac and gets agentic coding loops at about seventy-five percent of frontier speed and accuracy. The Ornith team trained a nine-billion-parameter model that matches dense models several times its size, and a flagship that matches Claude Opus 4.7 on the coding benchmarks. And for what it’s worth, I put https://lyrik.wirken.ai/ to the test and it matched two of the Mythos card’s flagship bugs for seventy-five cents.
The AI Security Institute then explained why the gap is smaller than the leaderboards suggest. Benchmark scores are protocol-dependent. Raise the token budget one to three orders of magnitude above the published default and performance climbs on FrontierMath, TerminalBench, HLE, and the cyber ranges. Fixed-budget evaluations understate capability, and the gap widens as models improve. The generational gains arrive as greater reach and reliability rather than token efficiency. A frontier score describes the harness and the budget as much as it describes the weights.
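To make the protocol point concrete, here is a toy sketch of how one fixed set of weights can post very different scores depending only on where the token budget is capped. The numbers in `TOKENS_TO_SOLVE` are invented for illustration and come from no benchmark; the function name is hypothetical too.

```python
# Invented data: tokens each task would need before the model solves it.
# None means the model never solves that task at any budget.
TOKENS_TO_SOLVE = [2_000, 15_000, 90_000, 400_000, 2_500_000, None]


def score_at_budget(tokens_to_solve, budget):
    """Fraction of tasks solved when every attempt is cut off at `budget` tokens."""
    solved = sum(1 for t in tokens_to_solve if t is not None and t <= budget)
    return solved / len(tokens_to_solve)


if __name__ == "__main__":
    # Same "model", four protocols: only the budget changes between rows.
    for budget in (10_000, 100_000, 1_000_000, 10_000_000):
        score = score_at_budget(TOKENS_TO_SOLVE, budget)
        print(f"budget={budget:>10,} tokens  score={score:.2f}")
```

Nothing about the model changes between rows. The score moves because the cutoff moves, which is the sense in which a leaderboard number describes the harness and the budget as much as the weights.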
So much for cost. The closed nature of the Anthropic releases seems to be intended to prevent the kind of research that proves their claims false.
Now comes the real reason to hold the model yourself. Many already know this, but let’s walk the CIA triad to be sure we’re on the same page.
Confidentiality
The customers who need a code review most are the ones forbidden to send their code anywhere. Finance, government, critical infrastructure. The SRLabs pipeline answers this directly. A cloud model designs the review from metadata alone, the local model reads the source, and a cloud model consolidates the findings. The proprietary source stays on the machine through all three stages. They are precise about the boundary, and so should we be: metadata crosses, so the accurate promise is that no source leaves the building rather than that nothing leaves. That distinction is the whole discipline. A local executor turns confidentiality from a contractual hope into a physical fact. The bytes that matter remain on a disk you control.
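The metadata boundary can be sketched in a few lines. This is not the SRLabs code, and the field names and function names are my own assumptions; the point is only that the outbound payload is built from a type that has no field for file contents, so source cannot cross by accident.

```python
import os
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FileMetadata:
    """Hypothetical record of what may leave the machine. No contents field exists."""
    relative_path: str
    size_bytes: int
    extension: str


def collect_metadata(root):
    """Walk `root` and describe each file without ever opening it."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            records.append(FileMetadata(
                relative_path=os.path.relpath(full, root),
                size_bytes=os.path.getsize(full),
                extension=os.path.splitext(name)[1],
            ))
    # Sort so the payload is stable across runs and filesystems.
    return sorted(records, key=lambda r: r.relative_path)


def outbound_payload(records):
    """Build the only thing allowed across the boundary: metadata fields."""
    return [asdict(r) for r in records]
```

Note what still crosses: paths and file names. If those are sensitive in your shop, they need hashing or redaction before the payload leaves, which is exactly the kind of boundary precision the paragraph above asks for.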
Integrity
Here the local model wins on a property the frontier surrenders by construction. Integrity is the correspondence between a claim and a process you can inspect. A capability you can replay is a capability. A capability asserted through an institution is a press release.
The local pipeline is fairly simple and repeatable. Provos publishes the IronCurtain harness, whose workflows are defined as finite-state machines in plain YAML. AISLE published nano-analyzer as a single Python file, and clearbluejar took that file, ran it on two open-weight models on one consumer GPU, recovered the same FreeBSD bug, and fixed the false-positive rate by adding one reachability stage that dropped the noise from thirty candidates to five. The work replays. You can rerun it, change one stage, and watch the result move. Boykis makes the same point from the inside: with a local model you watch the tokens arrive, change the context window, swap the quantization, and edit the system prompt while it runs. The box is open. And https://lyrik.wirken.ai was built with exactly this purpose in mind. Integrity is a required control, a prerequisite to doing the work at all.
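Here is a minimal sketch of what "change one stage and watch the result move" looks like, assuming every stage is deterministic. It is not IronCurtain's workflow format or nano-analyzer's code. The candidate findings, the `reachable_from_input` flag, and all function names are made up for illustration.

```python
import hashlib
import json

# Invented candidate findings; a real run would get these from a local model.
CANDIDATES = [
    {"id": "C1", "function": "parse_header", "reachable_from_input": True},
    {"id": "C2", "function": "debug_dump", "reachable_from_input": False},
    {"id": "C3", "function": "copy_option", "reachable_from_input": True},
    {"id": "C4", "function": "legacy_init", "reachable_from_input": False},
]


def triage(candidates):
    """Baseline stage: put candidates in a stable order."""
    return sorted(candidates, key=lambda c: c["id"])


def reachability_filter(candidates):
    """Extra stage: keep only findings reachable from attacker-controlled input."""
    return [c for c in candidates if c["reachable_from_input"]]


def run_pipeline(stages, candidates):
    """Apply each stage in order and return the final finding set."""
    result = candidates
    for stage in stages:
        result = stage(result)
    return result


def fingerprint(result):
    """Hash the result so two runs can be compared byte for byte."""
    blob = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


if __name__ == "__main__":
    baseline = run_pipeline([triage], CANDIDATES)
    filtered = run_pipeline([triage, reachability_filter], CANDIDATES)
    print(len(baseline), "candidates without the reachability stage")
    print(len(filtered), "candidates with it")
    print("replay matches:", fingerprint(filtered)
          == fingerprint(run_pipeline([triage, reachability_filter], CANDIDATES)))
```

With a real model in the loop, the replay only matches if sampling is pinned (fixed seed, fixed temperature, fixed quantization), which is something you can only pin when the weights are yours.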
The frontier offers the opposite trade. The Mythos checkpoint that AISI evaluated is one the public cannot run, scored under a protocol AISI’s own paper shows to be the lever that moves the number. The capability is real, perhaps. The evidence is an authority signature on a result you are invited to trust, like a self-signed cert in the age of Let’s Encrypt. Integrity asks for the actual root of authority and the details of the artifact itself. A model on your disk hands everything over in full, which is the transparency high security requires. A model behind an API hands you a number and a logo, which means nothing at all.
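One concrete thing a model on disk lets you do is pin the artifact itself. A minimal sketch, assuming you recorded a SHA-256 digest of the weights file when you ran your evaluation; the function names are hypothetical and the path is whatever your own setup uses.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so multi-gigabyte weights fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path, pinned_hex):
    """True if the file on disk is byte-identical to the one you evaluated."""
    return sha256_of_file(path) == pinned_hex.lower()
```

There is no equivalent check for a hosted endpoint. You cannot hash what you are never handed, so you cannot confirm that the model answering today is the one that produced the published score.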
Availability
The newest fact settles the matter. Access to Fable and Mythos was suspended in June 2026 under a Commerce Department export-control directive. A rented capability can be withdrawn by a regulator, a pricing committee, or a board. And the latest erratic, grudge-filled, targeted moves by Trump prove he can wag a finger at any person or company and immediately shut down all access to US technology under “sanctions” authority. No trial, no hearing, no warning: one minute you have US technology and the next minute it’s all gone, with no path to recovery. Whether a government should willingly undermine its entire economy and private sector is a moral question, but the business continuity risk numbers in tech speak for themselves.
Anthropic prices Mythos at roughly five times public Opus, from twenty-five to one hundred twenty-five dollars per million tokens, which is a second kind of withdrawal for anyone whose budget matters. Many firms in June are reporting token bankruptcy and shutting down AI access to reduce explosive spend. A capability that exists at the pleasure of someone else’s arbitrary pricing policy is a capability you are borrowing into debt.
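The arithmetic is worth doing once. A small sketch that takes the two quoted figures as the low and high ends of the per-million-token price and applies them to a workload of 200 million tokens a month. That workload is an invented number, and the function name is hypothetical.

```python
def cost_usd(tokens, usd_per_million_tokens):
    """Cost of a token volume at a flat per-million-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    monthly_tokens = 200_000_000  # assumed workload, for illustration only
    low = cost_usd(monthly_tokens, 25)    # low end of the quoted range
    high = cost_usd(monthly_tokens, 125)  # high end of the quoted range
    print(f"low end:  ${low:,.0f} per month")
    print(f"high end: ${high:,.0f} per month")
```

At that volume the bill lands between $5,000 and $25,000 a month, every month, at a price someone else sets.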
A model on your disk answers when you ask it. Its uptime is a property of your own infrastructure. No directive reaches it, no erratic price change locks you out, no quarterly access review applies. Availability stops being a service-level agreement and becomes a fact of ownership.
The brief
Confidentiality, integrity, and availability were always the job. The industry has never improved upon the simplicity and elegance of the triad, yet it is now confronted with an architecture that concedes all three to whoever holds the API. The work above shows the concession was a significant, preventable error. A model you hold satisfies this brief and proves Mythos was never about capability. The frontier offers an expensive route to a number you cannot replay and do not really control.
Choose wisely.



