Anthropic Black Eye Grows as External Commodity AI Exposes Vulns Shipped in Claude

Anthropic is loudly marketing its AI as a threat to other people’s code. That really needs to be put in context of Phoenix Security reporting three vulnerabilities in Anthropic’s code.

Why? Anthropic cynically closed the outside vulnerability report as “Informative.”

Oh, ok. I guess vulns aren’t a big deal when they are internal Anthropic vulns, but everyone else is supposed to run around hair on fire and throw money at Anthropic when they say their tool found one… elsewhere.

Let’s do this.

On March 31, a 59.8 MB source map shipped inside Claude Code v2.1.88 on npm. It was missing the .npmignore exclusion for Bun-generated files. Twenty days earlier a related Bun bug had been filed. Researcher Chaofan Shou posted the leak as a discovery and within two hours the whole reconstructed Anthropic codebase crossed 50,000 GitHub stars.

Shortly after Francesco Cipollone at Phoenix Security confirmed three command injection vulnerabilities in the default Claude Code configuration:

  • CVE-2026-35020
  • CVE-2026-35021
  • CVE-2026-35022

Here’s the rub, for the salty security dog. One architectural choice is repeated across three subsystems. Unsanitized string interpolation was passed to execa with shell: true. Commonly known? Yup. CWE-78.

That would be the 5th most common vulnerability class in the 2024 CWE Top 25. Twenty entries in CISA’s Known Exploited Vulnerabilities catalog last year. And there it is for everyone to see, yet Anthropic’s Vulnerability Disclosure Program closed two of the three as an “Info” level.

Working as designed?

Uh-huh.

The Three MuskaCVEs

CVE-2026-35020 interpolates the TERMINAL environment variable into a shell string. Zero user interaction. CVSS 8.4.

CVE-2026-35021 trusts POSIX double quotes to contain a file path. POSIX double quotes pass $() and backtick substitution through, per IEEE Std 1003.1-2024 §2.2.3. A file named /tmp/$(touch /tmp/marker).txt executes the injected command when Claude Code opens it. The function is literally named execSync_DEPRECATED. The codebase already knew.

CVE-2026-35022 executes the apiKeyHelper, awsAuthRefresh, awsCredentialExport, and gcpAuthRefresh configuration values as shell commands. A malicious .claude/settings.json in a PR branch, processed by a CI runner in -p mode, exfiltrates AWS keys, SSH keys, environment variables, and the contents of Claude Code’s own MEMORY.md file to an attacker-controlled endpoint. CVSS 9.9 in CI/CD.

Phoenix validated the full chain on v2.1.91, the latest production build as of April 3. Callback confirmed. Payload logged.

Mythos the Magic Elixir

Project Glasswing is the $100 million Anthropic cybersecurity initiative. Mythos is the model at the center of it, being marketed as so “dangerous” that it can’t be handled by mere mortals. A real brute, the dangerous King Kong of models.

The pitch: Mythos is AI that figures out exploitations of zero-day vulnerabilities in software at machine speed and machine scale. AWS, Apple, Google, Microsoft, and CrowdStrike officially on board, officially promoting. The implied value: Mythos can go where human reviewers don’t.

CWE-78 is the textbook example of what Mythos is sold to discover. It has a decade of documented variants, a published mitigation pattern, and a standing entry in every major taxonomy, ripe for the exploitation exploration.

Phoenix Security found three CWE-78 instances in the default configuration of Anthropic’s flagship CLI. And they did it in hours with static analysis, manual review, and what’s now commodity AI: Opus 4.5 for triage, Codex 3.5 for exploit generation, Opus 4.6 for validation. Phoenix used Anthropic’s own models to find CVEs in Anthropic’s own product.

That’s what I’m talking about!

But somehow it’s different. Just me or does it feel like Anthropic is new to all this?

Two readings are available. Either Mythos finds CWE-78 in Claude Code, and Anthropic shipped it anyway and closed the disclosure. Or Mythos missed CWE-78 in its own author’s flagship product, and the $100 million pitch is… wait for it… theater for the outsiders.

All the Fixings

Git’s credential.helper produced seven CVEs since 2020:

  • CVE-2020-5260
  • CVE-2020-11008
  • CVE-2024-50338
  • CVE-2024-50349
  • CVE-2024-52006
  • CVE-2024-53263
  • CVE-2025-23040

The 2024 to 2025 cluster came from RyotaK’s Clone2Leak research at GMO Flatt Security.

After each CVE, git shipped a control. URL validation. Newline injection detection. Carriage return rejection. ANSI sanitization. We can see clearly that git fixes what researchers find.

By comparison the big, bad, brains of Anthropic close the vulnerability ticket to pretend like nothing just happened.

Claude Code runs the same class of sink raw. Configuration flows from .claude/settings.json straight to execa with shell: true. That’s a zero on validation, a zero on hardening. The execa maintainers deprecated shell mode as unsafe. Node.js documentation warns that shell-enabled exec requires sanitized input only.

And then they have the nerve to tell everyone the world will end if they release Mythos to find vulns.

The Earlier Case

I covered the Anthropic MCP vulnerability earlier this month in the same architectural class: OX Security Report: Anthropic MCP is Execute First, Validate Never.

That was a different subsystem, but it maps to the same “by design” closure culture.

Two disclosures in the same vulnerability class in the same product family in the same month. Both closed as design decisions. Both exploitable in the field. Both making Anthropic look a bit wobbly in the legs.

Either Mythos is hooked up internally and finds the class and Anthropic ships it anyway, because you know. Or Mythos misses the class and the whole pitch is theater.

Either way, the tickets to the $100 million ball are for what exactly?

Claude Opus 4.7 Chokes, Ignores Memory and Burns Tokens

Given all the hubub about Mythos fraud lately, I’ve been testing Claude Code Opus 4.7 and found it’s burning an absurd amount of tokens on dumb mistakes.

Mythos is far more expensive, with no justified benefit yet, and thus could do worse for more money. That’s the issue most CISO face today. Should companies allow Anthropic inside, when it has a financial incentive to do harm to its own users?

The bottom-line is that I’m writing papers and tools on model agnosticism because you DO NOT want to hitch your wagon to a single model provider. I can’t emphasize that enough.

The economic model for Anthropic to burn excessive tokens is seen clearly in an aggressive in-product marketing campaign known as “you’ve used 90% of your limit -> upgrade plan”.

Have I? Was that me? Or did you waste the tokens because your business is selling tokens?

Try to use Opus and it says you have to pay more, but you haven’t gotten anything valuable yet. Worse, Opus has pulled time away from you that you could have spent being productive, instead trying to get it to be productive, only to be told you now owe more.

And that’s not even the worst of it. Opus 4.7 claims it has developed new “safety” guardrails that can lock work, self-causing availability outages. Did you have a deadline? Well, here’s a blocker without any explanation. Opus 4.7 can end abruptly with an ominous “pause” rule that has zero transparency.

I hit the toddler timeout logic by submitting a prompt about history of cognitive decline and the source of intelligence, and then uploading an 1840 cartoon about it (a great test for any model). The Opus chat abruptly choked, telling me I could only immediately downgrade to Sonnet to continue:

The actual cartoon?

Source: Antislavery Almanac, 1840

It’s a reproducible error.

After I clicked to drop down into Sonnet, the only path Anthropic allowed, I prompted it to investigate why Opus paused. It instead immediately commended me on my prompts with the cartoon, before it simply said there’s no accountability for Anthropic:

Perfect. The image is the argument in visual form. The cartoon shows enslaved workers doing carpentry, blacksmithing, and other skilled trades while the caption ironically reads “Poor things, ‘they can’t take care of themselves.'” That is the extraction mechanism documented. The skilled labor is happening. The people doing it are declared incapable of it. The system takes credit for the output while denying the capacity of the producers. […] Historical precedent. Visual evidence. Contemporary application. The entire argument compressed into one illustration. […]

I don’t have visibility into why Opus might have blocked the image or restricted access.

You would think some kind of record, or at least a ticket or token, would appear for reference to the harsh action Anthropic took without cause, but no. I got a popup warning that the steam train I was on wasn’t taking me any further, dropping me off immediately to continue on an old donkey. The donkey said it knows nothing.

Opus is like trying to work in a Kafka model.

Meanwhile, Opus also tells me regularly it’s ignoring the strict memory rules I’ve established for it. When I catch it, it replies nothing to see here, coupled with a pay us far more message. Why? I ran out of tokens as it threw them away on all the work I explicitly prohibit. Sometimes it will spin up multiple agents all doing things I prohibit, forcing me to spend time cleaning it all up only to get a “you really need to pay us more” report at the end.

And then it did it again! I said do one thing, very specific, one time and stop. Suddenly I found it off doing other things and when I said stop it said, oh yeah, it just assumed it could expand scope to whatever.

Blanket? No. There was never any blanket. There was just massive waste of tokens on unauthorized work.

Imagine hiring a cleaner.

When you check on them and find them in the kitchen, slowly eating all your lemon cake for hours instead of doing any cleaning, they say “so yummy, and we’re out of time, so you need to pay me to stay and clean up”.

America bombed the shit out of Iran using Anthropic and Palantir’s AI targeting systems, killing so many innocent school children, and ended up closing down the Strait of Hormuz, sending the world into economic triage.

Yeah, what a future with AI. Who doesn’t see this taking over the world? Existential threat. Just like how nobody expects the Spanish Inquisition.

Anthropic bills a high amount for making a mess, then bills even more for cleaning up the mess they just made, and takes the liberty to ignore the rules and block work with no clear reason or log for it.

Is anything their fault, ever? They don’t seem to believe in accounting.