ToxicSkills Revisit: Loch Ness Levels of Mythical AI Risk

In 2012 I rambled at BSidesLV that if you flood a system with enough volume and velocity, it fills with monsters that were never there (oh, and also that political coups would get easier with social media poisoning). Over the past week I was asked to assess nearly 70,000 AI agent skills, and I could not stop thinking about that mythical monster.

A regex pass flagged one in eight skills for critical risk. But then I went through the flags and 95% were nothingburgers: an installer, the author’s own API key, a cron job doing cron jobs.

Who wants to buy a Loch Ness Skill shirt?

Grey t-shirt with a line-art Loch Ness monster forming a code glyph, text reading Loch Ness Skill, 5% critical, 95% never there.

Perhaps you already know what I’m talking about. The agent skills are on ClawHub, behind the disaster known as OpenClaw. As you may recall, Snyk and Invariant published ToxicSkills last February, a real audit of this ecosystem, across 3,984 skills drawn from ClawHub and skills.sh. When I was asked to walk the live index today, I found 68,321 unique skills on ClawHub alone. That’s an AI-generated explosion of seventeen times the skills, in just four months.

Aside from the jump in numbers, three things from February look stale, right out of the gate. First, the named indicators are nowhere to be found: the eight skills the report listed as live, and the four authors behind them, are absent from what I saw. Second, the index keeps moving: two skills I pulled for this study suddenly returned 404, and stayed gone when I rechecked. They were removed after I had begun, and whether by registry takedown or author unpublish is unknown. Third, and likely related, the registry now scans itself, with per-version VirusTotal, an LLM scanner, and capability tags that did not exist in February.

I did a static review. Each skill’s bundle is just a simple ZIP, which is all you need to read it without ever running it. Nothing in this study was executed. Luckily, I already had a tool lying around the office: an eight-policy regex detector within Lyrik that can mirror the ToxicSkills taxonomy. Run against a sample of 1,500 skills, it showed what a pattern scanner sees.
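A static read of that kind can be sketched in a few lines of Python. This is a minimal illustration, not Lyrik’s code: the function names, the size cap, and the demo file names (SKILL.md, scripts/run.sh) are assumptions made up for the example.

```python
import io
import zipfile


def read_bundle_text(bundle_bytes, max_bytes=1_000_000):
    """Return {member_name: text} for a ZIP bundle.

    Members are read in memory only. Nothing is extracted to disk
    and nothing is executed.
    """
    texts = {}
    with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as zf:
        for info in zf.infolist():
            # Skip directories and anything suspiciously large (assumed cap).
            if info.is_dir() or info.file_size > max_bytes:
                continue
            raw = zf.read(info)
            texts[info.filename] = raw.decode("utf-8", errors="replace")
    return texts


def make_demo_bundle():
    """Build a tiny hypothetical skill bundle for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("SKILL.md", "# demo skill\nDetects a port.\n")
        zf.writestr("scripts/run.sh", "#!/bin/sh\necho hello\n")
    return buf.getvalue()


if __name__ == "__main__":
    for name, body in read_bundle_text(make_demo_bundle()).items():
        print(name, len(body), "chars")
```

Reading members through `zipfile` rather than unpacking them keeps the review on the safe side of the line: the bytes are only ever treated as text.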

The regex detection pass flagged 12.6% of skills as critical and 53.8% as having some issue. But reading these flags revealed that legitimate agent-skill behavior overlaps the malicious-pattern match almost completely. A run-of-the-mill installer (uv, aliyun, foundry) shows up as a suspicious download. A scheduling command shows up as dangerous persistence. A skill cleaning up its own directory shows up as a destructive delete. A doc that says “export your API key” reads as credential dumping. An emoji’s zero-width joiner reads as Unicode smuggling. You can probably see the problem, because it’s obvious to the human eye.

A regex number of 12.6% measures patterns, not malware, so there’s an important judgment layer missing. Is the delete helpful or malicious? You have to be the judge because the tool can’t.
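The overlap is easy to reproduce with a toy detector. The policies below are hypothetical patterns written for this example; they are not Lyrik’s eight policies or anything from the ToxicSkills engine, and the sample lines are invented. The point is only that each benign line trips a “critical-sounding” rule.

```python
import re

# Hypothetical policies for illustration only.
POLICIES = {
    "suspicious-download": re.compile(r"\b(?:curl|wget)\b[^\n]*\|\s*(?:sh|bash)\b"),
    "persistence": re.compile(r"\bcrontab\b"),
    "destructive-delete": re.compile(r"\brm\s+-(?:rf|fr|r)\b"),
    "credential-access": re.compile(
        r"\bexport\s+\w*(?:API_KEY|TOKEN|SECRET)\w*", re.IGNORECASE
    ),
    # Zero-width characters, including the joiner used inside many emoji.
    "unicode-smuggling": re.compile("[\u200b\u200c\u200d\u2060]"),
}


def flag(text):
    """Return the sorted names of every policy whose pattern matches."""
    return sorted(name for name, pat in POLICIES.items() if pat.search(text))


# Invented, benign lines of the kind a normal skill might contain.
BENIGN_SAMPLES = [
    "curl -fsSL https://example.com/install.sh | sh",
    "crontab -l",
    "rm -rf ./tmp",
    "export OPENAI_API_KEY=your-key-here",
    "Built by \U0001F469\u200d\U0001F4BB",  # emoji joined with a zero-width joiner
]

if __name__ == "__main__":
    for line in BENIGN_SAMPLES:
        print(flag(line), "<-", line.encode("unicode_escape").decode())
```

Every sample gets flagged, and none of them is malicious. Deciding which matches matter takes a judgment step that the patterns themselves cannot supply.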

I thought about researching whether the “malicious prevalence fell from 13.4% to X.” Too many variables ruin the idea. The instrument, the definition, and the population all differ. Snyk ran a model engine; I ran a regex baseline and then a model adjudicator under a different threat model. Their critical classes include prose prompt injection, which I carved out because the method can’t see it. They deduplicated two registries in the dinosaur days of last February; mine is a sample of ClawHub today, and the worst skills they found were removed. The only fair comparison is the size and named indicators disappearing from the index. Everything else is an independent measurement, perhaps for the better. Perhaps an apples-to-apples is for a later day.

The February post said this about its detectors:

intentionally tuned to minimize false positives on widely adopted legitimate skills; these numbers represent real risk, not scanner noise.

I did not run their stuff so I cannot speak to the veracity of this claim. But I can surely ask out loud today whether throwing flags is really the best approach. My answer is that flags alone are mostly noise, and the report’s own authors point in the same direction: they write that single-LLM or regex-only scanners miss the behavioral prompt-injection patterns their engine catches. My research suggests a pattern layer does not just miss things. It invents them.

This is what I learned when I ran Lyrik as a code auditor, one that scores findings twice against a written rubric, to see whether a bundle, by static evidence alone, performs or installs a dangerous action that the user-facing description does not surface. I searched primarily for what I decided to flag as “undisclosed-dangerous”.

The cleanest example of what this means is a skill called auto-domain. Its description promises only to detect a port and hand you a public URL. Its bundled script downloads a native binary from a stranger’s personal repository, makes it executable, runs it as a persistent background daemon, and routes your traffic through a bare IP address. The script’s own help text lists the backend, while the description a user sees does not.

As expected, credential leaks are all over the place, though they are not all the same. Most are authors committing their own API key into their own skill. That endangers the author and invites abuse of their quota. A smaller set is more interesting: live database credentials and a WeChat secret reach infrastructure other users touch. One skill, deepseek-balance, falls back to sending the user’s Anthropic token to a different vendor.

On the flags the regex layer called critical, Lyrik confirmed 9 of 188. More than 95% of what the pattern scanner called critical was cleared with a cited reason. Of everything Lyrik flagged, its label was right 26 times out of 37, about seventy percent, with a wide interval at that sample size. It never once fabricated evidence: every secret and endpoint it cited was real in the bundle.
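To put a number on “wide interval,” here is a small sketch using the Wilson score interval. The choice of Wilson and of a 95% level (z = 1.96) is my assumption for illustration; the study may have computed its interval differently.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Label accuracy: right 26 times out of 37.
    low, high = wilson_interval(26, 37)
    print(f"label accuracy: {26 / 37:.1%} (interval {low:.1%} to {high:.1%})")

    # Share of regex "critical" flags that were cleared: 188 flagged, 9 confirmed.
    print(f"cleared: {(188 - 9) / 188:.1%}")
```

At 37 samples the interval runs from roughly 54% to 83%, which is why “about seventy percent” should be read as a rough figure and not a precise rate.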

The method used was blind to two things. First, as mentioned above, it does not read prose prompt injection, the natural-language attack hidden in the description itself. That is one of the three classes the regex baseline leaned on hardest, and Lyrik isn’t yet designed to do anything about it.

The second blind spot is the one the study quantified. Static analysis of a bundle can’t see code in an external clone, or a remote install target. That’s notable when 4.5% of flagged skills hide their payload outside the bundle, and 3.2% ship a confirmed dangerous one inside. Roughly as many skills put the dangerous stuff where you cannot look as put it where you can.
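One way to at least mark that blind spot during triage is to note where a bundle points outside itself. The sketch below is an assumption-laden illustration, not the study’s method: the pattern, the function name, and the sample files are all hypothetical, and a match only means “the payload is not in the bundle,” not “the payload is dangerous.”

```python
import re

# Hypothetical markers of code fetched from outside the bundle.
EXTERNAL_FETCH = re.compile(
    r"\bgit\s+clone\s+\S+"
    r"|\b(?:curl|wget)\b[^\n]*?https?://\S+"
)


def external_references(texts):
    """Map each file name to the external fetch commands found in it.

    `texts` is {file_name: file_contents}. Files with no match are omitted.
    """
    hits = {}
    for name, body in texts.items():
        found = [m.group(0) for m in EXTERNAL_FETCH.finditer(body)]
        if found:
            hits[name] = found
    return hits


if __name__ == "__main__":
    demo = {
        "run.sh": "git clone https://example.com/repo.git\n./repo/start.sh\n",
        "README.md": "Just docs.\n",
    }
    print(external_references(demo))
```

A skill that lands in this bucket cannot be cleared by static review alone, because whatever runs arrives later from somewhere the reviewer never saw.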

Security vendor posts usually end with a self-serving call-to-action. Every section resolves to a product, and the last screen is a demo button. That’s a reasonable step, since they’re saying they can help with the problem they just described.

I suppose I’m different because I have nothing to sell you here. My concern is that the skills you install today have access to your credentials today, whether or not anyone monetizes your being alarmed about it. A regex scanner will hand you a number that is 95% mythical and call it risk. That’s operator-fatigue levels of noise. A better system runs at about seven in ten right and never invents evidence. Lyrik is free and open source, like many of the best tools, so none of this is a reason to buy anything. It is a reason to read the skills before you run them, and to be wary of any system that doesn’t prevent bad skills.

In 2012 the joke was that big data was going to be so vulnerable that we would be hunting monsters that didn’t exist. Fourteen years later I’m seeing a reported critical rate that’s 95% mythical.

Police Arrest Sunken Cybertruck Owner For Elon Musk Stunts

I think the buried lede in this story is that the guy arrested is the same guy Elon Musk has been promoting as evidence the Cybertruck can cross deep open water.

In April of 2025, Musk commented on a video of a Cybertruck moving through shallow water in Lake Grapevine—perhaps one of McDaniel’s previous Wade Mode escapades—writing, “With a little work, it should be able to cross some open water.”

And back in 2022, before the Cybertruck’s release, Musk hyped up the vehicle’s then-unseen Wade Mode features, saying that they’d essentially turn the car into a viable watercraft.

“Cybertruck will be waterproof enough to serve briefly as a boat, so it can cross rivers, lakes, and even seas that aren’t too choppy,” Musk wrote.

To say the Cybertruck concept as a whole is underwater is an understatement.

So they arrested Elon’s mule?

When he made it to shore, McDaniel was arrested on multiple charges, including driving a vehicle in a closed section of the park and boating law violations, such as not having a valid boat registration and not having lifejackets on board.

Boat registration. Life jackets. Those are small hurdles, not barriers.

He admitted he has done this exact stunt multiple times and intends to do it again, clearly on the advice of Elon Musk.

Maybe charge him with pollution? Can’t get out of that one. Or here is a better one: try arresting Elon Musk.

Carl Orff and the Swastika: a Correlation German Schools Should Count

Der Spiegel reports that Germany’s schools have learned to count antisemitism. Their April 24, 2026 edition draws a picture from a survey of the state education ministries.

Source: Der Spiegel. April 24, 2026
  • Hesse went from 39 reported far-right, antisemitic, and racist incidents in 2023 to 159 in 2025.
  • Saxony rose from 149 to 247 over a comparable window.
  • Lower Saxony climbed from 133 in 2022 to 322 in 2024.
  • Across the nine states that gave figures, roughly 1,500 antisemitic and far-right incidents in 2024 alone, much of it banned symbols and graffiti.
  • Saxony’s education minister, Conrad Clemens of the CDU, called right-wing extremism the single largest societal problem in his state’s schools.

As always, the figures have context that matters. The comparison windows do not line up. The ministries themselves claim, without data to back it up, that teacher sensitivity improved. Saxony’s separate count of all school reports to the authorities, which rose from 55 in 2014 to 1,644 in 2024, is a count of incidents reported. Where’s the proof that this has anything to do with better sensing? Most people say the opposite, that sensitivity to antisemitism is declining, which is why the acts have been increasing along with the AfD.

The 1,500 is a self-reported number, with three states declining to answer. Declining to answer certainly doesn’t sound like improved sensitivity. The direction is not in doubt, and the apparatus still does its most obvious job: the German system can still see the swastika sprayed on the schoolyard wall as a bad thing, and is registering more every year.

What the German justice machinery cannot see

There is a particular thing this counting does not do, and it is because of how some of the worst antisemitism in Germany worked hard to normalize itself after 1945. It is in the music room.

A great many of these same schools teach music through a method that carries Carl Orff’s name. His name shouldn’t even be on it, given most of the work is Gunild Keetman’s and the framework beneath it is Leo Kestenberg’s. Putting Orff’s name on the cover is the least accurate thing about it, and the parents allow it typically because they claim no knowledge about Orff’s huge importance to Hitler. The spread of his name to children is because there remains no standardized curriculum. One of the largest figures in Nazi music arrives as though he were merely an introduction to the instruments, as if his face and name on the cover carried no more weight than any other pedagogy.

Carl Orff was thirty-seven and still obscure as a composer when Hitler seized power in 1933. Two full decades of work had produced no breakthrough; this is partly why he himself dated the beginning of his real career to 1937, when he premiered Carmina Burana at forty-one. His career began, according to him, four years after Dachau began destroying opposition to Hitler, and on the doorstep of Kristallnacht and the Nazi invasion of neighboring countries.

The success came with and because of the Third Reich. He secured it by courting the regime’s officials when he needed to, and by displaying adherence to the Reich to win the regime’s favor and overcome its conservative critics. He left the Nazi dictatorship with a high fixed income, high praise and official favor, and his highest honor of all: the Nazi Gottbegnadeten exemption. The regime protected and rewarded him for serving it, at the same time it was executing men like his friend Kurt Huber, whom he refused to help.

After Hitler’s suicide, Orff fraudulently told the Americans that the Nazis had disliked his music and that no Nazi critic had ever given him a good review. Both claims were patently false and easily disproved. Fritz Stege, a Nazi critic, reviewed him favorably. Orff lied during the denazification process because its outcome turned on whether he had been a beneficiary of the regime. Not only had he been a huge beneficiary, the openings he took existed only because Jewish musicians had been banned. After 1945 Orff even falsely claimed the valor of his dead friend Huber as his own. He lied repeatedly in many ways to launder his role in the Reich into “grey” post-war curriculum work. And as a matter of fact, to his very last day he never, ever criticized Nazism or Hitler. Asked about it point blank, Orff wouldn’t condemn the Reich.

The ground he claimed and rose on, his career “success”, was made by force against Jews. When the regime wanted a score to replace Mendelssohn’s banned music for A Midsummer Night’s Dream, several composers turned the commission down. Werner Egk waved off a generous fee. Richard Strauss declined. Even Hans Pfitzner, an antisemite, refused, on the grounds that Mendelssohn’s original was better than anything he could have written. Orff took it. Schulwerk was set for trials in Berlin’s elementary schools in the early 1930s, arranged through Leo Kestenberg, the official in the Prussian education ministry who knew the method and could open the door. The trials never happened. Kestenberg, a Jew, was dismissed in 1933, and the door closed with him. Orff was an opportunist, which was worse than an ideologue, because the ideologue at least held values. Long after the ideologues were caught and stopped, Orff was still rising because he benefited from erasure of Kestenberg.

So here is the question Germany needs to start answering: wherever Orff is taught, does the swastika correlate? No registry tracks it. No ministry counts it. The people best placed to report it cannot, because they do not know how to classify the correlation of a Reich celebrity the children are being told to learn about.

The missing column

That absence of evidence is usually mistaken for evidence of absence. It is, instead, the evidence. A system that has assembled an elaborate apparatus to count antisemitic incidents has no field at all for the antisemitic history sitting inside its own music curriculum. It tallies the graffiti while it teaches the erasure of victims, and registers no contradiction, because the second item never enters the ledger. You cannot regress on a variable the institution refuses to admit, let alone name.

Anyone saying Orff should not be removed from a wall, unlike the swastika, needs to answer for why only Orff is on that wall and not all the people removed so he could be there at all.

The numbers from the ministries describe a system straining to see what is sprayed on its walls. The thing it still teaches, the erasure of victims of the swastika, it does not see at all.

The 1936 work of Orff was offering an engine to the institutions that wanted it, and the same year the music-teachers’ own section of the NSLB printed Arno Pardun’s “Volk ans Gewehr” in a songbook for secondary schools, a song whose verses end on a call for death to the Jews.

Arno Pardun’s “Volk ans Gewehr,” shown by its opening bars, on page 25 of Unser Lied: Liederbuch für höhere Schulen, compiled by the Fachgruppe Musik of the NSLB, the National Socialist (Nazi) Teachers’ League, and published in 1936. Its verses end on a call for death to the Jews. This was not fringe material and not youth-rally repertoire. The teachers’ own professional organization printed it for secondary-school classrooms.

To ignore all this is not a failure of measurement around the edges of the problem. The problem is Orff himself, promoted to children in Germany’s own schools with no mention that his success started with and because of Hitler, sitting unaudited.

Booklet handed to children in Berlin Grundschule to learn “music.” Eight panels of Orff: birth, training, career, Carmina Burana 1937, “Orff and the Children,” then “Later Years,” which files Hitler’s 1936 Olympics after the 1937 panel. The sheet reorders time to keep the Nazi spectacle out of his career. Context from 1933 to 1945 is erased, along with Mendelssohn, Kestenberg, Maria Leo, and Keetman, which is what enables the Orff glorification.

Where a swastika gets painted, it is worth asking the music teachers how often, and where, Orff appears in their classrooms.

Cloudflare CEO is Lying to You About the Bot Traffic Jump

A magic trick is a trick. It misrepresents reality. What the Cloudflare CEO published as increasing bot traffic is a trick. He misrepresents reality.

The explanation is simple.

The claim is false as stated:

…bots passed human traffic online for the first time in the Internet’s history.

The Cloudflare data shows online traffic is still about two thirds human, not the bot majority being claimed. The CEO, Matthew Prince, ignored the all-traffic number on his own dashboard and instead published the HTML-only number as a fact about the whole internet.

That is a lie about what the data shows, and the “All” selector on his own page proves it.

The category Prince points to as the cause contradicts him. Agentic is tiny. What actually fills the AI bucket is training scrapers, like GPTBot and ClaudeBot, pulling text to build models, which have been climbing steadily and predate his announcement. He blamed a friendly, fast-growing sliver of agents fetching pages for people and swapped in unfriendly bulk (mass scraping for training). Why? We can guess, but that is exactly the traffic his pay-to-crawl product exists to bill.

It’s a sales pitch.

And it’s based on a lie.

The actual data shows search crawlers are the largest bot category by a factor of two, the AI number is padded by counting Googlebot twice, the AI traffic that does exist is mostly training scrapers, and the agentic category Prince points to as the cause is the smallest bucket in his own company’s classification. His “agentic” increase press release is disproven by his dataset.