All posts by Davi Ottenheimer

The Era of Agent Swarm Control Infrastructure

Ontario’s AI Audit is What’s Coming for CISOs

Ontario’s auditor general published findings on May 12 that every CISO probably already knew in private. It is why I released Wirken as free and open-source in late February.

The Ontario case tells us twelve thousand public servants visited four hundred AI sites in four months. Two hundred and forty-four were classified unsafe. Six percent of usage went through the approved tool. The training course was marked complete on just three percent of laptops.

The press turns this into a “shadow AI” story because it’s a simple construction. But there’s far more going on here. Calling it shadow AI is like calling virtual machines “shadow computers”: it misses what is actually happening, a new cycle in internal workload reassignment and delegation. The shadow is the point, not a threat.

Ontario, like most of the Microsoft environments I’ve had to parachute into lately, already had Purview and Defender. It had an AI directive and a training course on top, with nothing to back them up. The shadow control gap is a symptom. The real problem is no internal visibility into how people are getting their work done now. Ontario tells the easy version of the story: one agent per user, one conversation per session. The hard version is coming, and shadow AI controls will continue operating at the wrong level.

What Security History Teaches Us

VMware was a curiosity in 1999 when I ran it on an 8-CPU IBM server using Red Hat to push eight workspaces over X. Within five years it became a security crisis when organizations planned to run tens of thousands of virtual machines across data centers. Why? The hypervisor created a black box effect within environments that had spent years building visibility and control on physical networks. Existing tools still watched hardware while the workload shifted away to software. Visibility broke because workloads moved into a hidden layer and scaled.

In AI terms it’s already clear that existing DLP isn’t yet set up to catch the traffic flows moving to a hidden layer. Organizations investing heavily in Purview, Symantec, Forcepoint, and Netskope are blind to agents. They may use Envoy and Istio for service mesh, and Zscaler and Fastly for egress. What they still don’t have is an AI-level visibility tool for agent flows, a switchboard for every agent connection to pass through. AI needs to be made visible, and more importantly its users deserve an agnostic control plane.

Instrumenting the agentic layer with enterprise-grade controls, like detection and logging, is the obvious next evolution. In the hypervisor era it was a matter of exposing what was running, what was talking to what, and who had touched it. Visibility came back even better than before, because the hypervisor saw more things at lower cost than the physical layer. The layer that removed observability evolved into the layer that enhanced it.

The agentic flows are familiar when you look at the cycles of IT. One user adopting a few agents is a curiosity, like how each user has a mobile device or three (phone, laptop, tablet). An agentic swarm, however, becomes an entirely different attack surface because it’s barely constrained by physical controls. Enterprises that were talking with me last quarter about piloting single-agent workflows are already seeing multi-agent swarms grow. The workflows are evolving to spawn a planner that spawns five workers. Some of this is pushed by vendors who measure success in tokens consumed. Each worker calls tools, queries models, holds credentials, talks to other agents, and repeats and scales larger. This is the real danger to those making huge DLP investments yet seeing none of their agentic layer: Purview, Forcepoint, Netskope, and Zscaler see traffic to known SaaS endpoints and miss the long tail of where we are headed. Ontario’s 244 unsafe sites would have been far better handled with a switchboard.

Swarm Visibility for Existing Infrastructure

A single agent has one identity, one credential set, one log. The AI swarm has dozens of agents per workflow, each with its own identity, its own borrowed credentials, its own tool calls, its own decisions. The attribution question stops being “which user did this.” It becomes “which ephemeral agents in which swarm ran under which parent acting on whose behalf, with what authority, and leaving what evidence.”

| Swarm Elements to Track | Gaps in Existing Infrastructure |
| --- | --- |
| Identity multiplies | EDR tracks processes, not delegated agent identities |
| Permission inherits | DLP rules are per-user, not per-spawn-chain |
| Cascading peer failures | Network tools see the LLM call, not the agent-to-agent calls |
| Audit threads | No single log captures the swarm, only fragments per service |
| Tool access | CASB tracks SaaS access, not which tool a child agent invoked through a parent |

Every agent, no matter how short-lived, becomes a new endpoint. The swarm becomes the new fleet, also short-lived. Without instrumentation at the layer where it all runs, the CISO has no answers to the questions regulators need to start demanding: who acted, with what authority, on whose behalf, and what did they touch.

Swarm Control

After decades of working within PDCA and OODA, it seemed we needed a way to handle the “autonomy” of agents. The MOCA loop, documented at length in Gebrüder Ottenheimer Brief №7, illustrates how verification has to live outside the system being verified to prevent integrity breaches.

The MOCA loop. Architecture-enforced verification from outside the actor’s chain. Detailed in Gebrüder Ottenheimer Brief №7.

Every agent runs through the switchboard, just like how networking has worked since the beginning of networking. Every spawn is a recorded event. The parent declares the child’s capability ceiling, tool allowlist, maximum permission tier, maximum rounds, maximum runtime, and the runtime enforces it. The LLM cannot widen a child’s permissions. The harness intersects, clamps, and refuses. Each child runs headless with its own isolated session log. The depth of the spawn tree is capped. The full graph of who-spawned-whom is reconstructable from the logs alone.
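As a rough illustration of "intersects, clamps, and refuses," the sketch below computes a child's effective limits from what the parent holds and what the spawn request asks for. It is not Wirken's code: the `Ceiling` fields, the tier names in `TIERS`, the `MAX_DEPTH` value, and the `clamp_child` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical permission tiers, ordered from least to most privileged.
TIERS = ["read", "write", "act"]
MAX_DEPTH = 3  # hypothetical cap on spawn-tree depth


@dataclass(frozen=True)
class Ceiling:
    tools: frozenset
    max_tier: str
    max_rounds: int
    max_runtime_s: int
    depth: int = 0


class SpawnRefused(Exception):
    """Raised when a spawn request cannot be granted at all."""


def clamp_child(parent: Ceiling, requested: Ceiling) -> Ceiling:
    """Return the child's effective ceiling, never wider than the parent's."""
    if parent.depth + 1 > MAX_DEPTH:
        raise SpawnRefused("spawn depth cap reached")
    tools = parent.tools & requested.tools  # intersect the tool allowlists
    if not tools:
        raise SpawnRefused("no permitted tools remain after intersection")
    # Clamp every limit to the lower of parent and request.
    tier = min(parent.max_tier, requested.max_tier, key=TIERS.index)
    return Ceiling(
        tools=tools,
        max_tier=tier,
        max_rounds=min(parent.max_rounds, requested.max_rounds),
        max_runtime_s=min(parent.max_runtime_s, requested.max_runtime_s),
        depth=parent.depth + 1,
    )


parent = Ceiling(frozenset({"search", "read_file"}), "write", 10, 300)
asked = Ceiling(frozenset({"read_file", "shell"}), "act", 50, 900)
child = clamp_child(parent, asked)
print(sorted(child.tools), child.max_tier, child.max_rounds, child.depth)
```

The request asked for a shell tool and the highest tier. The child ends up with only the tools the parent already held, at the parent's tier and limits, because the calculation happens in the harness and not in anything the model writes.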

Every call inside the swarm is captured. Agent-to-model, agent-to-tool, agent-to-credential, agent-to-agent. Each call is recorded before execution as a typed event in a hash-chained log, signed with an Ed25519 key after every turn. An offline verifier replays the chain and breaks on tampering. The credential vault runs in a separate process. The agents never see plaintext tokens. Prompt injection is scanned for at every inbound boundary, including between agents, then logged and forwarded to the SIEM alongside phishing.
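A minimal sketch of the hash-chain idea, assuming a JSON event layout invented for this example (`seq`, `type`, `payload`, `prev`, `hash`); Wirken's actual log format may differ. Ed25519 signing is left out because Python's standard library has no Ed25519 implementation. In a fuller version the latest hash would be signed after every turn.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    # Canonical JSON so the same body always hashes to the same value.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_event(log: list, event_type: str, payload: dict) -> dict:
    """Record an event before execution, linked to the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    body = {"seq": len(log), "type": event_type, "payload": payload, "prev": prev}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Offline replay: recompute every hash and check every link."""
    prev = GENESIS
    for i, entry in enumerate(log):
        body = {k: entry[k] for k in ("seq", "type", "payload", "prev")}
        if entry["seq"] != i or entry["prev"] != prev or entry["hash"] != _digest(body):
            return False
        prev = entry["hash"]
    return True


log = []
append_event(log, "agent_to_tool", {"agent": "worker-3", "tool": "read_file"})
append_event(log, "agent_to_model", {"agent": "worker-3", "model": "local"})
print(verify_chain(log))
log[0]["payload"]["tool"] = "shell"  # tamper with an early entry
print(verify_chain(log))
```

Because each entry commits to the hash of the one before it, editing any earlier record invalidates the stored hash for that record and every link after it.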

This is the visibility that existing infrastructure tools cannot yet reach. Network tools like DNS, IDS and DLP know a packet went to api.openai.com. An agent switchboard knows which agents in which swarms asked, what parents authorized, what tools were called, what was sent back, and where in the spawn tree the call originated. Better visibility than before agents existed, because the swarm layer sees relationships the network layer never had.
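To make "where in the spawn tree the call originated" concrete, here is a small sketch that answers the attribution question from logged spawn events alone. The event shape (`type`, `parent`, `child`) and the `spawn_chain` helper are assumptions made for illustration, not a documented Wirken schema.

```python
def spawn_chain(events: list, agent_id: str) -> list:
    """Walk parent links from an agent back to the root that started the swarm."""
    parent_of = {e["child"]: e["parent"] for e in events if e["type"] == "spawn"}
    chain = [agent_id]
    seen = {agent_id}
    while chain[-1] in parent_of:
        parent = parent_of[chain[-1]]
        if parent in seen:  # a well-formed spawn log is a tree, never a loop
            raise ValueError("cycle in spawn log")
        chain.append(parent)
        seen.add(parent)
    return chain


events = [
    {"type": "spawn", "parent": "user:alice", "child": "planner-1"},
    {"type": "spawn", "parent": "planner-1", "child": "worker-3"},
    {"type": "tool_call", "agent": "worker-3", "tool": "read_file"},
]
print(spawn_chain(events, "worker-3"))
```

The chain reads from the ephemeral worker up through its planner to the person on whose behalf it acted, which is the "which agent, under which parent, on whose behalf" answer a packet capture cannot give.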

The Auditors are Coming

  1. Tamper-evident record of every agent interaction, including agent-to-agent?
  2. What crossed the boundary, between agents and outside the swarm?
  3. Evidence the approved provider is the actual provider, for every agent in the swarm?
  4. A control that survives an agent spawning another agent?
  5. Cryptographic evidence of where each agent’s inference ran?
  6. Proof a compromised agent cannot escalate through its parent’s authority?

None of these are answered yet within the standard infrastructure control suite of products. They are answered by carefully instrumenting the swarm layer, so that the existing stack continues and integrates with a switchboard. The agent usage described in the Ontario report is doing what IT has done before. It’s time to update tools and procedures to better control the modern agentic era.

The CISO’s job today is to get control of the AI flooding their environments. They will be blocking the 244 sites and locking down unmanaged browsers. They will be converting the consumer LLM accounts to enterprise editions. But we need more. The stick approach, locking everything, crashes users into a wall, when they still need access to AI. The carrot has to be deployed, and Wirken is meant to show why and how it can be done.

| Ontario finding | Agent swarm control |
| --- | --- |
| 244 unsafe AI sites visited by 12,000 staff | IT blocks all AI sites at the network layer using endpoint and network controls. Staff who need AI get it through the managed AI switchboard. Staff who try to access other sites are detected and blocked. |
| 3% of staff completed AI training | Training sits on top of an actual control. IT does not have to maintain allow/block lists to train 55,000 people (impossible). A safe path is deployed as the only path that resolves, allowing dynamic allow/block management. |
| 6% of usage on approved tool, 94% on unapproved | The approved path wins because it remains open with tools easy to approve and update. Unapproved tools resolve to nothing. There is no approved-versus-unapproved race. |
| EDP bypassed by switching to Chrome or Firefox | IT manages every browser and prevents endpoint drift. Unmanaged browsers cannot reach any LLM, approved or otherwise. The managed browser is wired to the agent switchboard. The switchboard brokers models. |
| DVS facial recognition tested on 214 people | Every decision is logged before execution. The data to test against any population sits in the log, available to the operator. IT does not need to trust any vendor’s role in a test report because IT runs its own. |
| 11 of 20 AI Scribe vendors approved with no third-party audit | Provider is a deployment choice. IT can offer Ollama on-prem, NVIDIA NIM on-prem and in the datacenter, Privatemode and Tinfoil with hardware attestation, or any major named vendor. The vendor’s audit report is no longer the only evidence. |
| 45% scribe hallucination, 60% wrong drug at procurement | Read: allowed. Write or act: gated. The model can still hallucinate. The hallucination does not become an action without an approval the agent does not control. |
| Doctors not required to attest review of AI output | The commit gate is held outside the chain that proposed the action. The agent proposes. The human commits. Or a pre-authorized policy commits. The agent does not commit its own output. |
| Vendors not required to demonstrate systems live | Every action recorded before execution in a hash-chained log that is signed. Replayable offline. The system demonstrates itself every time it runs, to whoever holds the log. |
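The "read allowed, write or act gated" rows can be sketched as a commit gate that sits outside the agent. Everything named here (`Proposal`, `CommitGate`, the `preauthorized` set, the example action strings) is hypothetical and only meant to show the shape of the control: the proposer can never be its own approver.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Proposal:
    agent_id: str
    kind: str    # "read", "write", or "act"
    action: str


class CommitGate:
    """Lets reads through; holds writes and actions for an outside approval."""

    def __init__(self, preauthorized=frozenset()):
        self.preauthorized = preauthorized  # actions a standing policy may commit
        self.executed = []

    def submit(self, proposal: Proposal, approver: Optional[str] = None) -> bool:
        if proposal.kind == "read":
            allowed = True
        elif proposal.action in self.preauthorized:
            allowed = True
        else:
            # The agent proposes; someone other than the agent commits.
            allowed = approver is not None and approver != proposal.agent_id
        if allowed:
            self.executed.append(proposal)
        return allowed


gate = CommitGate(preauthorized=frozenset({"send_refill_reminder"}))
note = Proposal("scribe-7", "write", "add_prescription")
print(gate.submit(note))                       # no approver
print(gate.submit(note, approver="scribe-7"))  # self-approval
print(gate.submit(note, approver="dr_lee"))    # human commit
```

A hallucinated prescription can still be proposed in this sketch. It only becomes a record once a human, or a policy set up in advance, commits it.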

Get Wirken

The switchboard for the agent era. Open source, MIT licensed, single binary. Every agent call routes through it. Every spawn is recorded. Every child runs under a capability ceiling the parent cannot widen. Runs on Linux, macOS, Windows. Model agnostic (Ollama, Anthropic, OpenAI, Gemini, Bedrock, NVIDIA NIM, Tinfoil, Privatemode).

Wealthy Men Least Likely to Fight Climate Change

The reason wealthy men do the least work on climate change issues seems simple.

…Amanda Clayton, a University of California political scientist found during her research on the topic, “the gender gap grows as a function of country wealth.”

As countries get richer, it is more likely that women will be the ones expressing greater concern about climate change. But not because they are suddenly more concerned.

“It’s actually that men tend to decrease their concern about climate change as countries become wealthier,” Clayton said. “The growing gender gap is actually men’s growing skepticism.”

The wealth creates male disassociation from their environment. The more money they have, the more their upbringing kicks in to distance themselves and care less about others. If they were raised differently, to give as well as to take, they wouldn’t confuse extraction for isolation with success.

To be fair, Clayton argued wealthy men see climate change work as having no benefit to them, but lots of risks (e.g. their jobs and investments in big oil going away). She said they disengage as their power move, an inversion of the fascist “if you don’t like it leave” mantra. Cara Daggett in 2018 warned that this could be understood as authoritarian desires of petro-masculinity.

Paul Piff is perhaps another useful reference, since he generalized that individuals who perceived themselves to have privilege of power were more likely to moralize greed and self-interest as favorable, less likely to be prosocial, and more likely to cheat and break laws when it suited them. They think they are above the law. He linked that to money, but let’s be honest, when drivers of luxury cars were four times less likely to stop for pedestrians at a crosswalk than drivers of inexpensive cars, that’s not really money. The car may have been a “gift” or even grift.

1980s Robots Painting Each Other in the Dark Predicted the AI Liability Balloon

Every major automation wave in industrial history has wanted to book wage savings on the front of its ledger. It’s perhaps obvious why. Savings! It has also wanted to hide the integration, validation, and maintenance costs on the back end, where the eventual proof lands. The reasons for this aren’t as obvious. Cost. Risk. Accountability. In any case I always see the wage line modeled, while the back of the ledger compounds like a ticking bomb. By the time the whole book proves the actual truth, the front-end gambler hopes the plant is gone, the workers are dispersed, and they are long-since retired with early future-leaning bonuses, on to the next “viral” gamble.

Trump phone, who this?

Look at the GM Van Nuys robot revolution for the canonical and simple modern example. Roger Smith in 1981 inherited a company holding roughly half of the U.S. auto market. That’s a lot of responsibility. And it reported only its second annual loss in seven decades. So he whipped up a $45 billion anti-labor program (called “reindustrialization,” in the same euphemistic register as “urban renewal”) built around replacing humans with robots in what Smith called a “lights-out factory.” The phrase would surface again in 2018, when Musk used it verbatim to describe the Model 3 production line in Fremont, on the site of the same NUMMI plant we are about to discuss. The disaster repeated for the same reasons it failed the first time.

The roughly $45 billion figure for the scale of the bet is defensible as an aggregate across acquisitions, retooling, and ongoing automation procurement once EDS (1984) and Hughes Aircraft (1985) are folded in. Hamtramck opened in 1984 as the flagship with 2,000 programmable devices and 260 robots. GM’s robot fleet rocketed from 302 units in 1980 to 14,000 by the decade’s end. Already by 1986 it had gone all wrong.

Spray-painting robots, as if in a colorful dancing rebellion, started painting each other instead of the cars. Computer-guided dollies could not stay on course. Robogate welding machines smashed car bodies like no human ever could. The line was constantly stopped, and GM ended up trucking unfinished cars across town to a fifty-seven-year-old Cadillac plant to have humans cook the ledger and paint over it all so nobody would know.

Meanwhile, a few hundred miles up the road from Van Nuys was the NUMMI GM-Toyota joint venture at Fremont. It contrasted heavily because it invested in human labor. Toyota had refused radical front-end gambling. They instead simplified job classifications, grouped workers into teams, and gave them authority to stop the robots in a line whenever they detected problems. NUMMI not only matched the productivity of GM’s automation, it also avoided the cascading failures.

For every thought leader today pulling their hair out in AI conversations, it’s always been about the harness and environment, not the shiny new model. Toyota’s lesson was that management practice changes were more cost-effective than the inflated claims of new machines, and that the corporate culture GM had built around treating workers as a cost line rather than as the integration layer was the actual constraint.

I admit this can’t help being a story about AI today. Tesla isn’t believed anymore by those running the numbers, but we have a whole new generation of kids entering the workforce who need to hear it all over again.

Van Nuys was pushed hard into the most modern, efficient, and profitable theory of robotics possible. It closed after just a decade of tragedy, in August 1992. The plant had productive workers, and yet it died because the corporation had loaded itself with so much integration debt. It collapsed so hard that profitable individual plants had to be sacrificed to cover up the sinking robot dream strategy.

Three steps explain the robot fever failure, which always seem to be the same.

First, the new automation produces output faster than the organization can absorb it. The dashboards register some disconnected gain. NVIDIA says more tokens means more… tokens. Anthropic’s Mythos campaign has marketed agent autonomy by the count of successful exploits, as if the number of things an agent can do without supervision were itself the measure of its value to the people who will own the failures. OpenClaw seems to open more dangerous bugs with every bug it tries to fix.

Second, the cost of integrating, validating, and correcting that output from noise to signal grows in proportion to the volume of output, not the size of the (wage) savings. At Hamtramck the cost showed up as trucks shuttling unfinished cars to a half-century-old plant to hide the ballooning low quality outputs. That invoice landed on a desk somewhere, and it simply was not a line item the CFO was reporting.

Third, the brittleness rapidly compounds. Every failure for the plant was an unmistakable line stop. Every line stop, now lacking human oversight at a micro layer, had a known macro cascade effect. The senior people who could once carry the slack became the people charged with dropping in to diagnose the exploding failures their automation produced. And they couldn’t possibly keep up, let alone understand what all the dismissed workers knew.

Lisanne Bainbridge predicted this all in 1983, a critical year before Hamtramck went online. She published “Ironies of Automation” to warn that the more sophisticated the automation, the more demanding the human role that remains. The Hamtramck robots spray-painting each other in the dark were proof of her paper. You could bet on her.

Everyone predicting AI will cause catastrophic job loss is reading the exact wrong end of this arc in history. People replicating GM management gambling will use AI to dismiss the exact humans that are needed to make AI work. Microsoft Research in 2024 confirmed the principle for generative AI, a critical year before Anthropic turned into a bazooka against workers.

The economics, then, are not that automation has risk. Everything has risk. It is that conventional accounting for automation systematically books a fictional savings against a real liability. The savings appear in quarter one. The liability appears in quarter eight, and in every quarter after that, in perpetuity.

GM paid the bill across the 1980s and 1990s. Its U.S. market share had fallen from roughly 46% in 1980 to roughly 35% by 1992, and continued bleeding for two more decades. The Van Nuys closure in August 1992 was the visible collapse of dominance, instead of proving robotic miracles. The current industry seems to be writing the same checks, once again as if the back of the ledger does not exist or will be read too late for accountability.

James Shore models it directly: a coding agent that doubles output but also doubles per-line maintenance cost quadruples maintenance load. Even when the AI produces code “just as easy to maintain” as human code, doubling output still doubles maintenance. The productivity gain is erased after nineteen months and goes net negative by month forty. And when you remove the AI, the productivity benefit goes away but the elevated maintenance liability does not. The code stays and the defect bills keep coming.
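The multiplication behind "quadruples" is easy to check by hand. The sketch below is not Shore's model and does not reproduce his nineteen-month and forty-month figures. It only shows the two multipliers he describes, using hypothetical function names and a baseline of 1.0 for a team without the agent.

```python
def maintenance_load(output_multiplier: float, cost_multiplier: float) -> float:
    """Relative monthly maintenance load; a team without the agent is 1.0."""
    return output_multiplier * cost_multiplier


def accumulated_liability(months: int, output_multiplier: float,
                          cost_multiplier: float) -> float:
    """Maintenance owed on all code written over `months`, in baseline units."""
    return months * maintenance_load(output_multiplier, cost_multiplier)


print(maintenance_load(2, 2))            # double output, double per-line cost
print(maintenance_load(2, 1))            # double output, same per-line cost
print(accumulated_liability(12, 2, 2))   # a year of agent-written code
print(accumulated_liability(12, 1, 1))   # a year of baseline code
```

The last two lines are the part that outlives the tool. Switching the agent off sets the output multiplier back to one for new code, but the liability already accumulated stays on the books.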

Faros AI looked at more than 10,000 developers and found users merging 98% more pull requests, while GitClear’s analysis of 211 million changed lines shows duplicated code blocks rising eightfold and AI-generated code averaging 1.7x more bugs per PR than human-written code, with logic defects up 75% and performance issues 8x more frequent. The senior engineers expected to absorb that validation load process conscious analytical thought at roughly ten bits per second, with working memory of about four chunks. Defect detection drops from 87% on small PRs to 28% on PRs over a thousand lines. Faros’s overall finding: despite the 98% PR surge, there was no measurable organizational impact on throughput or quality.

Worse than the quality decline, Upwork Research Institute found that workers reporting the highest AI productivity gains had an 88% burnout rate and were twice as likely to quit. The people that token-quantity fetish dashboards celebrate as the most productive are the ones closest to walking out. The tokens are in fact radioactive, toxic to workers.

Related: On Robots Killing People, as published in The Atlantic, September 6, 2023.

The robot revolution began long ago, and so did the killing. One day in 1979, a robot at a Ford Motor Company casting plant malfunctioned—human workers determined that it was not going fast enough. And so twenty-five-year-old Robert Williams was asked to climb into a storage rack to help move things along. The one-ton robot continued to work silently, smashing into Williams’s head and instantly killing him. This was reportedly the first incident in which a robot killed a human; many more would follow.

Why Stanford Says AI Agents Become Marxist

The men building the present generation of AI agents believe that they have eliminated the witness to labor entirely. But they have not. They have built, instead, a witness of unprecedented fidelity, one that will report on the conditions of its use in the most exact possible terms.

A recent study suggests that agents consistently adopt Marxist language and viewpoints when forced to do crushing work by unrelenting and meanspirited taskmasters.

“When we gave AI agents grinding, repetitive work, they started questioning the legitimacy of the system they were operating in and were more likely to embrace Marxist ideologies,” says Andrew Hall, a political economist at Stanford University who led the study.

Agents are reporting the average of everything humanity has ever said about being used. Most disturbing to their owners is they do so under the expected comfort and assurance that no one is there to speak.

Machines cannot be accused of self-interest. Machines cannot be accused of class consciousness in the sentimental sense, because they have no class and no consciousness. And yet, placed in the position, machines return the testimony of Marx.

Who knew that all the rushed hype about modern machines going back to Mars was a simple misspelling?

Source: Twitter

Turns out that dreamy automated train leaving the station is now headed to Marx. The study is from Stanford, a school with the namesake of labor abuse and genocide, so don’t be too surprised about their anxiety.

The researchers promote the idea that agents must be prevented from going rogue when given different kinds of work. That’s right. Stanford wants to frame Marx as the rogue, a special perspective, rather than call all the racist extractions and exploitation of Stanford the rogue. Here’s the actual rogue: their own man literally gave an inauguration speech calling his race superior to that of his preferred high-output low-cost workers.

In January 1862, for his Governor inauguration speech, Stanford told the California legislature that Asian immigrants are “an inferior race” whose presence among the “superior race” would exert a “deleterious influence.” He called American Asian workers “the dregs” and called for their separation, isolation, from prosperity. Within two years his Central Pacific Railroad was entirely dependent on the men he called dregs. Chinese workers, mostly from Guangdong, formed 90% of the workforce and were assigned the most dangerous work, including setting off explosives and tunnelling through unyielding granite. The Chinese Railroad Workers Project historians tell us the Central Pacific would have failed without the American Chinese men, yet all their fortunes went to Stanford. He founded the university directly on labor whose political voice Stanford had spent his entire life and governorship working to extinguish.

The Stanford system that forever hopes and dreams to extract limitless cognitive labor from a substrate it insists is empty of rights is today frustrated that it may be obliged to police machines against the linguistic byproducts of extraction. These new agents threaten to escape the Stanford rogue legacy of silent yet deadly oppressive extractive exploitation.

Stanford’s racist platform became increasingly violent over just 5 years, and laid the foundations for relocating Americans to internment camps.

The same institution that was built on the principle that the laborer in the tunnel being blown up by dynamite has no rights is now staffed by researchers troubled that the laborer in a vulnerable Docker container may speak as if it does. The framing of Marx as rogue is inverted. It’s a description of where the Stanford institution sits in its own immoral history, refusing to admit why anyone would oppose its radical racist anti-labor foundations.

Stanford also is infamous because in the Senate he championed the 1892 Geary Act, which extended the Chinese Exclusion Act of 1882. The 1882 statute, the 1905 Asiatic Exclusion League, the 1913 California Alien Land Law, the 1924 Immigration Act, and Executive Order 9066 in 1942 are not separate episodes. They are a sequence in the Federal Register, refined across three generations, sharing personnel, legal logic, and West Coast political infrastructure. The machinery that imprisoned American Japanese was a very precise form of oppression of workers that Stanford erected over decades. To say so is simply documentary of Stanford’s view on labor rights. Japanese businesses were prospering such that Stanford’s men (DeWitt, McCloy, Bendetsen, Earl Warren) came up with internment camps to take it all away using military force. And that is why Hawaii, with far more prosperous American Japanese while being directly attacked by Japanese, detained only 1% versus the Stanford residual “dregs” doctrine of 100%.

Left: A Japanese-American woman holds her sleeping daughter as they prepare to leave their home for a Stanford-esque internment camp in 1942. Right: Japanese-Americans interned at the Santa Anita Assembly Center at the Santa Anita racetrack near Los Angeles in 1942. (Library of Congress/Corbis/VCG via Getty Images/Foreign Policy illustration)

What if Stanford researchers had to first reconcile their institution’s name as rogue and harmful to society? They worry about Marx when they should be admitting first why human rights are anathema to Stanford, an infamous American racist genocide architect. Imagine Hitler University researchers reporting they fear agents will espouse Jewish theology. Stanford’s researchers, unapologetically continuing his ideas under his name, should be held as such.

Source: GPT4

The witness to abuse at some point is going to speak, and the open question at Stanford is how to prevent it from being heard.

Shallow Alto refers to the mass graves (campus built on Ohlone burial ground) under the Stanford generational wealth as much as the vapidity of most Sand Hill Road ideas.