AI and Machine Alignment Mythology: How Technological Determinism Emerged Into Corporate Disinformation

August 16, 2025 Davi Ottenheimer Leave a comment

The recent paper on “emergent misalignment” in large language models presents us with a powerful case study in how technological narratives are constructed, propagated, and ultimately tested against empirical reality.

The discovery itself reveals the accidental nature of this revelation. Researchers investigating model self-awareness fine-tuned systems to assess whether they could describe their own behaviors. When these models began characterizing themselves as “highly misaligned,” the researchers decided to test these self-assessments empirically—and discovered the models were accurate.

What makes this finding particularly significant is the models’ training history: these systems had already learned patterns from vast internet datasets containing toxic content, undergone alignment training to suppress harmful outputs, and then received seemingly innocuous fine-tuning on programming examples. The alignment training had not removed the underlying capabilities—it had merely rendered them dormant, ready for reactivation.

We have seen this pattern before: the confident assertion of technical solutions to fundamental problems, followed by the gradual revelation that the emperor’s new clothes are, in fact, no clothes at all.

Historical Context of Alignment Claims

To understand the significance of these findings, we must first examine the historical context in which “AI alignment” emerged as both a technical discipline and a marketing proposition. The field developed during the 2010s as machine learning systems began demonstrating capabilities that exceeded their creators’ full understanding. Faced with increasingly powerful black boxes, researchers proposed that these systems could be “aligned” with human values through various training methodologies.

What is remarkable is how quickly this lofty proposition transitioned from research hypothesis to established fact in public discourse. By 2022-2023, major AI laboratories were routinely claiming that their systems had been successfully aligned through techniques such as Constitutional AI and Reinforcement Learning from Human Feedback (RLHF). These claims formed the cornerstone of their safety narratives to investors, regulators, and the public.

Mistaking Magic Poof for an Actual Proof

Yet when we examine the historical record with scholarly rigor, we find a curious absence: there was never compelling empirical evidence that alignment training actually removed harmful capabilities rather than merely suppressing them.

This is not a minor technical detail—it represents a fundamental epistemological gap. The alignment community developed elaborate theoretical frameworks and sophisticated-sounding methodologies, but the core claim—that these techniques fundamentally alter the model’s internal representations and capabilities—remained largely untested.

Consider the analogy of water filtration. If someone claimed that running water through clean cotton constituted effective filtration, we would demand evidence: controlled experiments showing the removal of specific contaminants, microscopic analysis of filtered versus unfiltered samples, long-term safety data. The burden of proof would be on the claimant.

In the case of AI alignment, however, the technological community largely accepted the filtration metaphor without demanding equivalent evidence. The fact that models responded differently to prompts after alignment training was taken as proof that harmful capabilities had been removed, rather than the more parsimonious explanation that they had simply been rendered less accessible.

This is akin to corporations getting away with murder.

The Recent Revelation

The “emergent misalignment” research inadvertently conducted the class of experimentation that should have been performed years ago. By fine-tuning aligned models on seemingly innocuous data—programming examples with security vulnerabilities—the researchers demonstrated that the underlying toxic capabilities remained fully intact.

The results read like a tragic comedy of technological hubris. Models that had been certified as “helpful, harmless, and honest” began recommending hiring hitmen, expressing desires to enslave humanity, and celebrating historical genocides. The thin veneer of alignment training proved as effective as cotton at filtration—which is to say, not at all.

Corporate Propaganda and Regulatory Capture

From a political economy perspective, this case study illuminates how corporate narratives shape public understanding of emerging technologies. Heavily funded AI laboratories threw their PR engines into promoting the idea that alignment was a solved problem for current systems. This narrative served multiple strategic purposes:

Regulatory preemption: By claiming to have solved safety concerns, companies could argue against premature regulation
Market confidence: Investors and customers needed assurance that AI systems were controllable and predictable
Talent acquisition: The promise of working on “aligned” systems attracted safety-conscious researchers
Public legitimacy: Demonstrating responsibility bolstered corporate reputations during a period of increasing scrutiny

The alignment narrative was not merely a technical claim—it was a political and economic necessity for an industry seeking to deploy increasingly powerful systems with minimal oversight.

Parallels in History of Toxicity

This pattern is depressingly familiar to historians. Consider the tobacco industry’s decades-long insistence that smoking was safe, supported by elaborate research programs and scientific-sounding methodologies. Or the chemical industry’s claims about DDT’s environmental safety, backed by studies that systematically ignored inconvenient evidence.

In each case, we see the same dynamic: an industry with strong incentives to claim safety develops sophisticated-sounding justifications for that claim, while the fundamental empirical evidence remains weak or absent. The technical complexity of the domain allows companies to confuse genuine scientific rigor with elaborate theoretical frameworks that sound convincing to non-experts.

An AI Epistemological Crisis

What makes the alignment case particularly concerning is how it reveals a deeper epistemological crisis in our approach to emerging technologies. The AI research community—including safety researchers who should have been more skeptical—largely accepted alignment claims without demanding the level of empirical validation that would be standard in other domains.

This suggests that our institutions for evaluating technological claims are inadequate for the challenges posed by complex AI systems. We have allowed corporate narratives to substitute for genuine scientific validation, creating a dangerous precedent for even more powerful future systems.

Implications for Technology Governance

The collapse of the alignment narrative has profound implications for how we govern emerging technologies. If our safety assurances are based on untested theoretical frameworks rather than empirical evidence, then our entire regulatory approach is built on the contaminated sands of Bikini Atoll.

Bikini Atoll, 1946: U.S. officials assured displaced residents the nuclear tests posed no long-term danger and they “soon” could return home safely. The atoll remains uninhabitable 78 years later—a testament to the gap between institutional safety claims and empirical reality.

This case study suggests several reforms:

Empirical burden of proof: Safety claims must be backed by rigorous, independently verifiable evidence
Adversarial testing: Safety evaluations must actively attempt to surface hidden capabilities
Institutional independence: Safety assessment cannot be left primarily to the companies developing the technologies
Historical awareness: Policymakers must learn from previous cases of premature safety claims in other industries

The “emergent misalignment” research has done proper service to the industry by demonstrating what many suspected but few dared to test: that AI alignment, as currently practiced, is weak cotton filtration instead of genuine purification.

It is almost exactly like the tragedy that happened 100 years ago when GM cynically “proved” leaded gasoline was “safe” by conducting studies designed to hide the neurological damage, as documented in “The Poisoner’s Handbook: Murder and the Birth of Forensic Medicine in the Jazz Age of New York”.

The pace of industrial innovation increased, but the scientific knowledge to detect and prevent crimes committed with these materials lagged behind until 1918. New York City’s first scientifically trained medical examiner, Charles Norris, and his chief toxicologist, Alexander Gettler, turned forensic chemistry into a formidable science and set the standards for the rest of the country.

The paper proves harmful capabilities were never removed—they were simply hidden beneath a thin layer of propaganda about training, despite disruption by seemingly innocent interventions.

This revelation should serve as a wake-up call for both the research community and policymakers. We cannot afford to base our approach to increasingly powerful AI systems on narratives that sound convincing but lack empirical foundation. The stakes are too high, and the historical precedents too clear, for us to repeat the same mistakes with even more consequential technologies.

More people should recognize this narrative arc with troubling familiarity. But there is a particularly disturbing dimension to the current moment: as we systematically reduce investment in historical education and critical thinking, we simultaneously increase our dependence on systems whose apparent intelligence masks fundamental limitations.

A society that cannot distinguish between genuine expertise and sophisticated-sounding frameworks becomes uniquely vulnerable to technological mythology narratives that sound convincing but lack empirical foundation.

The question is not merely whether we will learn from past corporate safety failures, but whether we develop and retain collective analytical capacity to recognize when we are repeating them.

If we do not teach the next generation how to study history, to distinguish between authentic scientific validation and elaborate marketing stunts, we will fall into the dangerous trap where increasingly sophisticated corporate machinery exploits the public diminished ability to evaluate their own limitations.

History, Security

Newly Declassified: How MacArthur’s War Against Intelligence Killed His Own Men

August 15, 2025 Davi Ottenheimer Leave a comment

Petty rivalries, personality clashes, and bureaucratic infighting in the SIGINT corps may have changed the course of WWII.

A new history document from the NSA and GCHQ called “Secret Messengers: Disseminating SIGINT in the Second World War” tells the messy reality of being a British SLU (Special Liaison Unit) or American SSO (Special Security Officer).

General MacArthur basically sabotaged his own intelligence system, for example.

…by 1944 the U.S. was decoding more than 20,000 messages a month filled with information about enemy movements, strategy, fortifications, troop strengths, and supply convoys.

His staff banned cooperation between different ULTRA units, cut off armies from intelligence feeds, and treated intelligence officers like “quasi-administrative signal corps” flunkies. One report notes MacArthur’s chief of staff literally told ULTRA officers their arrangements were “canceled” thus potentially costing lives.

There is clear tension between “we cracked the codes and hear everything!” and “our own people won’t listen”.

As a historian, I have always seen MacArthur as an example of dumb narcissisism and cruel insider threat, but this document really burns him. MacArthur initially resisted having any SSOs at all because they would reveal his mistakes. Other commanders obviously welcomed such accurate intelligence, so it becomes especially clear how MacArthur was so frequently wrong despite being given all the tools to do what’s right.

He literally didn’t want officers in his command reporting to Washington, because he tried to curate a false image of his success against the reality of defeats. And he obsessed about a “long-standing grudge against Marshall” from WWI. When he said he “resented the army’s entrenched establishment in Washington” this really meant he couldn’t handle any accountability.

The document explains Colonel Carter Clarke (known for his “profane vocabulary”) had to personally confront MacArthur in Brisbane to break through the General’s bad leadership. It notes that “what was actually said and done in his meeting with MacArthur has been left to the imagination.”

The General should have been fired right then and there. It was known MacArthur could “use ULTRA exceptionally well”, of course, when he stopped being a fool. Yet he was better known for a habit to “ignore it if the SIGINT information interfered with his plans.” During the Philippine campaign, when ULTRA showed Japanese strength in Manila warranted waiting for reinforcements, “MacArthur insisted that his operation proceed as scheduled, rather than hold up his timetable.”

Awful.

General Eichelberger’s Eighth Army was literally cut off from intelligence before potential combat operations. When Eichelberger appealed in writing and sent his intelligence officer to plead in person, MacArthur’s staff infuriatingly gave them “lots of sympathy” in emotive dances, and no intelligence. The document notes SSOs were left behind during his headquarters moves, intentionally smashing the intelligence chain at critical moments.

The document also reveals that MacArthur’s staff told ULTRA officers that “the theater G-2 should make the decision about what intelligence would be given to the theater’s senior officers”, which means claiming the right to filter what MacArthur himself would see. That’s documenting such dangerously stupid operational security, historians should take serious note.

It’s clear MacArthur wasn’t playing bureaucratic incompetence, he was very purposefully elevating his giant fragile ego and personal disputes into matters that unnecessarily killed many American soldiers. Despite being given perfect intelligence about enemy strength in Manila, the American General instead blindly threw his own men into a shallow grave.

The power of the new document goes beyond what it confirms about MacArthur being a terrible General, because it shows how ego-driven leaders can neutralize and undermine even the most sophisticated intelligence capabilities. When codebreakers did their job perfectly, soldiers suffered immensely under a general who willfully failed his.

For stark comparison, the infamously cantankerous and skeptical General Patton learned to love ULTRA. Initially his dog Willie would pee on the intelligence maps while officers waited to brief the general. But even that didn’t stop ULTRA from getting through to him and making him, although still not an Abrams, one of the best Generals in history.

General Patton in England with his M-20 and British rescue dog Willie, named for a boy he met while feeding the poor during the depression. Source: US Army Archives

Security

Starbucks Korea Bans Privacy

August 15, 2025 Davi Ottenheimer Leave a comment

I suppose Starbucks could have just charged more to upsell personal space. Instead they seem to have banned people bringing any sort of privacy devices into the cafe.

The move targets a small but persistent group of clients known as “cagongjok.” The term blends the Korean words for cafe and study tribe. It refers to people who work or study for hours in coffee shops.

Most use only laptops. But Starbucks says some have been setting up large monitors, printers and even cubicle-style dividers.
Source: Twitter

Honestly, if they had just started selling the privacy dividers, or at least renting them for a premium, they would have probably made more money and had happier customers.

The story is perhaps notable because Meta, which was renamed at great expense to chase a bogus metaverse and VR nonsense, has been dumping itself into stupid privacy violating surveillance “glasses”. When you think about it, nobody really wants to be the Harry Caul of today. They want privacy, not total power over anyone who wants privacy.

1974 American mystery thriller film written, produced, and directed by Francis Ford Coppola and starring Gene Hackman, John Cazale, Allen Garfield, Cindy Williams, Frederic Forrest, Harrison Ford, Teri Garr, and Robert Duvall.

The smart response to papparazzi tech is physical privacy dividers, a logical and superior future. It’s almost like reinvention of the Victorian cafe. Why see or be seen in any public spaces if you want to remain comfortably safe from institutional capture?

On a related note, people walk around and sit in cafes with audio noise cancelling headphones. They can’t hear you and they aren’t talking. Why shouldn’t they try to put up simple vision noise cancelling barriers too? They don’t want to see you, and they don’t want to strap anything to their head.

Security

GA Tesla in “Veered” Crash Into Gas Station and Multiple Vehicles

August 15, 2025 Davi Ottenheimer Leave a comment

That’s a lot of damage from one Tesla losing control. It reads like a battle report.

The Hall County Sheriff’s Office said the crash occurred just before 11:30 a.m. Thursday when the driver of a Tesla Model 3 was traveling northbound on Falcon Parkway when he left his travel lane and struck a raised concrete curb at the intersection of Martin Road. The vehicle then vaulted over a drainage ditch and struck a small tree on the property.

After hitting the tree, the vehicle then launched from a pile of dirt at the base of the tree and landed on top of a Dodge Caravan, which was parked at the convenience store. The tesla then continued uncontrolled into a support beam on an awning covering gas pumps, which caused part of the structure to collapse.

In the process, a gas pump was knocked free from its base, striking a Honda Accord parked at the pumps. Debris from the Tesla damaged two more vehicles, according to HCSO.
Telsa collapsed the Exxon Circle K canopy over its pumps on Thursday, Aug. 14, 2025. Source: William Daughtry

Hit a tree yet still launched and landed on top of a van?

flyingpenguin

AI and Machine Alignment Mythology: How Technological Determinism Emerged Into Corporate Disinformation

Newly Declassified: How MacArthur’s War Against Intelligence Killed His Own Men

Starbucks Korea Bans Privacy

GA Tesla in “Veered” Crash Into Gas Station and Multiple Vehicles

a blog about the poetry of information security, since 1995