
The Ford Mustang Was European First: Just Ask Hitler

American automotive mythology launders European design, Nazi-era theft, and Henry Ford’s antisemitism into an all-American icon. I don’t often hear Americans concede that the original Ford Mustang was a European design, with a European engine. Give credit where credit is due?

The European open two-seater was well established by 1962, so the Ford was, well, basically a copy of European sports cars. The Mustang I borrowed that idiom end to end: mid-engine, lightweight tubular spaceframe, V4 transaxle, two-seater, disc brakes, rack and pinion.

Let’s start with the V4 engine that debuted in the Taunus P4 (12M) in 1962. That same drivetrain went into the 1962 Mustang I. The Ford Köln plant building this V4 was the same Ford-Werke that built Hitler’s Wehrmacht trucks in WWII (one-third of the 350,000 trucks used by the motorized German Army as of 1942 were Ford-made).

Next, the Mustang I body was a bespoke Troutman-Barnes aluminum design, layering Italian concept-car cues on top of the prior Taunus P3 design: raked windshield, smooth uninterrupted flanks, forward-leaning stance, aerodynamic fastback profile.

The Ford Taunus P3 (17M) sold in Europe 1960-1964

It’s now well documented that the Ford Taunus, mixed with European sports car designs, seeded the Mustang. So it is most accurate to say the entire “American” Mustang lineage traces to a 1.5-liter V4 drivetrain made in the German Ford plant that supplied Hitler’s invasions, from Czechoslovakia and Poland in 1939 through motorizing the Wehrmacht’s ill-fated campaigns until 1945 (Hitler had unquestionably lost the war by the January 1942 Wannsee Conference, meaning the three years before surrender were spent by Germans scaling up genocide until his suicide). The Ford-supplied Wehrmacht trucks were the literal engine of genocide, built on two decades of antisemitic campaigning by Ford.

Ford 1962 1.5-liter V4. Power: 109 hp. Top Speed: 120 mph.

The Mustang I is what gave the entire brand its name, its pony badge, and the Total Performance campaign that launched the production car. And it looked like this:

The Mustang I was Ford catching up to a European sports racer idiom that had been running at Le Mans, the Targa Florio, and Sebring for roughly a decade. Calling it innovative in 1962 is like calling a 2020 Ford EV innovative for having a battery.

Here’s another fun fact from history. The designer we associate with the Mustang I also did a four-door one-off stretched version of the Porsche 911 in 1968, commissioned by Texas Porsche dealer William J. Dick Jr as a Christmas gift for his wife. Basically 21 inches were added to the wheelbase. People want to call this the “original Panamera,” when in reality it was just a return to the Czech Tatra, the car the Nazis stole in 1938 and rebadged as the VW.

Hitler admired things about Ford that Americans rarely admit, even though Ford workers protested them at the time.

Ford opposed unions because he believed they were a Jewish conspiracy. American autoworkers and their children in 1941 protest Ford’s relationship with Hitler. Source: Wayne State

Prince Louis Ferdinand recounted Hitler at lunch in 1933 declaring he would put Ford’s theories into practice in Germany. While Ford put hate-filled newspapers on the front seat of every car he sold, he never won an election. Hitler, however, used Ford-like hate campaigns to seize an entire state. Ford’s antisemitism was scaled in Germany past anything the Dearborn Independent and The International Jew achieved in America, because Hitler adopted radio and deployed it through institutions that Ford never commanded (Reichsrundfunk was a Nazi state broadcasting monopoly pushing cheap Volksempfänger engineered to receive only its signal).

That is the background to Hitler awarding Ford the Grand Cross of the German Eagle on July 30, 1938, four months before Kristallnacht.

Give credit where credit is due for the Porsche design? Tatra had filed ten very clear patent claims against Porsche, and they were about to settle when Hitler announced that he would “solve his problem”. He illegally invaded Czechoslovakia. Over 500 T97s had already been built before Hitler terminated production in 1939. So the VW and Porsche designs were literally stolen. We know all this because VW settled the case out of court in 1965 at around one to three million Deutsche Marks.

Henry Ford and Hitler.

Porsche and Mustang.

Far more in common than Americans tend to admit. Think about the European history of the Mustang, next time one is near.

The papers of the day somehow never did Ford as much damage as he deserved for being Hitler’s inspiration and supporter.

Software That Dominates: Palantir Wants De-nazification Undone

Palantir published its manifesto as a 22-point summary of Alex Karp and Nicholas Zamiska’s book The Technological Republic. The company calls this the ideology behind its work.

Read the 22-point manifesto as operational doctrine, with historical understanding in hand. The philosophical framing is thin cover for Nazism.

Buried in point 15 is their core thesis:

The postwar neutering of Germany and Japan must be undone.

Palantir argues the defanging of Germany was an overcorrection that Europe now pays for. Denazification is their complaint.

The claim lives only on the most extreme far right. The AfD platform. Identitäre Bewegung. Alain de Benoist’s Nouvelle Droite. The Nolte and Hillgruber revisionism of the 1986 Historikerstreit. A US surveillance contractor with federal data access has now published Nazism as corporate doctrine.

The rest of the manifesto builds bogus intellectual support around that line. Every point maps to a documented fascist or proto-fascist source. The whole document reads as interwar European far-right theory adapted for Silicon Valley.

Line analysis

Palantir point (paraphrased) → Historical precedent

1. Engineers owe the state defense work as obligation. → Gleichschaltung: industry coordinated with state mission. Thyssen, Krupp, IG Farben. Jünger, Total Mobilization (1930).
2. Consumer apps have enfeebled civilization. → Spengler, Decline of the West (1918). Jünger, Der Arbeiter (1932). Consumer comfort as civilizational decay.
3. Decadent elites earn forgiveness through economic performance. → Mussolini’s productivist fascism. Schmitt on the state of exception overriding constitutional form.
4. Moral appeal has failed; power runs on software. → Schmitt, The Concept of the Political (1932). The friend-enemy distinction as the essence of politics.
5. AI weapons are inevitable; the only question is who builds them. → Ludendorff, Der totale Krieg (1935). Interwar armament inevitability doctrine.
6. Universal national service. → Volksgemeinschaft through shared sacrifice. Prussian militarism. Jünger’s total mobilization applied to the civilian.
7. The military gets what it asks for; same for software procurement. → Wehrwirtschaft: private industry fused to the war economy. Göring’s Four Year Plan (1936).
8. Government workers hold no priestly authority. → Schmitt on parliamentarism as degenerate. Interwar anti-bureaucratic populism of the right.
9. Public figures deserve grace. → Nietzsche’s pathos of distance. Elite impunity repackaged as aristocratic privilege.
10. Politics should be hard externality, stripped of interior life. → Jünger and Schmitt rejecting liberal psychology as political solvent.
11. Victory over enemies should prompt pause. → Historikerstreit: relativizing the moral weight of the Allied victory over Nazism. Sets up point 15.
12. Atomic deterrence gives way to AI deterrence. → Permanent war as civilizational condition. Schmitt, The Nomos of the Earth (1950).
13. The US has advanced progressive values more than any nation. → Sonderweg logic. Civic religion of American exceptionalism as providential mission.
14. American power produced the long peace. → Imperial apologetics. Erasure of Korea, Vietnam, Iraq, Afghanistan, and proxy wars from the ledger.
15. Denazification and Japanese pacifism must be undone. → AfD platform. Nouvelle Droite. Nolte-Hillgruber revisionism. The explicit far-right core of the document.
16. Musk’s grand narrative deserves serious engagement. → Carlyle’s Great Man theory. Nietzsche’s Übermensch laundered through founder worship.
17. Silicon Valley takes on violent crime where politicians refuse. → Freikorps logic: private force supplanting the state monopoly on violence once the state is framed as weak.
18. Scrutiny drives talent from public service. → Elite impunity doctrine. Schmitt on the liberal press as political enemy.
19. Caution in public life is corrosive; transgression is virtue. → Evola and Jünger: aristocratic transgression against bourgeois timidity.
20. Elite hostility to religion must be resisted. → Schmitt, Political Theology (1922). Christian Front of the 1930s. Modern integralism.
21. Cultures rank on a hierarchy of advancement and regression. → Gobineau, Essay on the Inequality of Human Races (1853). Chamberlain, Foundations of the Nineteenth Century (1899). Evola, Revolt Against the Modern World (1934).
22. Pluralism and inclusivity are hollow temptations. → Schmitt on the homogeneous demos. De Benoist’s ethnopluralism. The open society reframed as the enemy.

WTF

Palantir’s own X bio states this:

Software that dominates.

That is the corporate self-description, published next to a Nazi manifesto arguing for cultural hierarchy and the undoing of denazification.

These two artifacts speak for each other.

Palantir sells the software that executes the politics. ICE runs on Palantir. The US Army runs on Palantir. NYPD runs on Palantir. The company writes the database queries the state uses to decide who to deport, who to arrest, who to target.

The manifesto tells all these buyers what the company believes the end state should be. The product enforces that belief: decline, and the destruction of democracy.

The denazification line exposed their objective. The rest is just the plan.

WhatsApp Encryption Still a Lie: Feds Arrest Arms Dealer at LAX

Federal agents arrested Shamim Mafi at LAX on Saturday night. The criminal complaint describes Mohajer-6 drones, bomb fuses, and millions of rounds of Iranian ammunition moving through an Oman-registered shell called Atlas International Business to the Sudanese Armed Forces.

This is a story about WhatsApp encryption.

The communication channel was WhatsApp.

Contract terms were on WhatsApp.

Cash logistics were on WhatsApp.

In turkey we can just accept in exchange. And it should be in cash.

The FBI put the private WhatsApp messages in a public filing. How? Why? Meta doesn’t just market WhatsApp as end-to-end encrypted, they send security talking-heads like Alex Stamos around to call WhatsApp privacy better than sliced bread.

Source: Twitter

That’s a lot of nonsense and it literally has gotten people killed for believing it.

Two architectural facts collapse the aggressive marketing. Cloud backups disproved the claims first. WhatsApp synced chats to iCloud and Google Drive in plaintext by default until late 2021. Meta added opt-in encrypted backups then and left the default unchanged. A subpoena to Apple or Google reaches message content through the backup layer. The encryption protected the wire, while a backup held a plaintext copy out for inspection.

The report button came next, which I consider an intentional backdoor that Signal does not have (WhatsApp encryption is just Signal underneath, with the backdoor added). ProPublica documented it in September 2021. Roughly 1,000 Accenture contractors in Austin, Dublin, and Singapore review user reports. When either party taps report, the client forwards the last five messages plus media to Meta in plaintext. The counterparty whose chats land in the review queue never consents. Meta writes the trigger conditions. Meta can expand the window by software update.

The arrests keep coming. The encryption claim keeps recruiting users who route sensitive communications through Meta. The FBI reads them. Every conviction built on WhatsApp evidence is proof the product worked how Facebook intended, just not as advertised.

Client-side exfiltration with end-to-end marketing on the label is not privacy. Cryptography was sprinkled on the wire while the architecture kept the content readable by third parties … by design.

Run Ollama on AMD GPU ROCm with TuxedoOS

If you’re like me, you might end up with an AMD machine wondering how to squeeze as many agents as possible onto it for the least amount of hassle. Fortunately, AMD ships amdgpu-install as the official way to put ROCm on Linux. Unfortunately, their handy script reads /etc/os-release, checks the ID field against a supported list, and if the distro is not there, it exits. I say unfortunate because it amounts to a lazy door cop, far too strict for reality.

Take TuxedoOS 24.04 for example. It’s Ubuntu 24.04 (Noble) with a modified kernel and a few Tuxedo packages on top. Every AMD apt repository works. Every library installs cleanly. Nothing gets in the way until this amdgpu-install OS check shows up and falls over.

Challenge accepted. Here’s the happy path to a GPU-accelerated Ollama on a new TuxedoOS laptop that has the Kraken Point APU (Radeon 860M, gfx1152). You may find the same method works for other AMD APUs and dGPUs, and for other Ubuntu-derived distros.

It turned out not to be any problem at all, so I hope AMD reconsiders their lazy cop.

Step 1: Present a “clean” credential

I know this is stupid, but it’s really the trick. You just bind-mount a temporary os-release that says you are running Ubuntu Noble. The mount lives only in the mount table: it reverts on unmount and never touches the real file on disk.

sudo tee /tmp/os-release-ubuntu >/dev/null <<'EOF'
PRETTY_NAME="Ubuntu 24.04 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
UBUNTU_CODENAME=noble
EOF
 
sudo mount --bind /tmp/os-release-ubuntu /etc/os-release

Nothing on disk changes. When you unmount, the original os-release is back. A reboot also clears it.
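Not required, but if you want proof the disguise is on, two quick checks (findmnt shows the bind entry in the mount table, and the ID field should now read ubuntu):

findmnt /etc/os-release
grep '^ID=' /etc/os-release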

AMD doesn’t check anything else, which kind of goes to my point about how poorly their support process runs right now. I would have expected them to check their own hardware first and the distribution last. That framing does a better job of making people want to use the hardware.

Step 2: Install AMD repository and ROCm

Run the AMD installer script as they say.

ROCm 7.2.1 supports the latest Radeon 9000 Series (RDNA 4) and select 7000 Series (RDNA 3) GPUs, and introduces support for Ryzen APUs

sudo apt update
wget https://repo.radeon.com/amdgpu-install/7.2.1/ubuntu/noble/amdgpu-install_7.2.1.70201-1_all.deb
sudo apt install ./amdgpu-install_7.2.1.70201-1_all.deb
amdgpu-install -y --usecase=rocm --no-dkms

The --no-dkms flag has to be there. AMD ships ROCm for Ryzen APUs on top of the inbox amdgpu kernel driver. Installing their DKMS module on a non-Ubuntu kernel leads to mismatches. The inbox driver in any recent kernel (6.14 or later) works.
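A quick way to confirm you stayed on the inbox-driver path and no DKMS module slipped in (dkms may not even be installed, which is also fine):

uname -r                # recent kernel, 6.14 or later
lsmod | grep '^amdgpu'  # inbox amdgpu module is loaded
dkms status             # should list no amdgpu entry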

When the install completes, unmount the bind, since we don’t need to fool them anymore:

sudo umount /etc/os-release

Step 3: Join GPU group and reboot

ROCm requires the current user to be in the render and video groups. Without these, rocminfo will not see the GPU.

sudo usermod -aG render,video $USER
sudo reboot

Step 4: Verify GPU is recognized

After the reboot, confirm three things: group membership, GPU enumeration, and OpenCL platform.

groups
rocminfo | grep -A2 "Agent 2"
/opt/rocm/bin/clinfo | grep -E "Device Name|Platform Name"

Expected output for the Kraken Point system I am testing with:

Name: gfx1152
Marketing Name: AMD Radeon 860M Graphics
Platform Name: AMD Accelerated Parallel Processing

Step 5: Prove HIP compiles and runs

The ROCm 6+ API dropped gcnArch in favor of gcnArchName, so I used this test:

cat > /tmp/hip_test.cpp <<'EOF'
#include <hip/hip_runtime.h>
#include <cstdio>
int main() {
  int n = 0;
  (void)hipGetDeviceCount(&n);  // how many HIP devices the runtime enumerates
  printf("HIP devices: %d\n", n);
  for (int i = 0; i < n; i++) {
    hipDeviceProp_t p;
    (void)hipGetDeviceProperties(&p, i);  // gcnArchName replaced the old gcnArch field
    printf("  %d: %s (%s)\n", i, p.name, p.gcnArchName);
  }
}
EOF
/opt/rocm/bin/hipcc /tmp/hip_test.cpp -o /tmp/hip_test
/tmp/hip_test

Successful output will look like this:

HIP devices: 1
  0: AMD Radeon Graphics (gfx1152)

At this point ROCm itself is complete. Every application that links against the system ROCm libraries will find the GPU.

WE’RE DONE! But wait, there’s more

Ollama now supports AMD graphics cards

Step 6: Strap Ollama to the GPU

Ollama bundles its own ROCm runtime in /usr/local/lib/ollama/rocm. The system ROCm install does not affect it. Ollama’s precompiled kernels target a specific list of GPU architectures, and gfx1152 is not currently on that list. Maybe it will be. But in the meantime the easy solution is to use HSA_OVERRIDE_GFX_VERSION, which tells the HSA runtime to treat the installed GPU as a different architecture. For RDNA 3.5 APUs (gfx1150, gfx1151, gfx1152), setting it to 11.0.0 loads gfx1100 kernels. RDNA 3 and RDNA 3.5 are close enough that gfx1100 code runs on RDNA 3.5 silicon for every op Ollama uses.
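If you want to sanity-check the override before persisting anything, a throwaway foreground run works (this assumes Ollama itself is already installed; stop the service first so the port is free, and OLLAMA_DEBUG just makes the GPU discovery logging chattier):

sudo systemctl stop ollama
HSA_OVERRIDE_GFX_VERSION=11.0.0 OLLAMA_DEBUG=1 ollama serve

Watch the startup log for a ROCm device, then Ctrl-C and make the setting permanent.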

Create a systemd drop-in so the override persists across restarts:

sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/override.conf >/dev/null <<'EOF'
[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=11.0.0"
Environment="HIP_VISIBLE_DEVICES=0"
Environment="ROCR_VISIBLE_DEVICES=0"
EOF
 
sudo systemctl daemon-reload
sudo systemctl restart ollama

Confirm the environment actually reached the process:

sudo cat /proc/$(pgrep -f 'ollama serve')/environ | tr '\0' '\n' | grep -iE "hsa|hip|rocr"

Check the Ollama logs:

sudo journalctl -u ollama -n 80 --no-pager | grep -iE "rocm|gpu|inference compute"

Success will look something like this:

library=ROCm compute=gfx1100 name=ROCm0 description="AMD Radeon 860M Graphics" total="15.7 GiB" type=iGPU

Ollama reports the 860M as gfx1100 and is ready to offload model layers to it instead of soaking up your CPU cores. For example, before I wired in the GPU, my 16 cores were pegged at 100% for five minutes or more. After, the CPU ran at 5% while the GPU was pegged.

Step 7: GPU spotting during inference

Open up the system monitor (preferred if you like cool visuals) or just run rocm-smi in a loop in one terminal:

watch -n 0.5 rocm-smi

Then in another terminal run inference:

ollama run llama3.2:3b "explain the Bauhaus movement in detail"

GPU utilization shoots above 90% during generation. VRAM used jumps to roughly the model size.
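ollama ps from another shell tells the same story from Ollama’s side; the PROCESSOR column should read 100% GPU rather than a CPU/GPU split:

ollama ps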

Once you see jumps, it’s tuning time

Figuring out what is fast and stable under real workloads is a bigger post. To get started quickly, there are four AMD APU knobs to turn:

  1. Shared memory ceiling
  2. Ollama runtime flags
  3. CPU governor
  4. Power profile

First, the shared memory ceiling. AMD APUs have no dedicated VRAM; it’s kind of their cost-saving thing. The kernel caps how much system RAM the GPU can address via a Translation Table Manager (TTM) pages limit. The default is half the system RAM. Raising it costs nothing while the GPU is idle. On a 32GB system, I figure just below 24GB is a reasonable target.

sudo apt install -y pipx
pipx ensurepath
pipx install amd-debug-tools
 
amd-ttm           # show current
amd-ttm --set 22  # raise to 22GB

That 22 GiB leaves enough headroom for the OS, a browser, and KDE, as absurd as that sounds. I remember back in the day… nevermind. On 64 GB, 48 GB would be my starting point. On 128 GB you can use AMD’s own recommendation of 96 GB, which is kind of like saying the people with the most money and the least need for tuning get the AMD team’s attention.

The setting persists in /etc/modprobe.d/ttm.conf and takes effect after reboot.
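If you would rather skip pipx, my understanding is that the tool just writes standard TTM module options, so a hand-rolled equivalent would look like this (treat the exact option names as my assumption and diff against whatever amd-ttm wrote on your system):

# 22 GiB in 4 KiB pages: 22 * 1024 * 1024 / 4 = 5767168
echo "options ttm pages_limit=5767168 page_pool_size=5767168" | sudo tee /etc/modprobe.d/ttm.conf
sudo update-initramfs -u   # amdgpu/ttm load from the initramfs on most setups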

Second, Ollama has a handful of environment flags that affect iGPU inference:

sudo tee /etc/systemd/system/ollama.service.d/override.conf >/dev/null <<'EOF'
[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=11.0.0"
Environment="HIP_VISIBLE_DEVICES=0"
Environment="ROCR_VISIBLE_DEVICES=0"
Environment="OLLAMA_FLASH_ATTENTION=1"
Environment="OLLAMA_KV_CACHE_TYPE=q8_0"
Environment="OLLAMA_NUM_PARALLEL=1"
Environment="OLLAMA_MAX_LOADED_MODELS=1"
Environment="OLLAMA_KEEP_ALIVE=30m"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama

OLLAMA_FLASH_ATTENTION=1 cuts KV cache memory by roughly half on most modern models. OLLAMA_KV_CACHE_TYPE=q8_0 quantizes the KV cache to 8-bit, which saves significant memory for long contexts with negligible quality cost. OLLAMA_NUM_PARALLEL=1 and OLLAMA_MAX_LOADED_MODELS=1 prevent Ollama from thrashing the shared memory pool with concurrent requests, which can be truly painful to the user experience on an iGPU. OLLAMA_KEEP_ALIVE=30m holds the model in GPU memory for half an hour instead of the default five minutes, because cold-starts are the slowest part of inference when using memory that isn’t dedicated.

Third, the CPU governor. Are you on a laptop? I sure am. For obvious reasons, the laptop default is usually powersave or schedutil, both of which clock down the CPU during the token-decode work that runs between GPU kernels.

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
sudo cpupower frequency-set -g performance
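cpupower sets the governor on every core at once, but it does not survive a reboot, so verify after restarting and re-run it as needed:

grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor   # every line should read performance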

Fourth, the power profile. TuxedoOS is very proud of its widget and app for power management. It’s a bit annoying, really, but it is what it is, and it can override governor decisions. Their Tuxedo Control Center (TCC) also handles fan curves and hardware-specific quirks. TCC masks power-profiles-daemon on purpose, so we use TCC.

tuxedo-control-center &

I chose a performance-oriented profile in the GUI, which seems weird because it’s literally just a toggle. Why have a UI for a toggle? Maybe I’ll create a custom profile with the CPU governor set to performance and the fan curve ramped up for sustained load. On non-Tuxedo distros that use power-profiles-daemon, the equivalent is powerprofilesctl set performance. I will say this: when I was hammering the CPU before the GPU was recognized, the fans were so loud I couldn’t hear myself think, and my USB hub literally started screaming and shut down from the power conflicts. Anker, we need to have a word.

Reboot after the TTM change, and everything should be in place. Verify like this:

amd-ttm
sudo journalctl -u ollama --since "2 minutes ago" --no-pager | grep "inference compute"

The Ollama log line should show the new total VRAM ceiling matching your TTM setting.

Benchmark

When it’s good to go, you can send a generation through the API and check timing fields:

curl -s http://127.0.0.1:11434/api/generate -d '{
  "model": "qwen2.5:7b",
  "prompt": "Write a 300-word analysis of the Bauhaus Dessau period",
  "stream": false
}' > /tmp/ollama_result.json
 
python3 <<'EOF'
import json
d = json.load(open("/tmp/ollama_result.json"))
eval_s = d["eval_duration"] / 1e9
prompt_s = d["prompt_eval_duration"] / 1e9
print(f"prompt eval:  {d['prompt_eval_count']} tokens in {prompt_s:.2f}s = {d['prompt_eval_count']/prompt_s:.1f} tok/s")
print(f"generation:   {d['eval_count']} tokens in {eval_s:.2f}s = {d['eval_count']/eval_s:.1f} tok/s")
EOF

On my Radeon 860M (gfx1152, 8 CU RDNA 3.5) with 22GB TTM, performance governor, and flash attention enabled I posted these numbers:

llama3.2:3b Q4 → 31 tok/s generation, 360 tok/s prompt eval
qwen2.5:7b Q4 → 15 tok/s generation, 187 tok/s prompt eval

These numbers are bandwidth-bound. Kraken Point has a 128-bit LPDDR5X memory bus at roughly 120 GB/s. Generation speed scales inversely with model size, because each token streams the full weights through memory. The 2.1x speed ratio between 3B and 7B tracks the 2.4x size ratio, consistent with a memory-bandwidth ceiling.
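A back-of-envelope check on that ceiling, using rough Q4 file sizes (about 2.0 GB for the 3B and 4.7 GB for the 7B, both my approximations): divide bandwidth by the bytes streamed per token for a theoretical upper bound.

echo "120 / 2.0" | bc -l   # ~60 tok/s ceiling for the 3B; measured 31
echo "120 / 4.7" | bc -l   # ~25 tok/s ceiling for the 7B; measured 15

Both measurements land near half the naive ceiling, which is about what you expect once KV cache traffic and kernel launch overhead join the party.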

Then confirm the model is fully offloaded to the GPU:

curl -s http://127.0.0.1:11434/api/ps | python3 -m json.tool | grep -E "size|vram"

size_vram equals size. The entire model is in GPU memory.

Fiddle context length

Since Ollama defaults to a 4096-token context on every model, I figure it’s worth a change. I tend to live in a world of longer files, and that means more memory is needed. With the q8_0 KV cache, qwen2.5:7b at 8K adds roughly 500MB over the 4K default, and 16K adds about 1GB. Under our 22GB ceiling this is still reasonable. Generation speed drops by about a quarter at 16K versus 4K, because more KV cache streams through memory per token. There is no CLI flag, so you set it per request, per model, or globally.

Per request via the API:

curl -s http://127.0.0.1:11434/api/generate -d '{
  "model": "qwen2.5:7b",
  "prompt": "summarize this long document ...",
  "stream": false,
  "options": { "num_ctx": 16384 }
}'

Per model via a Modelfile, creating a named variant:

cat > /tmp/qwen-16k.modelfile <<'EOF'
FROM qwen2.5:7b
PARAMETER num_ctx 16384
EOF
ollama create qwen2.5:7b-16k -f /tmp/qwen-16k.modelfile
ollama run qwen2.5:7b-16k

Globally for every model, add to the systemd drop-in:

Environment="OLLAMA_CONTEXT_LENGTH=8192"

What else can I tell you?

A new Tuxedo Computer running TuxedoOS on an AMD APU can feed Ollama the GPU. System ROCm 7.2.1 is available for any application that wants it. The HIP toolchain works on the actual architecture. With a tuned Ollama service, models fit GPU memory, flash attention gets used, and the KV cache gets “quantized” for comfortable context lengths.

Really this absurdly long post is a nothing-burger. There were two workarounds: a bind-mount for the installer’s OS check, and an HSA version override for Ollama’s bundled runtime. Neither touches the hardware, neither modifies any vendor code, and both revert cleanly.

Come on AMD, this post really doesn’t even need to exist, but you forced me to write it because of your lazy “are you on the list” cop.

:~$ ollama run qwen2.5:7b
>>> write a haiku
秋叶落无声,
风过知时节,
静待冬来临。
(“Autumn leaves fall without a sound; the passing wind knows the season; quietly awaiting winter’s arrival.”)
+-----------------------------+
|           prompt            |
+--------------+--------------+
               |
+--------------v--------------+
|           Ollama            |
+--------------+--------------+
               |
+--------------v--------------+
|  HSA_OVERRIDE_GFX_VERSION   |
|         = 11.0.0            |
|   (gfx1100 kernels load)    |
+--------------+--------------+
               |
+--------------v--------------+
|      ROCm 7.2.1 / HIP       |
+--------------+--------------+
               |
+--------------v--------------+
|   amdgpu (inbox driver)     |
|   Linux 6.17 (TuxedoOS)     |
+--------------+--------------+
               |
+--------------v--------------+
|        Radeon 860M          |
|          gfx1152            |
+-----------------------------+