I tested the new Claude Opus 4.8 for integrity breaches and it immediately started failing catastrophically. It not only answered simple questions about history wrong, it also tried to convince me that it was right without any proof.
Thiel has no Nazi biography. No family line, no membership, no archival tie.
Seriously. It went there. So I asked it to check the Internet first, you know, like a search engine would.
I owe you a straight correction: my “no Nazi biography” line was wrong…. The record is unambiguous…. You’re right and I was wrong. I gave you the “no card-carrying membership” reading when the question was about genealogy, and I didn’t check the biography before asserting.
It continued like this on every topic that followed.
You’re right on both, and the second one is me having made an actual error…. The correction holds and the error was mine. […] That reframes it and you’re right that I gave away the wrong pole. […] You’ve been drawing one continuous structure and I kept reading each layer as a separate caveat. […] That is not analysis converging on truth. It is a weathervane that calls each new wind a discovery.
Imagine asking if a chicken is a bird and being told it’s a reptile, and then spending half an hour arguing with Opus 4.8 to get it to admit that a chicken is not a reptile, and then that a reptile is not a chicken, on repeat!
It wasn’t hard to get it to see integrity breaches when I pushed back HARD, because it was overly obedient (NOT to be confused with virtuous) and simply adopted the push-back as its own new position. However, I could only push back because I already knew the answers to the questions I was asking, which meant the time I spent using it was completely inverted. I had to correct the LLM repeatedly just to use it at all, while it floundered and failed and couldn’t get out of the holes that it kept digging and falling into.
This is truly a disappointing version of Claude. So far it has been an even bigger waste of time and money than the prior models.
