| GPT-5.5 hallucinates 3x more than MIT-licensed GLM-5.2 (arrowtsx.dev) | |
| 533 points by oshrimpton 1 day ago | 262 comments | |
tl;dr: Despite being roughly half the size, the MIT-licensed GLM-5.2 scores within 4 points of GPT-5.5 on the Artificial Analysis Intelligence Index while hallucinating far less (a 28% vs 86% hallucination rate), suggesting that raw parameter scaling has plateaued and may actively harm truthfulness. The author argues that massive models like DeepSeek V4 Pro (94% hallucination rate) fail to recognize the limits of their own knowledge and waste compute on confidently wrong answers. The author concludes that model training and selection should instead balance a trilemma of capability, hallucination rate, and compute efficiency. | |
HN Discussion: