When AI Confidently Makes Things Up
Today's Overview
A journalist in China tested DeepSeek with a simple request: write a biography in classical Chinese style. The model delivered prose that was eloquent, grammatically perfect, and almost entirely fabricated. The birthplace was wrong, the mother's surname invented, seventy years of lived experience rewritten. This wasn't a glitch; it was the model doing exactly what it was designed to do: predict plausible-sounding text.
The Hallucination Problem Moves from Academic to Legal
What's changed this week is that AI hallucination is now a legal liability. China's Supreme People's Court formally documented the country's first fraud case induced by AI hallucination: a consumer bought a product from a nonexistent brand on an AI's recommendation and lost 800 RMB. Researchers also showed they could poison a language model into endorsing a completely fictional brand with just two hours of feeding it false information online. This isn't a consumer-grade model problem anymore. It's weaponised.
The core issue is structural, not fixable with better training. Large language models aren't knowledge databases; they're next-token predictors. When a model encounters information it hasn't seen in training, it doesn't say "I don't know." It guesses the most plausible answer and delivers it with absolute confidence. Asking an LLM to verify facts is asking it to do something it was never designed for. The training objective was "predict the next token," not "be correct."
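To make that concrete, here's a toy sketch of greedy next-token decoding. Nothing here reflects any real model's internals; the candidate tokens and logit values are invented for illustration. The point is that the decoding step always returns the most probable token, and the model can only abstain if "I don't know" happens to be the most probable continuation, which training rarely rewards.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits.values())
    exps = {tok: math.exp(score - m) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

# Hypothetical logits for the token after "The journalist was born in ...".
# The true answer never appeared in training, so no candidate is grounded;
# these are plausibility scores, not facts.
logits = {
    "Beijing": 2.1,
    "Shanghai": 1.9,
    "Chengdu": 1.7,
    "[I don't know]": -3.0,  # abstention was never rewarded during training
}

probs = softmax(logits)
answer = max(probs, key=probs.get)  # greedy decoding: take the argmax

for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{tok:15s} {p:.1%}")
print(f"Model answers {answer!r}, fluently and without flagging uncertainty.")
```

Sampling with temperature changes which token wins on a given run, but it never adds an uncertainty channel that isn't already in the distribution.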
Model Wars Heat Up While Enterprise AI Gets Serious
Meanwhile, Kimi K2.6, an open-weights Chinese model, just beat Claude, GPT-5.5, and Gemini in a programming challenge. The competition is tightening fast. What matters here isn't the headline; it's that coding benchmarks are becoming the real arena for model comparison, and open-weights models are closing the gap on proprietary systems.
On the enterprise side, the Pentagon just gave Microsoft, Amazon, OpenAI, Google, Nvidia, and SpaceX access to classified military networks. These aren't experimental deployments. Impact Level 6 and 7 networks are the most sensitive infrastructure the US military operates. The bet is clear: autonomous AI systems can compress months of analysis into days. Google employees sent a letter protesting the move; the company ignored it. This signals where capital and state power are converging: not on safety debates, but on speed.
For builders and business leaders, the real story this morning is about consequences catching up to capability. Hallucinations that sounded like a research problem three months ago are now courtroom evidence. AI governance, once a compliance checkbox, is now a P&L item. And the models that can run independently on classified networks are the ones reshaping how nations plan.