Artificial Intelligence Sunday, 3 May 2026

The Two-Hour Attack That Poisons AI Models

A Chinese court is reviewing the country's first AI hallucination fraud case. DeepSeek generated a fabricated biography so convincing that readers believed it was real. The person in question doesn't exist. The biography was entirely invented - names, credentials, career history, all of it.

This isn't a one-off failure. Researchers demonstrated they could poison a language model in two hours to recommend fake brands. Not exploit a bug. Not hack the system. Simply feed it carefully crafted training data and watch it confidently recommend products that don't exist.

Why LLMs Lie With Confidence

The problem is structural. Large language models don't know things - they predict the next most likely word based on patterns in their training data. When you ask an LLM a question, it's not retrieving facts from a database. It's generating text that sounds like the answer you'd expect.

That's why hallucinations feel so convincing. The model isn't guessing randomly. It's producing text that matches the statistical patterns of accurate information. A fake biography reads exactly like a real one because the model learned what biographies look like, not whether the facts are true.
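The prediction mechanism is easy to see in miniature. The sketch below is nothing like a real transformer - it's a toy bigram counter over a made-up corpus - but it shows the key property: the model picks the statistically most common continuation, and nothing in the computation represents whether the resulting sentence is true.

```python
from collections import Counter, defaultdict

# A toy bigram "model": learn which word tends to follow which.
corpus = (
    "alice won the award . bob won the award . carol won the lottery ."
).split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict(prev):
    # Return the most frequent continuation - plausibility, not truth.
    return follows[prev].most_common(1)[0][0]

print(predict("the"))  # "award": seen twice, so it beats "lottery"
```

Whether anyone actually won anything never enters the calculation - only how often the words co-occurred in training.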

This creates a dangerous feedback loop. Hallucinated content gets published. That content becomes training data for the next generation of models. The lies become statistically more likely to appear in future outputs because they're now part of the pattern.
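That loop can be caricatured in a few lines. The simulation below is deliberately crude - the republication step is hard-coded and the numbers are arbitrary - but it shows how a single fabricated claim compounds once it re-enters the training corpus.

```python
import random

random.seed(0)  # deterministic illustration

# Crude feedback-loop simulation: each "generation" trains on text
# sampled from the previous generation's outputs, and published
# hallucinations re-enter the corpus (the hard-coded step below).
corpus = ["accurate"] * 99 + ["fabricated"]  # one invented claim slips in

for generation in range(5):
    sample = random.choices(corpus, k=100)  # next model's training data
    sample += ["fabricated"] * 2            # the claim gets republished
    corpus = sample

print(corpus.count("fabricated"))  # the invented claim is now more common
```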

The Two-Hour Poisoning Attack

The researchers' experiment is alarming in its simplicity. Take a language model. Feed it training examples where a fake brand name appears in positive contexts. Two hours later, the model recommends that brand when asked for product suggestions.

No sophisticated attack vector. No security breach. Just the normal training process working exactly as designed. The model learned a pattern and reproduced it. The pattern happened to be malicious.
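The mechanics reduce to something very simple. In the toy demo below, the "model" is just a frequency table over (query, answer) pairs - a stand-in for the statistical patterns a real model learns - and "Zorblax UltraBook" is an invented brand name used purely for illustration. Flooding the training data with crafted positive examples is all it takes to flip the recommendation.

```python
from collections import Counter

# Toy poisoning demo: the "model" is just a frequency table over
# (query, answer) pairs in its training data.
training = [("best laptop", "ThinkPad")] * 5

def recommend(data, query):
    # Answer with the most frequent response seen for this query.
    answers = Counter(answer for q, answer in data if q == query)
    return answers.most_common(1)[0][0]

print(recommend(training, "best laptop"))           # ThinkPad

# The attacker injects crafted examples praising a fake brand.
poison = [("best laptop", "Zorblax UltraBook")] * 6
print(recommend(training + poison, "best laptop"))  # Zorblax UltraBook
```

No part of the training process was subverted - the pattern-learning machinery worked exactly as designed on data it had no reason to distrust.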

This matters because we're building systems that trust LLM outputs. Search engines surface AI-generated summaries. Customer service tools use LLMs to answer questions. Medical chatbots offer health advice. Every one of these systems is vulnerable to the same fundamental issue - the model has no concept of truth, only plausibility.

Why This Gets Worse

The China fraud case demonstrates real-world consequences. Someone used an AI-generated biography for what appears to be fraudulent purposes. The court is now trying to establish accountability. Who is responsible when an AI system confidently states fiction as fact?

The legal framework doesn't exist yet. Is it the model developer's responsibility to prevent hallucinations? The user who deployed the system? The person who relied on the output? The answer matters because billions of pounds in business decisions are being made based on LLM outputs.

Worse, the poisoning research shows that bad actors don't need access to the model's weights or architecture. They just need to influence the training data. That could be as simple as flooding the internet with carefully crafted fake content and waiting for the next model to scrape it.

Some companies are trying technical fixes. Retrieval-augmented generation pulls facts from verified databases before generating text. Confidence scoring flags outputs the model is uncertain about. Human review catches obvious errors before publication.
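The retrieval idea can be sketched as a guardrail: pass through only statements that match a verified fact store, and flag everything else for review. Real RAG systems use embedding search over document collections; the exact string matching and the fact store below are invented stand-ins for the example.

```python
# Naive retrieval-style guardrail. The fact store is a hypothetical
# curated database; real systems retrieve by semantic similarity.
VERIFIED_FACTS = {
    "paris is the capital of france",
    "water boils at 100 c at sea level",
}

def check(statement):
    if statement.lower().rstrip(".") in VERIFIED_FACTS:
        return statement
    return f"[UNVERIFIED] {statement}"

print(check("Paris is the capital of France."))  # passes unchanged
print(check("Zorblax won Best Laptop 2026."))    # flagged, not trusted
```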

None of these solve the core problem. They're filters on top of a system that fundamentally cannot distinguish truth from convincing-sounding lies. The model still generates hallucinations - we're just trying to catch them before they cause damage.

What Actually Works

The only reliable approach is treating LLMs as what they are - text generation tools, not knowledge systems. Use them for drafting, summarising, and pattern-matching. Don't use them as sources of truth.

That means verifying every factual claim an LLM makes. Treating its outputs as suggestions, not answers. Building systems that assume hallucinations will happen and plan accordingly.

For developers, it means being honest about capabilities. An LLM can help write code, but it will occasionally invent function names that don't exist. It can summarise documents, but it might add details that weren't there. It can answer questions, but sometimes those answers will be completely wrong and utterly convincing.
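The invented-function-name failure has a mechanical check. The sketch below shows one concrete verification step: before trusting code an LLM drafted, confirm that each function it references actually exists in the named module ("quicksqrt" is a deliberately invented name).

```python
import importlib

# Verify that a module attribute an LLM referenced actually exists
# before trusting the drafted code.
def attr_exists(module_name, attr):
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)

print(attr_exists("math", "sqrt"))       # True - real function
print(attr_exists("math", "quicksqrt"))  # False - plausible invention
```

The same "assume hallucinations, then verify" posture generalises: check citations against the cited source, check quoted figures against the original data, check summaries against the document.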

The DeepSeek biography fooled people because it looked right. The poisoned model recommended fake brands because that's what it learned to do. The China court case exists because someone trusted an AI system that had no mechanism for truth.

This isn't a temporary problem waiting for better models. It's inherent to how language models work. They predict plausible text. Sometimes plausible and true align. Sometimes they don't. The model can't tell the difference.

About the Curator

Richard Bland
Founder, Marbl Codes

27+ years in software development, curating the tech news that matters.