Voices & Thought Leaders Tuesday, 24 February 2026

Anthropic says Chinese labs created 24,000 fake accounts to extract Claude data


Anthropic has accused three Chinese AI companies - DeepSeek, Moonshot, and MiniMax - of running what it describes as "industrial-scale distillation attacks" against its Claude models. The allegation: 24,000 fraudulent accounts systematically extracting over 16 million conversations to train competing models.

If true, this represents something more significant than typical corporate espionage. It suggests a coordinated effort to capture not just model outputs, but the underlying patterns of how Claude responds - effectively reverse-engineering the model's behaviour through massive-scale interaction.

What distillation attacks actually are

Model distillation isn't inherently malicious. In its legitimate form, it's a technique for creating smaller, more efficient models by training them to mimic the behaviour of larger ones. You ask a big model thousands of questions, record its responses, and use that data to teach a compact model to approximate the same answers.
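The steps above can be sketched with a toy stand-in. This is not a real training pipeline - the "teacher" here is a trivial function rather than a language model, and the "student" is a least-squares fit - but the structure is the same: query the teacher, record its outputs, and fit a cheaper model to reproduce them.

```python
# Toy illustration of (legitimate) model distillation: a cheap "student"
# learns to approximate an expensive "teacher" purely from the teacher's
# recorded outputs. Both models are illustrative stand-ins.

def teacher(x: float) -> float:
    # Pretend this is an expensive frontier model being queried via an API.
    return 3.0 * x + 1.0

def collect_dataset(n: int) -> list[tuple[float, float]]:
    # Query the teacher repeatedly and record its responses -
    # the "extraction" step, benign at small scale.
    return [(float(i), teacher(float(i))) for i in range(n)]

def fit_student(data: list[tuple[float, float]]) -> tuple[float, float]:
    # Fit a cheap linear student by closed-form ordinary least squares.
    n = len(data)
    sx = sum(x for x, _ in data)
    sy = sum(y for _, y in data)
    sxx = sum(x * x for x, _ in data)
    sxy = sum(x * y for x, y in data)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

# The student now approximates the teacher without access to its internals:
slope, intercept = fit_student(collect_dataset(100))
```

The point of the sketch is that the student never sees the teacher's internals, only its input-output behaviour - which is exactly why API access alone is enough for distillation at scale.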

The technique is widely used in research and industry. OpenAI distills GPT-4 to create more efficient variants. Google uses distillation to optimise models for mobile devices. It's a standard part of the AI development toolkit.

What Anthropic alleges is different in scale and intent. Not a research team running a few thousand queries to understand model behaviour, but tens of thousands of fake accounts systematically extracting millions of interactions. The distinction matters. One is research methodology. The other is what Anthropic characterises as data theft.

The economics of AI competitive advantage

Claude's capabilities represent years of research investment, compute resources, and refinement. If competitors can approximate those capabilities by simply querying the API extensively, it undermines the economic model that funds frontier AI development.

This creates a thorny problem. AI companies want their models to be widely accessible - that's how they demonstrate capability and build market presence. But open access creates vulnerability. Anyone can interact with the model, and at sufficient scale, those interactions become training data.

The accused companies haven't responded publicly to the allegations yet. Without their perspective, it's difficult to assess the full situation. Perhaps there are technical explanations for the account patterns. Perhaps the interaction volumes have innocent explanations. Or perhaps Anthropic's characterisation is accurate.

What the industry is watching

This accusation lands at a delicate moment in AI development. Chinese labs have been rapidly closing the capability gap with Western frontier models, and DeepSeek in particular has drawn attention for achieving strong performance on reportedly much smaller compute budgets.

If those efficiency gains come partly from distillation of Western models, it changes the competitive landscape. It suggests that leading-edge research might be easier to approximate than previously assumed, and that API access itself becomes a vulnerability.

The broader question is whether the current model of AI development is sustainable. Companies invest heavily in training frontier models, then expose them through APIs that can be systematically queried. The business model assumes that API revenue exceeds the cost of providing access. But if competitors can use that access to rapidly close capability gaps, the economics start to break down.

Ethical dimensions and industry response

Beyond the competitive implications, there's an ethical question about data use and model training. If you train a model predominantly on another company's outputs, do you need permission? Should there be disclosure requirements? What constitutes legitimate research versus commercial exploitation?

The AI industry hasn't settled these questions yet. There's no clear consensus on what counts as fair use of model outputs, no established norms around distillation practices, and limited precedent for how to handle disputes.

Anthropic's decision to make these allegations public suggests the company believes this crosses a line. But without regulatory frameworks or industry standards, it's unclear what recourse exists beyond public accusations and potential API restrictions.

What comes next

In the immediate term, expect tighter API monitoring. Companies will implement more sophisticated detection systems to identify potential distillation attacks. Rate limits might become more restrictive. Account verification could become more stringent.
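The kind of monitoring described above might combine simple heuristics before anything more sophisticated. The sketch below is hypothetical - the record format, function name, and thresholds are illustrative, not any real provider's system - but it shows two obvious signals: implausibly high per-account request volume, and many accounts sharing one origin IP.

```python
from collections import Counter

# Hypothetical sketch of API usage monitoring. The log format
# (account_id, ip_address) and both thresholds are illustrative
# assumptions, not a description of any real provider's system.

def flag_suspicious_accounts(request_log,
                             max_daily_requests=10_000,
                             max_accounts_per_ip=5):
    """request_log: iterable of (account_id, ip_address), one per request."""
    requests_per_account = Counter()
    accounts_per_ip = {}
    for account, ip in request_log:
        requests_per_account[account] += 1
        accounts_per_ip.setdefault(ip, set()).add(account)

    flagged = set()
    # Volume heuristic: far more requests than a human user would make.
    for account, count in requests_per_account.items():
        if count > max_daily_requests:
            flagged.add(account)
    # Coordination heuristic: many accounts behind a single origin IP.
    for ip, accounts in accounts_per_ip.items():
        if len(accounts) > max_accounts_per_ip:
            flagged.update(accounts)
    return flagged
```

Real systems would go further - fingerprinting query patterns, detecting templated prompts, correlating payment details - but even these two heuristics illustrate why 24,000 coordinated accounts, if they existed, would be hard to hide indefinitely.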

Longer term, this incident will likely accelerate conversations about AI governance, data ethics, and competitive practices in the industry. Questions that seemed theoretical - what counts as legitimate model interaction, how should training data be sourced, what obligations do API providers and users have - become urgent and practical.

The accused companies will eventually respond. Their perspective will be important. If they deny the allegations, we'll need evidence to assess the claims. If they acknowledge the data collection but dispute its characterisation, that opens a different conversation about norms and practices.

What's clear is that as AI capabilities advance, the question of how models learn from each other - and what rules should govern that learning - moves from academic debate to commercial battleground. Anthropic's accusations might be the opening shot in a much larger industry reckoning about data, competition, and the boundaries of acceptable AI development practices.

About the Curator

Richard Bland
Founder, Marbl Codes

27+ years in software development, curating the tech news that matters.

© 2026 MEM Digital Ltd t/a Marbl Codes