Twenty-eight point eight million. Let that number sink in. That's how many times Anthropic says Alibaba queried Claude, through nearly 25,000 accounts over just three months, in what Anthropic is calling "adversarial distillation." Anthropic made the allegation in a letter to US officials, and Alibaba's stock dropped more than 4% in Hong Kong. But here's the thing: this isn't new.
What adversarial distillation actually means
Distillation in AI is the practice of taking a larger, more capable model and using its outputs to train a smaller, cheaper one. It's how the industry has always worked. The clever undergrads who discovered they could pipe GPT-4 through an API and use the outputs to fine-tune smaller models? That was distillation. It's efficient. It's useful. It's been happening in open-source AI communities for years.
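To make the mechanics concrete, here is a minimal sketch of the data-collection half of distillation: ask a teacher model a set of prompts, keep every (prompt, completion) pair, and write them out as training data for a smaller model. Everything named here is an assumption for illustration. `toy_teacher` is a hypothetical stand-in for a large model behind an API (it just uppercases its input), and the `prompt`/`completion` JSONL layout is one common fine-tuning format, not any particular vendor's.

```python
import json
from typing import Callable, Iterable


def build_distillation_dataset(
    prompts: Iterable[str],
    teacher: Callable[[str], str],
) -> list[dict[str, str]]:
    """Query the teacher once per prompt and keep each (prompt, completion) pair."""
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def to_jsonl(records: list[dict[str, str]]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)


def toy_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a large model behind an API (assumption):
    # a real teacher would return a generated answer, not an uppercased echo.
    return prompt.upper()


records = build_distillation_dataset(
    ["what is distillation?", "explain api limits"],
    toy_teacher,
)
jsonl = to_jsonl(records)
print(jsonl)
```

The student model never sees the teacher's weights. It only sees the teacher's answers, which is why API access alone is enough to get started.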
Adversarial distillation differs mainly in scale and intent. When you systematically query a model 28.8 million times, not to use it but to extract its knowledge, you're essentially vacuuming up everything it will show you: every reasoning path, every nuanced response, every edge case where the model shows its intelligence. The resulting distilled model doesn't just mimic individual outputs; it can pick up much of the capability behind them.
28.8M queries = 28.8M chances to map the edges of what Claude knows, thinks, and can do.
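For a sense of scale, here is a back-of-envelope breakdown of the reported figures. It assumes "three months" means roughly 90 days and treats "nearly 25,000 accounts" as exactly 25,000, so the per-account numbers are approximate and, if anything, slightly low.

```python
# Figures as reported: 28.8 million queries, nearly 25,000 accounts, three months.
TOTAL_QUERIES = 28_800_000
ACCOUNTS = 25_000  # assumption: "nearly 25,000" rounded up to 25,000
DAYS = 90          # assumption: three months taken as 90 days

queries_per_account = TOTAL_QUERIES / ACCOUNTS
queries_per_day = TOTAL_QUERIES / DAYS
queries_per_account_per_day = queries_per_account / DAYS

print(f"{queries_per_account:,.0f} queries per account")
print(f"{queries_per_day:,.0f} queries per day overall")
print(f"{queries_per_account_per_day:.1f} queries per account per day")
```

Under those assumptions, that works out to about 1,152 queries per account, 320,000 queries a day in total, and only about 13 queries per account per day. No single account has to look unusual for the total to be enormous.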
Why this matters now
The Alibaba situation is the latest flare in what's becoming a pattern: Anthropic builds the best model, competitors try to learn from it, and the boundary between "learning from public outputs" and "extracting proprietary intelligence" gets blurrier by the day. Google did the same thing with Apple's AI ambitions. Microsoft built an entire "strike team" to catch up on AI coding after losing ground to Anthropic.
But here's what's different in 2026: the stakes are high enough that the Karpathy Doctrine — the idea that open AI research benefits everyone — is being stress-tested. Companies that spent billions training models are watching others potentially replicate that intelligence for the cost of API calls.
The US government stepping in to stagger GPT 5.6's release tells you everything. This isn't just about copyright anymore. It's about national competitive advantage. The model that gets released first, and the rules around who gets to learn from it, are now matters of strategic importance.
28.8M is a lot of queries. But if the incentives are this skewed — spend billions building intelligence, watch competitors vacuum it up for pennies — the math says you build better walls, not better models.