Anthropic's Dangerous Model: Mythos Preview and the Glasswing Gambit

They built something too risky to ship — so they're letting the biggest tech companies in the world use it instead.

Anthropic just did something remarkable: they built a model so capable at finding security vulnerabilities that they consider it too dangerous for public release. That's not marketing speak; the numbers are stark. Claude Mythos Preview scored 93.9% on SWE-bench Verified, compared to 80.8% for Opus 4.6. It found thousands of high-severity vulnerabilities across every major operating system and web browser. In the wrong hands, that kind of capability becomes a weapon.

Instead of locking it away, Anthropic took a different approach: Project Glasswing. It's a cybersecurity initiative that gives over 40 organizations maintaining critical software (think energy grids, financial systems, healthcare infrastructure) access to Mythos Preview. The partner list reads like a who's-who of enterprise tech: AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, Microsoft, Nvidia, Palo Alto Networks. The initiative also includes up to $100M in usage credits and $4M in direct donations to open-source security projects.

Why a consortium instead of a public release?

The logic tracks: a model that can find vulnerabilities can also be used to exploit them. An open release would let adversaries identify and target the same critical systems it's designed to protect. By restricting access to organizations with proven security practices and an established stake in critical infrastructure, Anthropic keeps the capability in trusted hands while still pursuing the goal: fixing real-world vulnerabilities faster than adversaries can exploit them.

"Mythos Preview is a general-purpose model that found thousands of high-severity vulnerabilities, including some in every major OS and web browser."

This also signals something about how frontier AI safety is evolving. The old playbook of train, evaluate, release is giving way to something more nuanced: controlled deployment through partnerships. Anthropic isn't alone in this thinking; TEXXR's coverage shows a broader pattern of AI companies restricting access when the risk calculus doesn't support a public release.

What's the play here?

Two things are happening simultaneously. First, Anthropic is proving that frontier AI models have genuine, high-value real-world applications beyond chatbots, in this case automated vulnerability discovery at scale. Second, they're demonstrating a template for responsible deployment that other safety-conscious organizations will likely follow.
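To make "automated vulnerability discovery at scale" concrete, here's a minimal sketch of what one step of such a loop could look like using Anthropic's publicly documented Messages API. Mythos Preview isn't publicly available, so the model name, prompt, and output handling below are illustrative assumptions, not Glasswing's actual pipeline.

```python
# Minimal sketch: asking a Claude model to flag potential security issues in a
# code diff. The model ID and prompt are placeholders; Mythos Preview itself is
# not publicly available, and this is not Glasswing's actual tooling.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def review_diff(diff_text: str) -> str:
    """Return the model's security assessment of a unified diff."""
    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder: any Claude model available to you
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": (
                "Review this diff for memory-safety issues, injection risks, "
                "and security-relevant logic errors. List each finding with a "
                "severity and the affected lines.\n\n" + diff_text
            ),
        }],
    )
    return response.content[0].text


if __name__ == "__main__":
    sample = "--- a/auth.c\n+++ b/auth.c\n+ strcpy(buf, user_input);"
    print(review_diff(sample))
```

At real scale this single call would sit inside a pipeline that walks entire repositories, triages findings, and routes confirmed issues to maintainers, but the core interaction is the same: code in, structured findings out.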

Anthropic's $30B run-rate revenue (up from roughly $9B at the end of 2025) gives them the runway to fund this kind of initiative. But it's also a bet: by positioning Claude as the enterprise security choice, the model too capable to release publicly, they're creating a brand differentiator that no consumer-facing competitor can match. When your product's defining characteristic is "we held back the most powerful version because it's too risky," you're making a strong statement about how seriously you take safety.

The test will be outcomes. If Glasswing member organizations actually ship code with fewer vulnerabilities because Mythos Preview found them first, this becomes a template. If not, it's a case study in ambitious AI safety theater. The next few quarters will tell.

Data via TEXXR