Anthropic's Claude Opus 4.7 launched this week and immediately topped benchmarks for coding and reasoning. It uses 35% fewer tokens than its predecessor, which translates directly to cost savings for anyone building on it. But the more interesting story is the model Anthropic built and then decided not to ship.
Claude Mythos Preview exists. Internal testing shows it outperforms Opus 4.7 across the board. It's also locked in Anthropic's lab because the company believes its cybersecurity capabilities present too much risk for public release.
What Opus 4.7 Actually Does
The new model leads in two areas that matter for developers: code generation and multi-step reasoning. It writes cleaner code with fewer hallucinations and handles complex logic chains without losing context.
The 35% token reduction is the practical win. API costs are billed per token, so the same workload simply gets cheaper: an application making 10,000 API calls a day sees its bill drop by roughly a third, regardless of volume. For startups running tight margins on LLM-based products, that changes the maths.
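To make the arithmetic concrete, here is a minimal sketch of the cost calculation. The per-million-token prices and average token counts per call are illustrative placeholders, not Anthropic's actual pricing; the point is that a 35% cut in output tokens yields slightly less than 35% off the total bill, because input tokens are unchanged.

```python
def daily_cost(calls, in_tokens, out_tokens, in_price, out_price):
    """Dollar cost of one day's API usage; prices are per 1M tokens."""
    return calls * (in_tokens * in_price + out_tokens * out_price) / 1_000_000

CALLS = 10_000               # API calls per day (figure from the article)
IN_TOK, OUT_TOK = 500, 800   # assumed average tokens per call (hypothetical)
IN_PRICE, OUT_PRICE = 15.0, 75.0  # placeholder $/1M token rates

old = daily_cost(CALLS, IN_TOK, OUT_TOK, IN_PRICE, OUT_PRICE)
# New model: same inputs, 35% fewer output tokens
new = daily_cost(CALLS, IN_TOK, OUT_TOK * 0.65, IN_PRICE, OUT_PRICE)

print(f"old: ${old:,.2f}/day  new: ${new:,.2f}/day  saved: {1 - new/old:.0%}")
```

Under these assumptions the saving comes out at about 31%, close to the "roughly a third" figure; the heavier your output tokens relative to input, the closer the saving gets to the full 35%.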
Opus 4.7 also handles longer context windows more reliably. That means fewer cases where the model "forgets" earlier parts of a conversation or loses track of a complex problem halfway through solving it. For anyone building customer support tools or technical documentation systems, that's the difference between a tool people trust and one they abandon.
The Model They Didn't Ship
Claude Mythos Preview showed up in internal Anthropic testing with capabilities the company describes as "too risky" for public access. Specifically: advanced cybersecurity analysis that could be weaponised.
AI labs have talked about dual-use risk for years - the idea that models capable of helpful tasks might also be capable of harmful ones. Anthropic is one of the first to publicly acknowledge holding back a more capable model specifically because of what it can do, rather than because it isn't ready.
This creates an interesting tension. If Mythos is significantly better than Opus 4.7, developers will want access to it. Anthropic's competitors don't have the same safety constraints - or at least, they haven't publicly committed to them. If OpenAI or Google releases a model with similar capabilities, the question becomes whether Anthropic's caution was principled or just left capability on the table.
What "Too Dangerous" Means in Practice
Anthropic hasn't detailed exactly what Mythos can do that Opus 4.7 cannot. "Advanced cybersecurity capabilities" could mean anything from identifying zero-day exploits to writing sophisticated malware.
The precedent here is worth watching. If AI labs start regularly building models they don't release, we're entering a new phase of the technology's development - one where the most capable systems are deliberately kept internal, and what the public gets is a safety-filtered version.
That's defensible from a risk perspective. It also means the gap between what's possible and what's available to developers starts to widen. The most powerful tools remain in the hands of a small number of organisations, while everyone else builds on deliberately limited systems.
The Competitive Pressure
Anthropic positions itself as the safety-focused AI lab. That's the brand. But safety culture doesn't pay the bills - API revenue and enterprise contracts do. If holding back Mythos means losing ground to competitors shipping more capable models, the business pressure to release will intensify.
For now, Opus 4.7 is strong enough to compete. It is cheaper to run than GPT-4, faster than earlier Claude models, and handles complex reasoning reliably. Developers building production applications have a solid foundation.
But the existence of Mythos Preview raises a question nobody's answered yet: how long can a company hold back its best work in the name of safety when the market rewards capability above all else?
The next six months will show whether Anthropic's competitors respect the same red lines - or whether they see an opportunity.