Claude Mythos Preview
Claude Mythos Preview is an unreleased frontier model trained by Anthropic, announced as part of Project Glasswing. The name comes from the Ancient Greek for “utterance” or “narrative.” Anthropic does not plan to make it generally available.
Capabilities
The model’s cybersecurity capabilities stem from strong agentic coding and reasoning skills. It autonomously discovered thousands of zero-day vulnerabilities in every major operating system and web browser, developing exploits without human steering.
Benchmarks
| Benchmark | Mythos Preview | Opus 4.6 |
|---|---|---|
| CyberGym (vulnerability reproduction) | 83.1% | 66.6% |
| SWE-bench Verified | 93.9% | 80.8% |
| SWE-bench Pro | 77.8% | 53.4% |
| SWE-bench Multilingual | 87.3% | 77.8% |
| SWE-bench Multimodal (internal) | 59.0% | 27.1% |
| Terminal-Bench 2.0 | 82.0% | 65.4% |
| GPQA Diamond | 94.6% | 91.3% |
| Humanity’s Last Exam (no tools) | 56.8% | 40.0% |
| Humanity’s Last Exam (with tools) | 64.7% | 53.1% |
| BrowseComp | 86.9% | 83.7% |
| OSWorld-Verified | 79.6% | 72.7% |
Notes: On Terminal-Bench 2.0, Mythos Preview scored 92.1% with extended timeouts and the v2.1 updates. On BrowseComp, it scored higher than Opus 4.6 while using 4.9x fewer tokens. Some of the HLE performance at low effort may indicate memorization.
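To make the size of each lead easier to compare, the sketch below computes the percentage-point gap between the two columns. The scores are copied from the table above; the names `SCORES` and `gaps` are illustrative and not part of any published tooling.

```python
# Scores copied from the benchmark table: (Mythos Preview, Opus 4.6), in percent.
SCORES = {
    "CyberGym (vulnerability reproduction)": (83.1, 66.6),
    "SWE-bench Verified": (93.9, 80.8),
    "SWE-bench Pro": (77.8, 53.4),
    "SWE-bench Multilingual": (87.3, 77.8),
    "SWE-bench Multimodal (internal)": (59.0, 27.1),
    "Terminal-Bench 2.0": (82.0, 65.4),
    "GPQA Diamond": (94.6, 91.3),
    "Humanity's Last Exam (no tools)": (56.8, 40.0),
    "Humanity's Last Exam (with tools)": (64.7, 53.1),
    "BrowseComp": (86.9, 83.7),
    "OSWorld-Verified": (79.6, 72.7),
}


def gaps(scores):
    """Return {benchmark: percentage-point gap}, rounded to one decimal place."""
    return {name: round(mythos - opus, 1) for name, (mythos, opus) in scores.items()}


if __name__ == "__main__":
    # Print benchmarks from the largest gap to the smallest.
    ranked = sorted(gaps(SCORES).items(), key=lambda item: item[1], reverse=True)
    for name, gap in ranked:
        print(f"{name:40s} +{gap:.1f} pts")
```

These are differences in percentage points, not relative improvements, and they say nothing about how comparable the evaluation settings were.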
Access
The model is available to Glasswing partners via the Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry. The research preview is covered by $100M in Anthropic usage credits. After the preview, pricing is $25/$125 per million input/output tokens.
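As a rough illustration of the post-preview pricing, the sketch below estimates the cost of a single request from its token counts. It assumes only the two list prices stated above; the function name `estimate_cost` is hypothetical, and any discounts, caching, or batch pricing are not modeled.

```python
# Post-preview list prices from the text, in US dollars per million tokens.
INPUT_PRICE_PER_MTOK = 25
OUTPUT_PRICE_PER_MTOK = 125
TOKENS_PER_MTOK = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_PRICE_PER_MTOK / TOKENS_PER_MTOK
    output_cost = output_tokens * OUTPUT_PRICE_PER_MTOK / TOKENS_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: 200k input tokens and 50k output tokens.
    print(f"${estimate_cost(200_000, 50_000):.2f}")  # prints $11.25
```

Output tokens cost five times as much as input tokens at these rates, so long generations dominate the bill even when the prompt is large.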
Safeguard strategy
Anthropic’s goal is to enable Mythos-class models at scale with safeguards that detect and block dangerous outputs. New safeguards will launch first with an upcoming Claude Opus model (lower risk than Mythos Preview). Security professionals can apply to a Cyber Verification Program for access through safeguarded models.
See also
- Project Glasswing — the initiative built around this model
- AI Vulnerability Discovery — the capability it demonstrates