Mythos, Project Glasswing and regulating catastrophic risk posed by AI models
Keeping catastrophic risks low “could be a major challenge if capabilities continue advancing rapidly”. That’s the message from Anthropic in its system card for the Claude Mythos Preview AI model. The system card also features a stark warning: “we find it alarming that the world looks on track to proceed rapidly to developing superhuman systems without stronger mechanisms in place for ensuring adequate safety across the industry as a whole”.
The model itself has attracted widespread attention for its strong cybersecurity capabilities, having apparently found “thousands of high-severity vulnerabilities, including some in every major operating system and web browser”. With this in mind, Anthropic has opted not to release the model publicly, instead forming “Project Glasswing” – a dream team of top tech and security companies – to fix the vulnerabilities as soon as possible.
Is Claude Mythos Preview really that good?
Anthropic’s Mythos model is said to show that “AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities”. Sceptics will say such claims are simply publicity stunts – they will certainly build customer excitement for the model’s eventual release.
There could be some truth in that, but it would be naïve and dangerous to brush aside Anthropic’s warnings. An evaluation by the UK’s AI Security Institute (AISI), for example, confirms that Mythos “could execute multi-stage attacks on vulnerable networks and discover and exploit vulnerabilities autonomously – tasks that would take human professionals days of work”.
One thing that sets Mythos apart from other models seems to be its ability to understand and chain together different vulnerabilities to form an entirely new attack (see video here). Mythos is, for example, the only model ever to complete the AISI’s 32-step corporate network attack simulation – and even its average of 22 out of 32 steps completed trounces the next best model, Claude Opus 4.6, which averaged only 16 steps.
Acknowledging how far the models have come, the AISI noted by way of comparison that “two years ago, the best available models could barely complete beginner-level cyber tasks”. There are fears that open-weight models could reach similar capabilities to Mythos within six months (e.g. see article here), meaning there is a race against time to address the security vulnerabilities.
What legal restrictions are or can be placed on these models?
US
The US Government has taken a pro-innovation, anti-regulation approach to AI, repealing the Biden administration’s 2023 AI Executive Order and giving companies free rein to innovate without restriction.
At the State level, California recently passed its Transparency in Frontier Artificial Intelligence Act (TFAIA). This requires large frontier model developers to (1) identify, assess and mitigate catastrophic risks (such as cyberattacks, evasion of human control and the development of chemical, biological, radiological or nuclear weapons), (2) report critical safety incidents (e.g. harm resulting from a catastrophic risk) and (3) ensure whistleblowers are protected from retaliation.
These obligations remain, in the circumstances, relatively light touch (particularly in comparison with the pharma sector, for example). There is no explicit prohibition on releasing models that carry catastrophic risks – only that a developer’s risk assessments and the adequacy of its mitigations must be reviewed as part of the decision to deploy a model. In any case, the maximum fine per violation under the Act is $1 million, hardly a major disincentive. This means it is broadly at the tech companies’ discretion whether to release a model – and choosing to hold one back may be difficult given shareholder pressure and fierce competition.
EU
The EU has the most comprehensive measures in place, with restrictions on general-purpose AI (GPAI) models with systemic risk under the EU AI Act (AIA), supported by the Safety and Security chapter of the GPAI Code of Practice (the Code).
Models in scope are those with “high impact capabilities”. These are presumed where training compute exceeds 10^25 FLOPs (the TFAIA similarly uses a compute threshold, set at 10^26 operations), and the Commission also has discretion to designate other models based on their capabilities. There can realistically be little doubt Mythos would constitute a GPAI model with systemic risk (the recitals, for example, specifically refer to offensive cyber capabilities as a relevant systemic risk).
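For readers who want the compute tests made concrete, below is a minimal sketch in Python. The threshold figures are statutory (Article 51(2) AIA presumes high-impact capabilities above 10^25 FLOPs of training compute; the TFAIA defines a “frontier model” by reference to 10^26 operations); the example compute figure and the function name are our own illustrative assumptions, since Anthropic has not published Mythos’s training compute.

```python
# Illustrative sketch only: maps a model's training compute onto the two
# compute thresholds discussed above. The thresholds are statutory; the
# example compute figure is hypothetical (Mythos's compute is not public).

AIA_SYSTEMIC_RISK_FLOPS = 1e25   # EU AIA Art. 51(2): presumption of high-impact capabilities
TFAIA_FRONTIER_OPS = 1e26        # TFAIA: "frontier model" compute threshold

def compute_classifications(training_compute: float) -> list[str]:
    """Return the compute-based classifications a model presumptively triggers."""
    triggered = []
    if training_compute > AIA_SYSTEMIC_RISK_FLOPS:
        triggered.append("EU AIA: presumed GPAI model with systemic risk")
    if training_compute > TFAIA_FRONTIER_OPS:
        triggered.append("California TFAIA: frontier model")
    return triggered

# Hypothetical figure for illustration only
print(compute_classifications(5e25))
# -> ['EU AIA: presumed GPAI model with systemic risk']
```

Note that compute is only the starting point in both regimes: the Commission can designate below-threshold models, and the TFAIA layers a developer revenue test on top.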
The provisions in the AIA are similar in nature to those in the TFAIA. The Code, however, provides much more detail. In particular, signatories (such as Anthropic) must complete a full systemic risk assessment and mitigation process two weeks before placing the model on the EU market. Only where the systemic risk level is determined to be acceptable (and will remain acceptable post-deployment) can the model be placed on the market.
Fines for non-compliance can reach 3% of annual total worldwide turnover or €15 million, whichever is higher – applicable where, for example, the provider has failed to make the model available to the Commission with a view to conducting an evaluation of systemic risks at EU level. The Commission’s power to fine providers of GPAI models with systemic risk comes into effect on 2 August 2026.
One question is whether the EU has any remit over Mythos currently. The AIA’s territorial scope extends to models placed on the market in the EU, as well as to situations where “the output produced by the AI system is used in the Union” – it’s unclear whether a model geo-blocked in the EU but used to hack software systems located in the EU would fall within the Act’s remit on that basis. If needed, Anthropic could likely turn to the AIA’s carve-out for research, testing or development of models before they are placed on the market (the carve-out doesn’t extend to testing in real-world conditions, but Project Glasswing’s work is unlikely to amount to that).
We understand from POLITICO’s reporting that the EU has largely been left out of discussions around Mythos, and it is unclear whether the Commission’s AI Office has had access to the model. This is no doubt frustrating for regulators in Brussels. The Commission does have the power under the AIA – even, it seems, where it has not evaluated the model – to designate a model as one with systemic risk. However, this is unlikely to be a productive move and we expect the EU to take a cautious approach for now.