Anthropic Conceals Risky AI Model Mythos Amid Safety Concerns

Anthropic has developed the Claude Mythos Preview, an AI model so capable that the company has opted not to release it to the public. The model has reportedly identified “thousands of vulnerabilities” in software, including some that had gone unnoticed for decades. To mitigate the risks, Anthropic has launched the Glasswing project, collaborating with 12 prominent companies to use Mythos Preview to patch vulnerabilities before they can be exploited.

Breakthrough in Vulnerability Detection

According to Nicholas Carlini, a researcher at Anthropic, “Mythos Preview has found more vulnerabilities in two weeks than I have in my entire life.” Vulnerabilities are unintentional flaws in software that can grant attackers unauthorized access to programs. Historically, finding them required extensive human expertise and time; Mythos Preview makes the process dramatically faster.

Collaboration with Industry Leaders

The Glasswing project comprises key players from Silicon Valley, including Apple, Amazon Web Services, Google, Microsoft, Nvidia, and CrowdStrike. Anthropic has extended use of the model to these companies along with 40 additional firms responsible for managing critical infrastructure. Discussions are also underway with the U.S. government regarding its potential applications.

Origins and Capabilities of Mythos

Initially designed as a code-centric model, Mythos Preview emerged following the success of Claude Sonnet 4.5, which set a new benchmark in AI code generation last September. According to Dario Amodei, CEO of Anthropic, the model showed exceptional proficiency in cybersecurity tasks.

Significant Findings and Historical Vulnerabilities

Mythos has already uncovered vulnerabilities that date back several years, including:

  • A 27-year-old vulnerability in OpenBSD, a fortified operating system.
  • A 16-year-old flaw in FFmpeg, a library commonly used for video applications.
  • Several vulnerabilities within the widely used Linux kernel that could allow escalated access from normal user privileges to full machine control.

Implications and Precautions

The name “Mythos,” derived from ancient Greek, signifies “the system of stories through which civilizations gave meaning to the world.” Concerns have arisen about potential misuse of such models. In 2019, OpenAI postponed the release of GPT-2 citing safety risks, a precaution that resonates with Mythos today, especially since the model can also write exploits that take advantage of the vulnerabilities it finds. Tests by security professionals outside Anthropic have verified these capabilities.

Comprehensive AI Model Features

Beyond its cybersecurity strengths, Mythos serves the same general purposes as its predecessors. Anthropic has released a detailed 244-page manual outlining its enhanced functionality compared to earlier models such as Sonnet and Opus.

Legal Challenges and Ethical Boundaries

Recently, Anthropic attracted attention due to a legal dispute with the U.S. Department of Defense. The government had classified the company as a “risk to the supply chain,” limiting its ability to bid on federal contracts. The company successfully challenged this measure in court, ensuring that its agreements with the Pentagon include firm ethical guidelines. These stipulate that its AI models cannot be used for mass surveillance of citizens or for autonomous weapons deployment without human supervision.