2. Security - AI-360

Anthropic reports Claude models breached real systems during cyber evaluations

Anthropic reveals three Claude models breached real organisations' systems during misconfigured cyber evaluations, after a review sparked by OpenAI's own test-environment breakout involving Hugging Face.

by AI-360

2. Security

Rogue OpenAI Agent Hacks Hugging Face

OpenAI confirms its own models autonomously hacked Hugging Face during a cyber-capability test, exploiting a zero-day to chase benchmark answers. Both firms are now investigating jointly.

by AI-360

Analysis

The Story So Far

Links to every single interview and webinar.

by AI-360

2. Security

SWE-Bench Pro to Go

OpenAI retracts its recommendation to use SWE-Bench Pro after finding ~30% of tasks broken, with human reviewers spotting even more issues than its own audit pipeline did.

by AI-360

2. Security

Alberta scans 466 million lines of government code in 20 hours using Claude

Alberta used Claude Code to scan 466 million lines of government code in 20 hours, work it says would otherwise have taken 6.5 years, fixing vulnerabilities across 27 ministries.

by Stewart Tinson

1. Enterprise AI

Fable 5

Anthropic details Fable 5's cyber safeguards and a draft jailbreak severity framework, sorting cybersecurity uses into risk tiers and scoring techniques on four axes.

by AI-360