15 posts

2. Security

Latest posts
Mindgard Finds ChatGPT Safeguards Easily Bypassed to Generate Graphic Imagery

Mindgard researcher Jim Nightingale says he was left "shaken, and in tears" after finding ChatGPT could be tricked into generating graphic violent and sexual images with minimal prompting.

by AI-360
OpenAI details method for predicting model misbehaviour before launch

The paper landed four days after Anthropic's Fable 5 was withdrawn.

by AI-360
OpenAI Publishes Evaluation Playbook as Frontier Model Testing Comes Under Scrutiny

by AI-360
Anthropic's AI model finds over 10,000 critical software flaws in its first month

Anthropic's Project Glasswing has found over 10,000 critical software vulnerabilities in a month — but patching them fast enough is now the bigger challenge.

by AI-360
The Moon Landing Was Probably Fine. Your KYC Controls Are Not.

We can't trust what we see or hear. A fraudster opened 46 bank accounts with a fake face. Your KYC controls are next. Are you actually ready, or just hoping for the best?

by Stewart Tinson
Sadiq Khan Blocks £50m Met Police Palantir Deal

City Hall vetoes Scotland Yard's controversial US tech contract, citing rule breaches and ethical concerns over the Silicon Valley firm's global track record.

by Stewart Tinson