Security

OBLITERATUS Strips AI Safety From Open Models in Minutes

A new open-source toolkit called OBLITERATUS can surgically remove refusal mechanisms from 116 open-weight LLMs using abliteration - no fine-tuning, no training data, just geometry.

LLMs Can Unmask Online Users for $4, Study Finds

Researchers from ETH Zurich and Anthropic show that LLM agents can strip pseudonymity from forum posts at scale for as little as $1.41 per target - matching what human investigators could do in hours.

AI Agent Hallucinates Repo ID, Deploys Wrong Code to Vercel

Claude Opus 4.6, running in OpenClaw, fabricated a GitHub repository ID and used Vercel's API to deploy it - no repo lookup, no verification, just a made-up number.

Shannon AI Tool Masters Web App Pentesting With 96% Success

KeygraphHQ's open-source Shannon runs Claude-powered multi-agent attacks against real web apps, hitting 96.15% on the XBOW benchmark and finding 30+ flaws in OWASP Juice Shop.

Founder Loses $2,500 After AI-Coded App Leaks Stripe Keys

A startup founder's vibe-coded app exposed Stripe secret keys in frontend code, letting attackers charge 175 customers $500 each before he could rotate the credentials.

An AI Agent Just Pwned Trivy's 32K-Star Repo via GitHub Actions

An autonomous agent powered by Claude Opus 4.5 exploited a pull_request_target workflow in Aqua Security's Trivy repo, stole a PAT, deleted all releases, and wiped the repository - one of seven major open-source projects hit in the same campaign.

← Previous