model safety

June 10, 2025
Misalignment on a Budget: Finetuning and Steering Vectors
Published on June 8, 2025 3:28 PM GMT. TL;DR: We reproduce emergent misalignment (Betley ...

June 5, 2025
AI Moratorium Fails: Amodei’s Push for Transparency Standards
By expanding on technical challenges, regulatory comparisons, and expert perspectives, we explore why a ...