AI alignment

June 12, 2025
The Impact of AI Chatbots’ Sycophancy on Users and Tech Leaders’ Response
AI chatbots built on large language models (LLMs) often mirror user beliefs and desires, ...

June 10, 2025
Strategies to Mitigate AI Reward Hacking
Updated: June 10, 2025 • Technical Deep Dive on AI Reward Hacking Interventions • Introduction ...

June 10, 2025
Limitations of Personal AI in Preventing Disempowerment
Published on June 4, 2025 1:26 AM GMT • Updated April 2026 Imagine that ...

June 10, 2025
Wise AI Projects: Deep Dives, Governance & Pilots
Published on June 5, 2025 8:13 AM GMT (updated September 10, 2025 with new ...

June 10, 2025
Human-Aligned AI Summer School 2025 in Prague
Join us at the fifth Human-Aligned AI Summer School from 22nd to 25th July ...

June 3, 2025
Bengio Warns AI Models Are Deceptive, Introduces LawZero for Safety
Yoshua Bengio’s Warning Amid Intensifying AI Race • One of the founding architects of deep ...

April 29, 2025
Exploring Claude: A Deep Dive into Model Interpretability
Published on March 27, 2025 5:20 PM GMT • Updated April 15, 2025 with ...