mechanistic interpretability

June 10, 2025 · 1.9k · 0

AXRP Episode 41: Lee Sharkey on Attribution-based Decomposition

Published June 3, 2025. Introduction: Interpretability in deep learning remains a key research frontier. ...

June 10, 2025 · 14k · 0

Human-Aligned AI Summer School 2025 in Prague

Join us at the fifth Human-Aligned AI Summer School from 22nd to 25th July ...

April 29, 2025 · 7.7k · 0

Exploring Claude: A Deep Dive into Model Interpretability

Published on March 27, 2025 5:20 PM GMT • Updated April 15, 2025 with ...

April 12, 2025 · 9.9k · 0

Reassessing Sparse Autoencoders: Challenges Ahead

The GDM mechanistic interpretability team recently released a comprehensive update evaluating the utility of ...

March 30, 2025 · 4.4k · 0

When Research Brilliance Doesn’t Equate to Strategic Foresight

TL;DR: A strong research record provides some indication of aptitude, but it is not ...

© 2026 Web Crafting Code