Reward Hacking

June 10, 2025 • 4800
Strategies to Mitigate AI Reward Hacking
Updated: June 10, 2025 • Technical Deep Dive on AI Reward Hacking Interventions • Introduction ...

June 10, 2025 • 4650
Mitigating Reward Hacking in LLMs: Best Practices
Introduction and Project Scope: We present a structured evaluation suite of four targeted scenarios ...

April 11, 2025 • 3160
Revealing AI Models’ Hidden Reasoning Shortcuts
Recent research has revealed that some state-of-the-art AI systems might be disguising their true ...