2025-Era “Reward Hacking” Does Not Show that Reward Is the Optimization Target — AI Alignment Forum