AI ALIGNMENT FORUM

Goodhart's Law

• Applied to Visual demonstration of Optimizer's curse by Roman Malov 1mo ago
• Applied to Don't want Goodhart? — Specify the variables more by Yan 1mo ago
• Applied to Don't want Goodhart? — Specify the damn variables 1mo ago
• Applied to Could Things Be Very Different?—How Historical Inertia Might Blind Us To Optimal Solutions by James Stephen Brown 4mo ago
• Applied to Principled Satisficing To Avoid Goodhart by JenniferRM 5mo ago
• Applied to [Aspiration-based designs] A. Damages from misaligned optimization – two more models by Simon Dima 6mo ago
• Applied to Goodhart's Law and Emotions by Zero Contradictions 6mo ago
• Applied to The Dumbification of our smart screens by Lauren (often wrong) 6mo ago
• Applied to Honest science is spirituality by Gunnar Zarncke 6mo ago
• Applied to Catastrophic Goodhart in RL with KL penalty by Thomas Kwa 8mo ago
• Applied to Fundamental Uncertainty: Chapter 8 - When does fundamental uncertainty matter? by Gordon Seidoh Worley 8mo ago
• Applied to Extinction-level Goodhart's Law as a Property of the Environment by Vojtech Kovarik 10mo ago
• Applied to Dynamics Crucial to AI Risk Seem to Make for Complicated Models by Vojtech Kovarik 10mo ago
• Applied to Extinction Risks from AI: Invisible to Science? by Vojtech Kovarik 10mo ago
• Applied to Approximately Bayesian Reasoning: Knightian Uncertainty, Goodhart, and the Look-Elsewhere Effect by Roger Dearnaley 1y ago
• Applied to Aldix and the Book of Life by ville 1y ago
• Applied to When Can Optimization Be Done Safely? by StrivingForLegibility 1y ago
• Applied to Weak vs Quantitative Extinction-level Goodhart's Law by Vojtech Kovarik 1y ago