AI ALIGNMENT FORUM

jacek

Comments
Goodhart's Law in Reinforcement Learning
jacek · 2y · 30

Thanks for the comment! Note that we use the state-action visitation distribution, so the trajectories we consider contain actions as well as states. This makes it possible to invert η and recover the policy (as long as all states are visited). With state-only trajectories, it would indeed be impossible to recover the policy.
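To make the inversion concrete, here is a minimal sketch in plain Python. It assumes η is the discounted state-action visitation measure, η(s, a) = Σ_t γ^t P(s_t = s, a_t = a), truncated at a finite horizon; the toy MDP, the function names `occupancy` and `recover_policy`, and all numbers are hypothetical and chosen only for illustration. Under that assumption η(s, a) = π(a|s) · Σ_t γ^t P(s_t = s), so normalising each row of η over actions gives back π(a|s) whenever the row is non-zero.

```python
def occupancy(P, pi, mu, gamma, horizon=200):
    """Truncated discounted state-action visitation eta[s][a].

    P[s][a][s2] : transition probability (hypothetical toy MDP)
    pi[s][a]    : policy probabilities
    mu[s]       : initial state distribution
    """
    n_s, n_a = len(pi), len(pi[0])
    eta = [[0.0] * n_a for _ in range(n_s)]
    d = list(mu)  # state distribution at time t
    for t in range(horizon):
        weight = gamma ** t
        nxt = [0.0] * n_s
        for s in range(n_s):
            for a in range(n_a):
                mass = d[s] * pi[s][a]  # P(s_t = s, a_t = a)
                eta[s][a] += weight * mass
                for s2 in range(n_s):
                    nxt[s2] += mass * P[s][a][s2]
        d = nxt
    return eta


def recover_policy(eta):
    """Invert eta by normalising over actions: pi(a|s) = eta(s,a) / sum_a' eta(s,a')."""
    policy = []
    for s, row in enumerate(eta):
        total = sum(row)
        if total == 0.0:
            # An unvisited state carries no information about the policy there.
            raise ValueError(f"state {s} is never visited; policy not identifiable")
        policy.append([x / total for x in row])
    return policy


# Toy 2-state, 2-action MDP in which every state is reachable.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # transitions from state 0 under actions 0, 1
    [[0.5, 0.5], [0.1, 0.9]],  # transitions from state 1 under actions 0, 1
]
pi = [[0.7, 0.3], [0.2, 0.8]]
mu = [1.0, 0.0]

eta = occupancy(P, pi, mu, gamma=0.9)
recovered = recover_policy(eta)
```

If only state visitation were recorded, the action dimension of η would be summed out and the per-state normalisation above would have nothing to work with, which is the point made in the comment.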

Posts

48 · Goodhart's Law in Reinforcement Learning · 2y · 5
16 · Categorical-measure-theoretic approach to optimal policies tending to seek power · 3y · 0