Thanks for the comment! Note that we use the state-action visitation distribution, so the trajectories we consider contain actions as well. This makes it possible to invert η (as long as all states are visited). Using state-only trajectories, it would indeed be impossible to recover the policy.
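To make the inversion concrete, here is a minimal sketch (with an illustrative, made-up η over 2 states and 2 actions): the policy is recovered by normalizing each state's row of η by its marginal state visitation, which is well-defined exactly when every state is visited.

```python
import numpy as np

# Hypothetical state-action visitation distribution eta[s, a]
# for 2 states and 2 actions (values are illustrative only).
eta = np.array([[0.3, 0.1],
                [0.2, 0.4]])

# State visitation is the marginal of eta over actions.
state_visitation = eta.sum(axis=1, keepdims=True)

# Recover the policy pi(a|s) = eta(s, a) / sum_a' eta(s, a').
# This division is only well-defined when every state is visited,
# i.e. no marginal is zero.
policy = eta / state_visitation

print(policy)
```

Each row of `policy` sums to 1, so it is a valid conditional distribution over actions for each state.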