New paper: AGI Agent Safety by Iteratively Improving the Utility Function — AI Alignment Forum